MIS 306 - Data Analysis: Forecasting


Introduction




Cankaya University

I. Ozkan

Spring 2025

ECO665 Applied Time Series Analysis

Instructor: Ibrahim Ozkan, Ph.D.

Office: Room K210/K207
Tel: See Department Web Site

Course Web site: MIS 306

Lectures: Thursdays, 13:20–16:00 (with up to two breaks)

Textbook:

Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: principles and practice. OTexts.

Suggested Books:

Terence C. Mills and Raphael N. Markellos, “The Econometric Modelling of Financial Time Series”, third edition, Cambridge University Press

Jonathan D. Cryer and Kung-Sik Chan, “Time Series Analysis with Applications in R”, second edition, Springer

R and Resources

Course Content

Topics

Time Series

“A time series can be thought of as a list of numbers (the observations), along with some information about what times those numbers were recorded (the index).” Forecasting: Principles and Practice

Book: Forecasting: Principles and Practice


Time Series

An example, Unemployment Data: See ?lmtest::unemployment[*]

[*] J.D. Rea (1983), The Explanatory Power of Alternative Theories of Inflation and Unemployment, 1895-1979. Review of Economics and Statistics 65, 183–195

unemp <- lmtest::unemployment
window(unemp, start=1972, end=1976)
Time Series:
Start = 1972 
End = 1976 
Frequency = 1 
        UN     m     p     G       x
1972 5.593 500.9 1.000 253.1  77.500
1973 4.853 549.1 1.057 253.5 103.700
1974 5.577 595.4 1.149 261.2 127.219
1975 8.455 641.3 1.256 266.7 123.374
1976 7.690 704.6 1.321 266.8 129.359

Time Series

Time Series Analysis - Domains

Time Series Decomposition

Additive Decomposition

\(y_{t} = S_{t} + T_{t} + R_t,\)

Multiplicative Decomposition

\(y_{t} = S_{t} \times T_{t} \times R_t,\)

where \(y_t\) is the observation, \(S_t\) the seasonal component, \(T_t\) the trend-cycle component, and \(R_t\) the remainder component at period \(t\).
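Both decompositions can be computed in base R with `decompose()`. A minimal sketch, using the built-in `AirPassengers` series purely for illustration (it is not part of the course data):

```r
# Classical decomposition of the built-in monthly AirPassengers series
dec_add  <- decompose(AirPassengers, type = "additive")
dec_mult <- decompose(AirPassengers, type = "multiplicative")

# Components are returned as $trend, $seasonal and $random;
# for the additive case, trend + seasonal + random reconstructs the data.
plot(dec_mult)  # plots data and all three components in one figure
```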

Forecaster’s Toolbox

Some Simple Forecasting Methods

Mean method: \(\hat{y}_{T+h|T} = \bar{y} = (y_{1}+\dots+y_{T})/T\)

Naïve method: \(\hat{y}_{T+h|T} = y_{T}\)

Seasonal naïve method: \(\hat{y}_{T+h|T} = y_{T+h-m(k+1)}\), where \(m\) is the seasonal period and \(k\) is the integer part of \((h-1)/m\)

Drift method: \(\hat{y}_{T+h|T} = y_{T} + \frac{h}{T-1}\sum_{t=2}^T (y_{t}-y_{t-1}) = y_{T} + h \left( \frac{y_{T} -y_{1}}{T-1}\right)\)
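All four simple methods (mean, naïve, seasonal naïve, and drift) are implemented in the forecast package. A minimal sketch on a built-in series, used here only for illustration:

```r
library(forecast)

y <- window(AirPassengers, end = c(1958, 12))  # hold out the last two years

fc_mean   <- meanf(y, h = 24)              # mean of all observations
fc_naive  <- naive(y, h = 24)              # last observed value
fc_snaive <- snaive(y, h = 24)             # same month, previous year
fc_drift  <- rwf(y, h = 24, drift = TRUE)  # last value plus average change

fc_naive$mean[1]  # equals the last observation of y
```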

Exponential Smoothing

Simple exponential smoothing is given as,

\(\hat{x}_{t+1|t} = \alpha x_t + \alpha(1-\alpha) x_{t-1} + \alpha(1-\alpha)^2 x_{t-2}+ \cdots\)

where \(0<\alpha<1\) is a smoothing parameter. This can be written in weighted average form as:

\(\hat{x}_{t+1|t} = \alpha x_t + (1-\alpha) \hat{x}_{t|t-1},\)

and in the component form (see chapter 8 of FPP):

\(\begin{align*} \text{Forecast equation} && \hat{x}_{t+h|t} & = \ell_{t}\\ \text{Smoothing equation} && \ell_{t} & = \alpha x_{t} + (1 - \alpha)\ell_{t-1}, \end{align*}\)
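The `ses()` output reproduced below can be obtained with a call along the following lines. Here `oildata` is assumed to be the annual oil production series from the fpp2 package, windowed from 1996, as in the FPP textbook example:

```r
library(fpp2)  # loads the forecast package and the oil series used in FPP

oildata <- window(oil, start = 1996)  # annual oil production (assumed source)
fc <- ses(oildata, h = 8)             # simple exponential smoothing, 8 steps ahead
summary(fc)                           # model, accuracy measures, forecasts
```

Note that SES produces a flat forecast: every horizon gets the final smoothed level, and only the prediction intervals widen with \(h\).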


Forecast method: Simple exponential smoothing

Model Information:
Simple exponential smoothing 

Call:
ses(y = oildata, h = 8)

  Smoothing parameters:
    alpha = 0.8339 

  Initial states:
    l = 446.5868 

  sigma:  29.8282

     AIC     AICc      BIC 
178.1430 179.8573 180.8141 

Error measures:
                   ME     RMSE     MAE      MPE     MAPE      MASE        ACF1
Training set 6.401975 28.12234 22.2587 1.097574 4.610635 0.9256774 -0.03377748

Forecasts:
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2014       542.6806 504.4541 580.9070 484.2183 601.1429
2015       542.6806 492.9073 592.4539 466.5589 618.8023
2016       542.6806 483.5747 601.7864 452.2860 633.0752
2017       542.6806 475.5269 609.8343 439.9778 645.3834
2018       542.6806 468.3452 617.0159 428.9945 656.3667
2019       542.6806 461.7988 623.5624 418.9826 666.3786
2020       542.6806 455.7439 629.6173 409.7224 675.6388
2021       542.6806 450.0841 635.2771 401.0665 684.2947

Trend and Seasonal Methods

Some Functions of forecast Package
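The ETS(M,A,M) output below was presumably produced by `ets()` with automatic model selection. `aust` is assumed to be the quarterly Australian visitor-nights series from fpp2, windowed from 2005, as in the FPP textbook example:

```r
library(fpp2)  # loads the forecast package and the austourists series

aust <- window(austourists, start = 2005)  # assumed data source
fit <- ets(aust)       # error/trend/season form chosen automatically by AICc
summary(fit)           # prints the selected model and estimated parameters
forecast(fit, h = 8)   # forecasts from the fitted model
```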

ETS(M,A,M) 

Call:
ets(y = aust)

  Smoothing parameters:
    alpha = 0.1908 
    beta  = 0.0392 
    gamma = 2e-04 

  Initial states:
    l = 32.3679 
    b = 0.9281 
    s = 1.0218 0.9628 0.7683 1.2471

  sigma:  0.0383

     AIC     AICc      BIC 
224.8628 230.1569 240.9205 

Training set error measures:
                     ME     RMSE     MAE        MPE     MAPE     MASE      ACF1
Training set 0.04836907 1.670893 1.24954 -0.1845609 2.692849 0.409454 0.2005962

Time Series Models

\(Data=Pattern+Error\)

\(Dependent \: Variable=Function(Independent \: Variables)+Errors\)

\(y=f(x)+\varepsilon_t\)

\(y_t=f(y_{t-1},y_{t-2},..,y_{t-k}, x_t, x_{t-1}, x_{t-2},.., x_{t-h})+\varepsilon_t\)

where the function \(f(\cdot)\) represents the pattern, which depends on lagged values of both the dependent and independent variables.

Time Series Models Covered in This Course

For the univariate case, our models will look like:

\(x_t=\Phi_1 x_{t-1} + \Phi_2 x_{t-2} + ...+ \Phi_p x_{t-p} + a_t\)

where,

\(a_t \: i.i.d. \sim N(0,\sigma^2)\), called white noise. We may rewrite this model as:

\(\Phi(B)x_t=a_t\)

with,

\(\Phi(B)\) is the characteristic polynomial in \(B\), and

\(Bx_t \equiv x_{t-1}\) or in general \(B^kx_t \equiv x_{t-k}\).

\(B\) is called the backshift operator. Some textbooks use \(L\) instead of \(B\) with the same meaning, i.e.,
\(\Phi(L)x_t=a_t\) with \(L^kx_t \equiv x_{t-k}\), where \(L\) is called the lag operator. Models of this type are called autoregressive, \(AR(p)\), models: an autoregressive model of order \(p\).
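A quick sketch of simulating and estimating an AR(2) process with base R; the coefficients 0.6 and \(-0.3\) are arbitrary illustrative choices:

```r
set.seed(123)
# simulate x_t = 0.6 x_{t-1} - 0.3 x_{t-2} + a_t, with a_t ~ N(0, 1)
x <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)

fit <- arima(x, order = c(2, 0, 0), include.mean = FALSE)
coef(fit)  # estimates should be close to 0.6 and -0.3
```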

Another type is,

\(x_t=a_t + \theta_1 a_{t-1} + \theta_2 a_{t-2} + ...+ \theta_q a_{t-q}\)

\(x_t=\Theta(B)a_t\)

As before, \(\Theta(B)\) is a \(q^{th}\)-order polynomial in \(B\); these models are called moving average models. This example is a moving average model of order \(q\).
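The same simulation-and-estimation sketch for an MA(2) process; the coefficients 0.5 and 0.4 are again arbitrary illustrative choices:

```r
set.seed(42)
# simulate x_t = a_t + 0.5 a_{t-1} + 0.4 a_{t-2}, with a_t ~ N(0, 1)
x <- arima.sim(model = list(ma = c(0.5, 0.4)), n = 500)

fit <- arima(x, order = c(0, 0, 2), include.mean = FALSE)
coef(fit)  # estimates should be close to 0.5 and 0.4
```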

ARIMA Models

Non-stationary univariate stochastic time series models are called autoregressive integrated moving average (\(ARIMA(p,d,q)\)) models.

\(w_{t} = \phi_{1}w_{t-1} + \cdots + \phi_{p}w_{t-p} + a_t + \theta_{1}a_{t-1} + \cdots +\theta_{q}a_{t-q}\)

where \(w_t\) is the \(d\)-times differenced series (one difference: \(w_{t}=x_t-x_{t-1}=(1-B)x_t\)) and \(a_t\) is white noise. In \(ARIMA(p,d,q)\) notation, \(d\) stands for the order of integration, i.e., the number of differences. In formal notation:

\(\Phi(B)(1-B)^dx_t=\Theta(B)a_t\)

where, \(\Phi(B)\) and \(\Theta(B)\) are polynomial functions of order \(p\) and \(q\) respectively.
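In practice, the orders \(p\), \(d\), and \(q\) can be selected automatically with `auto.arima()` from the forecast package. A sketch using the built-in `WWWusage` series purely for illustration:

```r
library(forecast)

fit <- auto.arima(WWWusage)  # picks p, d, q via unit-root tests and AICc
summary(fit)                 # selected orders and estimated coefficients
fc <- forecast(fit, h = 10)  # 10-step-ahead forecasts with intervals
```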

Time Series Regression

\(y_t = \beta_0 + \beta_1 x_{1,t} + \dots + \beta_k x_{k,t} + \varepsilon_t\)

where \(x_1\) to \(x_k\) are exogenous variables (also called regressors, predictors, or explanatory variables) that explain \(y_t\). \(y_t\) is called the regressand, dependent, explained, or forecast variable. \(\varepsilon_t\) is assumed to be white noise, \(a_t\).

\(\begin{align*} y_t &= \beta_0 + \beta_1 x_{1,t} + \dots + \beta_k x_{k,t} + \eta_t,\\ & (1-\phi_1B)(1-B)\eta_t = (1+\theta_1B)a_t, \end{align*}\)

where \(\eta_t\) follows an \(ARIMA(1,1,1)\) process.
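Regression with ARIMA errors can be fitted by passing the regressors through `xreg` in `auto.arima()`. A sketch assuming the `uschange` data from fpp2 (US consumption growth explained by income growth), as in the FPP regression-with-ARIMA-errors example:

```r
library(fpp2)  # loads the forecast package and the uschange data (assumed example)

# beta for Income plus an ARMA model for the error term eta_t
fit <- auto.arima(uschange[, "Consumption"], xreg = uschange[, "Income"])
summary(fit)
```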