I. Ozkan
Spring 2025
Read:
Chapter 5 of FPP: The forecaster’s toolbox
This topic has been covered in other courses. This step includes loading the data, identifying missing values, filtering the time series, and other pre-processing tasks. Useful packages are tsibble and tidyverse.
Example: Obtain GDP per capita from the global_economy data that comes with the tsibbledata package:
library(fpp3)  # meta-package loading tsibble, tsibbledata, fable, fabletools, dplyr, ggplot2

gdppc <- global_economy |>
  mutate(GDP_per_capita = GDP / Population)
Example: Plot the GDP per capita for Sweden
gdppc |>
filter(Country == "Sweden") |>
autoplot(GDP_per_capita) +
labs(y = "$US", title = "GDP per capita for Sweden") +
theme_bw()
Example: Linear model definition using the TSLM() function from the fable package. Check the formula specification:
TSLM(GDP_per_capita ~ trend())
One or more model specifications can be estimated using the model() function of the fabletools package:
fit <- gdppc |>
model(trend_model = TSLM(GDP_per_capita ~ trend()))
# A mable: 263 x 2
# Key: Country [263]
Country trend_model
<fct> <model>
1 Afghanistan <TSLM>
2 Albania <TSLM>
3 Algeria <TSLM>
4 American Samoa <TSLM>
5 Andorra <TSLM>
6 Angola <TSLM>
7 Antigua and Barbuda <TSLM>
8 Arab World <TSLM>
9 Argentina <TSLM>
10 Armenia <TSLM>
# ℹ 253 more rows
To see how well the model performs, the forecast() function can be used with a horizon argument, for example h = 10 for ten observations ahead.
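A minimal sketch, assuming the fit mable above; a horizon of h = "3 years" reproduces the three annual forecasts per country shown below:

```r
# Forecast three years ahead for every country
fc <- fit |> forecast(h = "3 years")
fc
```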
# A fable: 789 x 5 [1Y]
# Key: Country, .model [263]
Country .model Year
<fct> <chr> <dbl>
1 Afghanistan trend_model 2018
2 Afghanistan trend_model 2019
3 Afghanistan trend_model 2020
4 Albania trend_model 2018
5 Albania trend_model 2019
6 Albania trend_model 2020
7 Algeria trend_model 2018
8 Algeria trend_model 2019
9 Algeria trend_model 2020
10 American Samoa trend_model 2018
# ℹ 779 more rows
# ℹ 2 more variables: GDP_per_capita <dist>, .mean <dbl>
The forecast can be plotted to observe the performance
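For example, a sketch plotting the Swedish forecasts against the historical data (assuming fit and gdppc from above):

```r
fit |>
  forecast(h = "3 years") |>
  filter(Country == "Sweden") |>
  autoplot(gdppc) +
  labs(y = "$US", title = "GDP per capita for Sweden")
```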
Mean method
\(\hat{y}_{T+h|T} = \bar{y} = (y_{1}+\dots+y_{T})/T\)
Notation: \(\hat{y}_{T+h|T}\) means the forecast of \(y_{T+h}\) based on \(y_1,\dots,y_T\)
Example: Australian clay brick production
Seasonal naïve method
A method similar to the naïve method (which sets every forecast equal to the last observed value, \(\hat{y}_{T+h|T} = y_{T}\)) is useful for highly seasonal data.
Each forecast is set equal to the last observed value from the same season (e.g., the same quarter of the previous year)
\(\hat{y}_{T+h|T} = y_{T+h-m(k+1)}\)
where \(m\) is the seasonal period, \(k\) is the integer part of \((h-1)/m\) (the number of complete years)
Drift method
\(\hat{y}_{T+h|T} = y_{T} + \frac{h}{T-1}\sum_{t=2}^T (y_{t}-y_{t-1}) = y_{T} + h \left( \frac{y_{T} -y_{1}}{T-1}\right)\)
Mean, Naïve and Seasonal naïve together
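A sketch fitting these benchmark methods together with fable; the quarterly beer series from aus_production (the same series as in the fitted-values table below) is assumed:

```r
# Fit the three benchmark methods on quarterly beer production
train <- aus_production |>
  filter_index("1992 Q1" ~ "2006 Q4")
beer_fit <- train |>
  model(
    Mean = MEAN(Beer),
    Naive = NAIVE(Beer),
    `Seasonal naive` = SNAIVE(Beer)
  )
# Forecast 14 quarters ahead and plot against the training data
beer_fit |>
  forecast(h = 14) |>
  autoplot(train, level = NULL)
```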
Fitted Values \(\hat{y}_{t|t-1}\)
Each observation in a time series can be forecast using all previous observations; these one-step forecasts are called fitted values, written \(\hat{y}_{t|t-1}\). Sometimes we write \(\hat{y}_{t}\) instead of \(\hat{y}_{t|t-1}\)
For example, for the mean method, \(\hat{y}_{t}=\hat{c}\), where \(\hat{c}= (y_{1}+\dots+y_{t-1})/(t-1)\) is the average of all observations up to time \(t-1\)
Residuals
\(e_{t} = y_{t}-\hat{y}_{t}\)
If a transformation was applied before fitting the model, say \(w_t=\log(y_t)\), then the residuals calculated on the transformed scale, \(w_{t}-\hat{w}_{t}\), are called innovation residuals, while \(y_{t}-\hat{y}_{t}\) are called response residuals.
The augment() function can be used to obtain the fitted values and residuals.
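A sketch, assuming the beer_fit mable from the benchmark example above:

```r
# Fitted values (.fitted), residuals (.resid) and innovation residuals (.innov)
augment(beer_fit)
```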
.model | Quarter | Beer | .fitted | .resid | .innov |
---|---|---|---|---|---|
Mean | 1992 Q1 | 443 | 436.45 | 6.55 | 6.55 |
Mean | 1992 Q2 | 410 | 436.45 | -26.45 | -26.45 |
Mean | 1992 Q3 | 420 | 436.45 | -16.45 | -16.45 |
Mean | 1992 Q4 | 532 | 436.45 | 95.55 | 95.55 |
Mean | 1993 Q1 | 433 | 436.45 | -3.45 | -3.45 |
Mean | 1993 Q2 | 421 | 436.45 | -15.45 | -15.45 |
Mean | 1993 Q3 | 410 | 436.45 | -26.45 | -26.45 |
Good models yield innovation residuals with the following essential properties:
- The innovation residuals are uncorrelated
- The innovation residuals have zero mean
In addition to these essential properties, it is useful (but not necessary) to have the following two properties:
- The innovation residuals have constant variance
- The innovation residuals are normally distributed
Example: Google daily closing stock prices, forecast with the naïve method
\(\hat{y}_{t} = y_{t-1}\)
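A sketch preparing the data (the gafa_stock series from tsibbledata, re-indexed by trading day because prices are only observed on trading days); this reproduces the table below:

```r
# Keep 2015 Google prices and re-index by trading day
google_2015 <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) == 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)
```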
Symbol | Date | Open | High | Low | Close | Adj_Close | Volume | day |
---|---|---|---|---|---|---|---|---|
GOOG | 2015-01-02 | 526.11 | 528.36 | 521.23 | 521.94 | 521.94 | 1447600 | 1 |
GOOG | 2015-01-05 | 520.40 | 521.46 | 510.25 | 511.06 | 511.06 | 2059800 | 2 |
GOOG | 2015-01-06 | 512.18 | 513.35 | 498.31 | 499.21 | 499.21 | 2899900 | 3 |
GOOG | 2015-01-07 | 504.23 | 504.47 | 496.92 | 498.36 | 498.36 | 2065100 | 4 |
GOOG | 2015-01-08 | 495.26 | 500.72 | 488.31 | 499.93 | 499.93 | 3353600 | 5 |
GOOG | 2015-01-09 | 502.00 | 502.16 | 492.08 | 493.45 | 493.45 | 2069400 | 6 |
GOOG | 2015-01-12 | 492.23 | 493.26 | 484.89 | 489.85 | 489.85 | 2322400 | 7 |
Residual Distribution
The mean of the residuals is close to zero
There is no significant correlation in the residuals series
There appears to be one potential outlier; apart from this, the variance can be treated as constant
The histogram suggests that the residuals may not be normal
Forecasts from this method will probably be quite good, but prediction intervals computed assuming a normal distribution may be inaccurate
The gg_tsresiduals() function may be used to obtain all these graphs at once.
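A sketch, assuming google_2015 from above:

```r
# Time plot, ACF and histogram of the innovation residuals
google_2015 |>
  model(NAIVE(Close)) |>
  gg_tsresiduals()
```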
Portmanteau tests for autocorrelation
A formal test for autocorrelation considers a whole set of autocorrelation estimates, \(r_k\), as a group rather than individually.
Box-Pierce test
\(Q = T \sum_{k=1}^\ell r_k^2\), where the first \(\ell\) autocorrelations are considered (commonly \(\ell = 10\) for non-seasonal data and \(\ell = 2m\) for seasonal data) and \(T\) is the number of observations. Large values of \(Q\) suggest the autocorrelations do not come from a white noise series.
Ljung-Box test
\(Q^* = T(T+2) \sum_{k=1}^\ell (T-k)^{-1}r_k^2\)
Example: Box-Pierce with \(\ell=10\)
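A sketch, assuming google_2015 from above; box_pierce is a feature supplied by the feasts package:

```r
augment(google_2015 |> model(NAIVE(Close))) |>
  features(.innov, box_pierce, lag = 10)
```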
# A tibble: 1 × 4
Symbol .model bp_stat bp_pvalue
<chr> <chr> <dbl> <dbl>
1 GOOG NAIVE(Close) 7.74 0.654
Example: Ljung-Box with \(\ell=10\)
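Similarly, with the ljung_box feature:

```r
augment(google_2015 |> model(NAIVE(Close))) |>
  features(.innov, ljung_box, lag = 10)
```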
# A tibble: 1 × 4
Symbol .model lb_stat lb_pvalue
<chr> <chr> <dbl> <dbl>
1 GOOG NAIVE(Close) 7.91 0.637
Prediction intervals
\(\hat{y}_{T+h|T} \pm 1.96 \hat\sigma_h\)
where \(\hat\sigma_h\) is an estimate of the standard deviation of the \(h\)-step forecast distribution
or, in general,
\(\hat{y}_{T+h|T} \pm c \hat\sigma_h\)
where the multiplier \(c\) depends on the coverage probability (\(c = 1.96\) gives 95% coverage)
One-step prediction intervals
\(\hat{\sigma} = \sqrt{\frac{1}{T-K-M}\sum_{t=1}^T e_t^2}\)
\(K\): Number of parameters estimated in the forecasting method
\(M\): Number of missing values in residuals (\(M=1\) for a naive forecast)
Benchmark method | \(\hat\sigma_h\) |
---|---|
Mean | \(\hat\sigma_h = \hat\sigma\sqrt{1 + 1/T}\) |
Naïve | \(\hat\sigma_h = \hat\sigma\sqrt{h}\) |
Seasonal Naïve | \(\hat\sigma_h = \hat\sigma\sqrt{k+1}\), where \(k\) is the integer part of \((h-1)/m\) |
Drift | \(\hat\sigma_h = \hat\sigma\sqrt{h(1+h/(T-1))}\) |
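A sketch producing the intervals shown in the table below (assuming google_2015); hilo() turns the forecast distributions into interval columns:

```r
google_2015 |>
  model(NAIVE(Close)) |>
  forecast(h = 10) |>
  hilo(level = c(80, 95))
```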
Symbol | .model | day | Close | .mean | 80% | 95% |
---|---|---|---|---|---|---|
GOOG | NAIVE(Close) | 253 | N(759, 125) | 758.88 | [744.54, 773.22]80 | [736.95, 780.81]95 |
GOOG | NAIVE(Close) | 254 | N(759, 250) | 758.88 | [738.60, 779.16]80 | [727.86, 789.90]95 |
GOOG | NAIVE(Close) | 255 | N(759, 376) | 758.88 | [734.04, 783.72]80 | [720.89, 796.87]95 |
GOOG | NAIVE(Close) | 256 | N(759, 501) | 758.88 | [730.20, 787.56]80 | [715.02, 802.74]95 |
GOOG | NAIVE(Close) | 257 | N(759, 626) | 758.88 | [726.81, 790.95]80 | [709.84, 807.92]95 |
GOOG | NAIVE(Close) | 258 | N(759, 751) | 758.88 | [723.75, 794.01]80 | [705.16, 812.60]95 |
GOOG | NAIVE(Close) | 259 | N(759, 876) | 758.88 | [720.94, 796.82]80 | [700.86, 816.90]95 |
GOOG | NAIVE(Close) | 260 | N(759, 1002) | 758.88 | [718.32, 799.44]80 | [696.85, 820.91]95 |
GOOG | NAIVE(Close) | 261 | N(759, 1127) | 758.88 | [715.86, 801.90]80 | [693.09, 824.67]95 |
GOOG | NAIVE(Close) | 262 | N(759, 1252) | 758.88 | [713.53, 804.23]80 | [689.53, 828.23]95 |
Prediction intervals from bootstrapped residuals (Optional)
When normality of the residuals is an unreasonable assumption, prediction intervals can instead be based on bootstrapped residuals (assuming only that the residuals are uncorrelated with constant variance). Recall the definition of a residual:
\(e_t = y_t - \hat{y}_{t|t-1}\)
Recall that for the naïve forecasting method \(\hat{y}_{t|t-1} = y_{t-1}\), so
\(y_t = y_{t-1} + e_t\)
\(y^*_{T+1} = y_{T} + e^*_{T+1}\), where \(e^*_{T+1}\) is a randomly sampled error from the past
Adding the second simulated observation, \(y^*_{T+2} = y^*_{T+1} + e^*_{T+2}\), and continuing in a similar fashion, we can obtain a simulated future sample path.
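A sketch generating five bootstrapped sample paths with generate() (assuming google_2015 from above); this produces the tsibble shown below:

```r
google_fit <- google_2015 |> model(NAIVE(Close))
sim <- google_fit |> generate(h = 30, times = 5, bootstrap = TRUE)
sim
```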
# A tsibble: 150 x 5 [1]
# Key: Symbol, .model, .rep [5]
Symbol .model day .rep .sim
<chr> <chr> <dbl> <chr> <dbl>
1 GOOG NAIVE(Close) 253 1 752.
2 GOOG NAIVE(Close) 254 1 742.
3 GOOG NAIVE(Close) 255 1 729.
4 GOOG NAIVE(Close) 256 1 705.
5 GOOG NAIVE(Close) 257 1 683.
6 GOOG NAIVE(Close) 258 1 682.
7 GOOG NAIVE(Close) 259 1 670.
8 GOOG NAIVE(Close) 260 1 684.
9 GOOG NAIVE(Close) 261 1 679.
10 GOOG NAIVE(Close) 262 1 689.
# ℹ 140 more rows
Prediction intervals can be computed using percentiles of these simulated values
The result is called a bootstrapped prediction interval
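In fable this can be done directly inside forecast(); a sketch assuming the google_fit object above:

```r
# Forecast distributions built from bootstrapped sample paths
fc <- google_fit |> forecast(h = 30, bootstrap = TRUE)
fc
```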
Forecasting with Decomposition
\(y_t = \hat{S}_t + \hat{T}_t + \hat{R}_t\)
or
\(y_t = \hat{S}_t \times \hat{T}_t \times \hat{R}_t\)
Example: US retail employment. First obtain the seasonally adjusted series by decomposing with STL, then forecast it with the naïve method.
Forecast by forecasting the additive components separately, as in the sketch below.
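A sketch using decomposition_model() from fabletools, which forecasts the STL components separately (the seasonal component is forecast with SNAIVE by default); the us_employment data from tsibbledata is assumed:

```r
us_retail_employment <- us_employment |>
  filter(year(Month) >= 1990, Title == "Retail Trade")
fit_dcmp <- us_retail_employment |>
  model(stlf = decomposition_model(
    STL(Employed ~ trend(window = 7), robust = TRUE),
    NAIVE(season_adjust)
  ))
fit_dcmp |>
  forecast() |>
  autoplot(us_retail_employment)
```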