MIS 306 - Data Analysis: Forecasting


Decomposition




Cankaya University

I. Ozkan

Spring 2025

Time Series Decomposition

Trend exists when there is a long-term [on average] increase or decrease in the data. It does not have to be linear. Sometimes a trend changes direction, for example going from an increasing trend to a decreasing one.

Cyclic pattern exists when the data exhibit rises and falls that are not of a fixed period. The duration of these fluctuations is usually at least 1.5-2 years.

Seasonal pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week). Seasonality is always of a fixed and known period.

Transformations and adjustments

Calendar adjustments remove calendar effects, such as the differing number of days in each month, from the data before analysis.

Population adjustments are often used to remove the effect of population changes, typically by working with per-capita data.

Inflation adjustments are often used for data that are affected by the value of money.
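
The three adjustments above can be sketched with simple arithmetic. This is a minimal illustration, not a recipe from the course: every number and variable name below is made up, and the price index is assumed to equal 100 in the base year.

```python
# All figures below are hypothetical and only illustrate the arithmetic.

# Calendar adjustment: convert monthly totals to an average per day,
# which removes the effect of months having different lengths.
monthly_sales = {"Jan": 3100.0, "Feb": 2800.0}
days_in_month = {"Jan": 31, "Feb": 28}
sales_per_day = {m: monthly_sales[m] / days_in_month[m] for m in monthly_sales}

# Population adjustment: express a total per 1,000 people.
hospital_beds = 50_000
population = 25_000_000
beds_per_thousand = hospital_beds / population * 1000

# Inflation adjustment: deflate a nominal value with a price index
# (assumed to be 100 in the base year).
nominal_price = 250.0
price_index = 125.0
real_price = nominal_price / price_index * 100

print(sales_per_day, beds_per_thousand, real_price)
```

In this made-up example the raw totals suggest that January outsold February, while the per-day figures are identical.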

Mathematical Transformations

Mathematical transformations can be useful when the data show variation that increases or decreases with the level of the series. The most commonly used transformation is the log transformation, \(w_t=log(y_t)\) (note: changes in a log value are relative (or percentage) changes on the original scale). Another example is the family of power transformations, such as the flexible Box-Cox transformation introduced by Box and Cox (1964).
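
To see why a change in the log value can be read as a relative change, a small numerical check may help. The two values below are hypothetical; the approximation is only close for small changes.

```python
import math

# Hypothetical consecutive observations: a 5% increase.
y_prev, y_next = 100.0, 105.0

relative_change = (y_next - y_prev) / y_prev        # exact relative change
log_change = math.log(y_next) - math.log(y_prev)    # difference of logs

print(f"relative change: {relative_change:.4f}")
print(f"log change:      {log_change:.4f}")
```

The difference of logs equals \(log(1.05)\), which is close to, but slightly smaller than, the relative change of 0.05.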

Log Transformation, \(log(y_t)\)

Used when all \(Y_t>0\). Let \(E[Y_t]=\mu_t\) and \(\sqrt{V(Y_t)}=\mu_t \sigma\). A first-order Taylor expansion around \(\mu_t\) gives

\(log(Y_t) \approx log(\mu_t) + \frac{Y_t-\mu_t}{\mu_t}\)

\(\implies E[log(Y_t)] \approx log(\mu_t)\) and \(V(log(Y_t)) \approx \sigma^2\)

If the standard deviation of the series is proportional to the level of the series, then transforming to logarithms will produce a series with approximately constant variance over time.
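
The variance-stabilising effect can be checked on a toy series. The sketch below uses made-up data: three blocks whose level doubles each time, with values swinging 10% above and below the level, so the standard deviation is proportional to the level by construction.

```python
import math
import statistics

# Hypothetical data: level doubles from block to block, swings are +/-10%.
levels = [100.0, 200.0, 400.0]
swings = [0.9, 1.1, 0.9, 1.1]
blocks = [[level * s for s in swings] for level in levels]

# Standard deviation of each block on the original and on the log scale.
sd_raw = [statistics.pstdev(b) for b in blocks]
sd_log = [statistics.pstdev([math.log(v) for v in b]) for b in blocks]

print("sd on original scale:", sd_raw)
print("sd on log scale:     ", sd_log)
```

On the original scale the standard deviation doubles with the level; on the log scale it is the same in every block.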

Box-Cox transformations

\[\begin{equation} w_t = \begin{cases} \log(y_t) & \text{if $\lambda=0$}; \\ (y_t^\lambda-1)/\lambda & \text{otherwise}. \end{cases} \end{equation}\]
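
A direct transcription of this definition might look like the sketch below. The function names `box_cox` and `inv_box_cox` are chosen here for illustration, and the inverse is an addition that follows from solving the formula for \(y_t\); it is not stated on the slide.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a single positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("y must be positive")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam

def inv_box_cox(w, lam):
    """Back-transform, obtained by solving the definition for y."""
    if lam == 0:
        return math.exp(w)
    return (lam * w + 1) ** (1 / lam)

for lam in (0, 0.5, 1):
    w = box_cox(9.0, lam)
    print(lam, w, inv_box_cox(w, lam))
```

With \(\lambda=1\) the series is only shifted down by one, and for \(\lambda\) close to zero the result approaches the log transformation.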

Time Series Components

Additive Decomposition

\(y_{t} = S_{t} + T_{t} + R_t\)

It is the most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series

If not,

Multiplicative Decomposition

\(y_{t} = S_{t} \times T_{t} \times R_t\)

This is equivalent to

\(log(y_{t}) = log(S_{t}) + log(T_{t}) + log(R_t)\)

As an example, here are the first 6 observations of an STL decomposition:

# A dable: 6 x 7 [1M]
# Key:     .model [1]
# :        Employed = trend + season_year + remainder
  .model    Month Employed  trend season_year remainder season_adjust
  <chr>     <mth>    <dbl>  <dbl>       <dbl>     <dbl>         <dbl>
1 stl    1990 Jan   13256. 13288.       -33.0     0.836        13289.
2 stl    1990 Feb   12966. 13269.      -258.    -44.6          13224.
3 stl    1990 Mar   12938. 13250.      -290.    -22.1          13228.
4 stl    1990 Apr   13012. 13231.      -220.      1.05         13232.
5 stl    1990 May   13108. 13211.      -114.     11.3          13223.
6 stl    1990 Jun   13183. 13192.       -24.3    15.5          13207.
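
The additive structure can be verified directly on the six rows printed above. The check below copies the displayed (rounded) values by hand, so a tolerance is needed; the tolerance of 1.5 is an assumption chosen to cover the rounding in the printout.

```python
# Rows copied from the printed output:
# (Employed, trend, season_year, remainder, season_adjust)
rows = [
    (13256.0, 13288.0,  -33.0,  0.836, 13289.0),
    (12966.0, 13269.0, -258.0, -44.6,  13224.0),
    (12938.0, 13250.0, -290.0, -22.1,  13228.0),
    (13012.0, 13231.0, -220.0,   1.05, 13232.0),
    (13108.0, 13211.0, -114.0,  11.3,  13223.0),
    (13183.0, 13192.0,  -24.3,  15.5,  13207.0),
]

# Employed should equal trend + season_year + remainder.
max_sum_error = max(abs(emp - (tr + se + rem)) for emp, tr, se, rem, _ in rows)

# season_adjust should equal Employed with the seasonal component removed.
max_adjust_error = max(abs(adj - (emp - se)) for emp, _, se, _, adj in rows)

tolerance = 1.5  # the printed values are rounded
print(max_sum_error < tolerance, max_adjust_error < tolerance)
```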

Seasonal Plots

Multiple seasonal periods

# A tsibble: 6 x 5 [30m] <Australia/Melbourne>
  Time                Demand Temperature Date       Holiday
  <dttm>               <dbl>       <dbl> <date>     <lgl>  
1 2012-01-01 00:00:00  4383.        21.4 2012-01-01 TRUE   
2 2012-01-01 00:30:00  4263.        21.0 2012-01-01 TRUE   
3 2012-01-01 01:00:00  4049.        20.7 2012-01-01 TRUE   
4 2012-01-01 01:30:00  3878.        20.6 2012-01-01 TRUE   
5 2012-01-01 02:00:00  4036.        20.4 2012-01-01 TRUE   
6 2012-01-01 02:30:00  3866.        20.2 2012-01-01 TRUE   

Sub-series Plot

Moving averages

\(\hat{T}_{t} = \frac{1}{m} \sum_{j=-k}^k y_{t+j}\)

where \(m=2k+1\) is the order of the moving average (\(m\)-MA).

# A tsibble: 12 x 3 [1Y]
    Year Exports `5-MA`
   <dbl>   <dbl>  <dbl>
 1  1960    2.06  NA   
 2  1961    5.12  NA   
 3  1962    5.60   4.29
 4  1963    4.18   4.79
 5  1964    4.47   4.58
 6  1965    4.56   4.28
 7  1966    4.09   4.18
 8  1967    4.11   4.01
 9  1968    3.68   3.98
10  1969    3.60   4.23
11  1970    4.43   4.61
12  1971    5.32   5.28
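
The 5-MA column can be reproduced from the moving-average formula. The sketch below is a plain-Python illustration (the function name `centered_ma` is chosen here), applied to the twelve rounded export values shown in the table. Because only these twelve rows are used, the last two averages come out as missing, whereas the table above still shows values there because it is the head of a longer series.

```python
def centered_ma(y, m):
    """Centered moving average of odd order m; None where the window does not fit."""
    if m % 2 == 0:
        raise ValueError("m must be odd for a simple centered moving average")
    k = (m - 1) // 2
    result = []
    for t in range(len(y)):
        if t - k < 0 or t + k >= len(y):
            result.append(None)          # not enough observations on one side
        else:
            result.append(sum(y[t - k:t + k + 1]) / m)
    return result

# Export values for 1960-1971 as printed in the table (rounded).
exports = [2.06, 5.12, 5.60, 4.18, 4.47, 4.56, 4.09, 4.11, 3.68, 3.60, 4.43, 5.32]
ma5 = centered_ma(exports, 5)

for year, value in zip(range(1960, 1972), ma5):
    print(year, None if value is None else round(value, 2))
```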

# A tsibble: 10 x 4 [1Q]
   Quarter  Beer `4-MA` `2x4-MA`
     <qtr> <dbl>  <dbl>    <dbl>
 1 1992 Q1   443    NA       NA 
 2 1992 Q2   410   451.      NA 
 3 1992 Q3   420   449.     450 
 4 1992 Q4   532   452.     450.
 5 1993 Q1   433   449      450.
 6 1993 Q2   421   444      446.
 7 1993 Q3   410   448      446 
 8 1993 Q4   512   438      443 
 9 1994 Q1   449   441.     440.
10 1994 Q2   381   446      444.

\[\begin{align*} \hat{T}_{t} &= \frac{1}{2}\Big[ \frac{1}{4} (y_{t-2}+y_{t-1}+y_{t}+y_{t+1}) + \frac{1}{4} (y_{t-1}+y_{t}+y_{t+1}+y_{t+2})\Big] \\ &= \frac{1}{8}y_{t-2}+\frac14y_{t-1} + \frac14y_{t}+\frac14y_{t+1}+\frac18y_{t+2}. \end{align*}\]
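
The weighted form in the last line can be applied directly to the ten beer values from the table. This is a minimal sketch with a function name (`ma_2x4`) chosen for illustration; since only the ten printed rows are used, the final two positions are missing here even though the table shows values computed from the longer series.

```python
def ma_2x4(y):
    """2x4-MA as a weighted average with weights 1/8, 1/4, 1/4, 1/4, 1/8."""
    weights = [1 / 8, 1 / 4, 1 / 4, 1 / 4, 1 / 8]
    result = []
    for t in range(len(y)):
        if t < 2 or t + 2 >= len(y):
            result.append(None)          # window y[t-2..t+2] does not fit
        else:
            result.append(sum(w * y[t + j] for w, j in zip(weights, range(-2, 3))))
    return result

# Beer production, 1992 Q1 to 1994 Q2, as printed in the table.
beer = [443, 410, 420, 532, 433, 421, 410, 512, 449, 381]
trend = ma_2x4(beer)

for value in trend:
    print(value)
```

The values for 1992 Q3 to 1993 Q4 agree with the `2x4-MA` column above.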

Classical decomposition

Additive decomposition

\(y_{t} = T_{t} + S_{t} + R_t\)

Step 1: compute the trend-cycle component, \(\hat{T_t}\)

Step 2: Calculate the detrended series, \(Y_t - \hat{T_t}\)

Step 3: To estimate the seasonal component for each season, simply average the detrended values for that season, \(\hat{S_t}\)

Step 4: The remainder component is calculated by subtracting the estimated seasonal and trend-cycle components

\(\hat{R_t}=Y_t - \hat{T_t} - \hat{S_t}\)
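
The four steps can be sketched in a few lines of plain Python. Several things here are assumptions rather than part of the slide: the trend-cycle is estimated with a \(2 \times m\)-MA (suitable for an even seasonal period \(m\)), the seasonal averages are used as they are (some treatments additionally adjust them to sum to zero, which is left out), and the test series is synthetic.

```python
def classical_additive(y, m=4):
    """Classical additive decomposition for an even seasonal period m (sketch)."""
    n = len(y)
    half = m // 2

    # Step 1: trend-cycle via a 2 x m moving average (half weight on the two ends).
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / m

    # Step 2: detrended series y_t - T_t (missing where the trend is missing).
    detrended = [None if tr is None else yt - tr for yt, tr in zip(y, trend)]

    # Step 3: average the detrended values season by season.
    seasonal_means = []
    for s in range(m):
        values = [d for t, d in enumerate(detrended) if d is not None and t % m == s]
        seasonal_means.append(sum(values) / len(values))
    seasonal = [seasonal_means[t % m] for t in range(n)]

    # Step 4: remainder R_t = y_t - T_t - S_t.
    remainder = [None if tr is None else yt - tr - se
                 for yt, tr, se in zip(y, trend, seasonal)]
    return trend, seasonal, remainder

# Synthetic quarterly series: linear trend plus a fixed seasonal pattern.
pattern = [3.0, -1.0, -4.0, 2.0]
y = [10.0 + 2.0 * t + pattern[t % 4] for t in range(12)]
trend, seasonal, remainder = classical_additive(y, m=4)
print(seasonal[:4])
```

Because the synthetic series contains no noise, the estimated seasonal component matches the pattern that was put in and the remainder is zero wherever the trend is defined.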

Multiplicative decomposition

\(y_{t} = T_{t} \times S_{t} \times R_t\)

Step 1: compute the trend-cycle component, \(\hat{T_t}\)

Step 2: Calculate the detrended series, \(Y_t / \hat{T_t}\)

Step 3: To estimate the seasonal component for each season, simply average the detrended values for that season, \(\hat{S_t}\)

Step 4: The remainder component is calculated by dividing out the estimated seasonal and trend-cycle components

\(\hat{R_t}=Y_t / (\hat{T_t} \hat{S_t})\)
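
The multiplicative steps differ from the additive ones only in replacing subtraction by division. The sketch below makes the same assumptions as any hand-rolled version would need: a \(2 \times m\)-MA for the trend-cycle with an even period \(m\), plain seasonal averages without further normalisation, and a synthetic series with a constant level so the result is easy to check by eye.

```python
def classical_multiplicative(y, m=4):
    """Classical multiplicative decomposition for an even seasonal period m (sketch)."""
    n = len(y)
    half = m // 2

    # Step 1: trend-cycle via a 2 x m moving average.
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / m

    # Step 2: detrended series y_t / T_t.
    detrended = [None if tr is None else yt / tr for yt, tr in zip(y, trend)]

    # Step 3: seasonal index = average detrended value for each season.
    indices = []
    for s in range(m):
        values = [d for t, d in enumerate(detrended) if d is not None and t % m == s]
        indices.append(sum(values) / len(values))
    seasonal = [indices[t % m] for t in range(n)]

    # Step 4: remainder R_t = y_t / (T_t * S_t).
    remainder = [None if tr is None else yt / (tr * se)
                 for yt, tr, se in zip(y, trend, seasonal)]
    return trend, seasonal, remainder

# Synthetic quarterly series: constant level times a fixed seasonal index.
pattern = [1.2, 0.9, 0.8, 1.1]
y = [200.0 * pattern[t % 4] for t in range(12)]
trend, seasonal, remainder = classical_multiplicative(y, m=4)
print([round(s, 3) for s in seasonal[:4]])
```

Here the remainder is centred on 1 rather than 0, as expected for a multiplicative model.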