8: Linear Time Series Analysis
LINEAR TIME SERIES ANALYSIS
Sivaramane, N., Indian Agricultural Statistics Research Institute, New Delhi-110012
Time series data

A variable containing observations over time is called a time series variable, and the dataset is called time series data. Each observation is referenced to a point in time, say a date, month or year, or even seconds and microseconds. Time series data may be evenly spaced, like daily sales data, or unevenly spaced, e.g. measuring the weight of animals at different periodicities: the first few observations taken daily, the next few weekly, and subsequent ones monthly and annually, based on the variation in the data. In this training session, however, our focus will be on evenly spaced time series data. Some examples of typical time series data in agricultural research:

• Annual yield of a particular crop in a particular location over years
• Consumption of foodgrains over months
• Sale of pesticides over years
• Private investment in agriculture (annual data)
• Monthly data on employment in tea gardens
Components of a time series

Any time series can contain some or all of the following components:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)

These components may be combined in several ways: in multiplicative mode, in additive mode, or in a mix of both.

y_t = T * C * S * I
y_t = T + C + S + I
y_t = (T + I) * S

The trend component may be linear, quadratic or curvilinear. The process of removing the trend from the original data is called detrending. Detrending may be accomplished either
through provisioning a trend component in a regression model or through the use of a moving average (MA) model. The trend denotes the long-run phenomenon. Cycles can be observed in long time series data. Cycles may be related to a business phenomenon, like business cycles, or to some biological phenomenon. Cycles may sometimes be difficult to detect and hence are ignored. Seasonal components are common in high-frequency data. Seasonality may be based on agricultural seasons like kharif, rabi and zaid, weather-based seasons like summer, autumn, spring and winter, or quarters, months, or any other periodicity within a year. Generally, seasonality is assumed to follow a definite pattern across years. The unexplained part of the time series, i.e. the residual left over after removing the trend, cyclical and seasonal components, is called the irregular component.

Forecasting

Forecasting (or prediction) refers to the process of generating future values of a particular event. Forecasting is an important purpose of any time series analysis. It may be performed either through quantitative methods applying simple or complicated statistical procedures (exponential smoothing, ARIMA, etc.) or through simple guesses or opinions based on subjective and judgmental methods (the Delphi method). Forecasts can also be made through a mix of several approaches.
Time series methods

Time series methods use different statistical models to treat time series data appropriately and draw inferences. These models may be univariate, i.e. without explanatory variables, or may use explanatory variables and causative relationships between variables. Time series models also include multivariate techniques, with or without explanatory variables. Recently, nonlinear time series models such as autoregressive conditional heteroscedasticity (ARCH) models have gained prominence. The focus of this training session will be on univariate time series models, including exponential smoothing models and Autoregressive Integrated Moving Average (ARIMA) models. The moving average model is also sometimes used for forecasting, but its main purpose is to smooth the series. Any univariate technique holds good only for a short-run forecast.
However, ARIMA produces forecasts for a few more periods ahead than the simple exponential smoothing model, which gives forecasts for only a very short period. The modeling procedure involves (a) training, (b) validation and (c) forecasting. A training dataset of approximately two-thirds of the original data is used for training; the remaining one-third is reserved for validating the forecasts. Once the parameters are optimized, they are used in the forecast process.

Exponential smoothing techniques

Exponential smoothing models have been in use for a long time, and their strength lies in their simplicity. For years, the weights of the exponential smoothing model were decided a priori from experience and used to extract forecast values. However, with the advent of advanced computational techniques and computing power, the optimal values of the weights can now be determined statistically. There are different kinds of exponential smoothing models; the most popular among them are the simple exponential, Holt's, Winter's and Holt-Winter's models.

Naive model
F_{t+i} = Y_t

The naive model is useful and will perform most satisfactorily when the actual historical data are very short and contain no systematic pattern, or a pattern that changes very slowly.

Mean forecast model
F_{t+i} = Ȳ

The mean forecast model will perform most satisfactorily when the actual historical data fluctuate around a constant or stationary value.

Average change model
F_{t+1} = Y_t + average of changes

where

average of changes = (ΔY_{t-1} + ΔY_t) / 2

so that

F_{t+1} = Y_t + (ΔY_{t-1} + ΔY_t) / 2

Equivalently, the current predicted value is

F_t = Y_{t-1} + (ΔY_{t-2} + ΔY_{t-1}) / 2

where ΔY_t = Y_t − Y_{t-1}.

Average percent change model

F_{t+i} = Y_t + average of percent changes

where

average of percent changes = ((ΔY_{t-1} / Y_{t-1}) + (ΔY_t / Y_t)) / 2
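The four baseline models above can be sketched in a few lines of Python (an illustration; the function names and sample data are mine, not from the text):

```python
def naive_forecast(y):
    """Naive model: F_{t+i} = Y_t, the last observed value."""
    return y[-1]

def mean_forecast(y):
    """Mean forecast model: F_{t+i} = mean of the whole series."""
    return sum(y) / len(y)

def average_change_forecast(y):
    """Average change: F_{t+1} = Y_t + (dY_{t-1} + dY_t) / 2."""
    d_prev = y[-2] - y[-3]   # dY_{t-1} = Y_{t-1} - Y_{t-2}
    d_curr = y[-1] - y[-2]   # dY_t     = Y_t - Y_{t-1}
    return y[-1] + (d_prev + d_curr) / 2

def average_pct_change_forecast(y):
    """Average percent change, exactly as defined in the text."""
    pct_prev = (y[-2] - y[-3]) / y[-2]   # dY_{t-1} / Y_{t-1}
    pct_curr = (y[-1] - y[-2]) / y[-1]   # dY_t / Y_t
    return y[-1] + (pct_prev + pct_curr) / 2

series = [10.0, 12.0, 13.0, 15.0]
print(naive_forecast(series))           # 15.0
print(mean_forecast(series))            # 12.5
print(average_change_forecast(series))  # 15 + (1 + 2)/2 = 16.5
```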
Moving average

1. Simple moving average (SMA): The idea of the moving average is to find the trend in irregular data (smoothing the irregular data for better visual clarity). Assume that a future value will equal an average of past values:

F_t = SMA(n)_t = (Y_{t-n} + ... + Y_{t-3} + Y_{t-2} + Y_{t-1}) / n
where Y is the actual value of the series and F is the forecast value. An n-period moving average means that each new forecast moves ahead one period by adding the newest actual and dropping the oldest. For example, for a four-period moving average, the forecast values are
F_t = SMA_t = (Y_{t-4} + Y_{t-3} + Y_{t-2} + Y_{t-1}) / 4
F_{t-1} = SMA_{t-1} = (Y_{t-5} + Y_{t-4} + Y_{t-3} + Y_{t-2}) / 4
...
F_{t-k} = SMA_{t-k} = (Y_{t-k-4} + Y_{t-k-3} + Y_{t-k-2} + Y_{t-k-1}) / 4
Therefore, a forecast series of simple moving averages, SMA_t or F_t, is constructed. The forecast error is

e_t = Y_t − F_t

The goal in choosing among models with different averaging periods is to minimize these errors. To compare the accuracy of different models, some criteria are commonly used:
Mean error (ME) = Σ(Y_t − F_t) / n = Σe_t / n

Sum of squared errors (SSE) = Σ(Y_t − F_t)² = Σe_t²

Square root of mean squared error (RMSE) = √[ Σ(Y_t − F_t)² / (n − 1) ]
In general, the optimal number of periods in an SMA is the number that minimizes the RMSE. For example, a four-period average has a memory of only four past periods, and an eight-period average remembers only eight past observations. A long-period SMA yields the lowest RMSE when a series is very random and erratic. A short-period SMA yields the lowest RMSE when a series is random but moves smoothly up and down (such a series also exhibits high autocorrelation).
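Choosing the SMA period by minimum RMSE can be sketched as follows (the data and helper names are invented for the example):

```python
import math

def sma_forecasts(y, n):
    """One-step-ahead SMA forecasts F_t = (Y_{t-n} + ... + Y_{t-1}) / n,
    defined for t = n, ..., len(y) - 1."""
    return [sum(y[t - n:t]) / n for t in range(n, len(y))]

def rmse(actual, forecast):
    """Square root of mean squared error, with the n - 1 divisor used above."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / (len(errors) - 1))

y = [12.0, 14.0, 13.0, 15.0, 16.0, 14.0, 15.0, 17.0, 16.0, 18.0]

# Compare candidate averaging periods and keep the one with minimum RMSE.
scores = {n: rmse(y[n:], sma_forecasts(y, n)) for n in (2, 3, 4)}
best_n = min(scores, key=scores.get)
print(best_n, round(scores[best_n], 3))
```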
2. Double moving average: It is denoted MA(p × q), a p-period moving average of a q-period moving average, where q is the length of the first moving average and p that of the second. For example, a four-period simple moving average:
SMA_t = (Y_{t-4} + Y_{t-3} + Y_{t-2} + Y_{t-1}) / 4
SMA_{t-1} = (Y_{t-5} + Y_{t-4} + Y_{t-3} + Y_{t-2}) / 4
... etc.

A 3-period double moving average:

F_t = DMA_t = (SMA_{t-3} + SMA_{t-2} + SMA_{t-1}) / 3

In general,

F_t = DMA_t = (SMA_{t-n} + ... + SMA_{t-3} + SMA_{t-2} + SMA_{t-1}) / n
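A minimal sketch of the double moving average in Python (function names are illustrative):

```python
def sma_series(y, q):
    """All q-period simple moving averages of the series, oldest first."""
    return [sum(y[i - q:i]) / q for i in range(q, len(y) + 1)]

def dma_forecast(y, p, q):
    """MA(p x q): a p-period average of q-period moving averages,
    used here as the next-period forecast."""
    first = sma_series(y, q)    # q-period SMA series
    return sum(first[-p:]) / p  # average the most recent p of them
```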
Advantage: It smooths large random variations and is less influenced by outliers than the method of first differences.

Disadvantage: It does not model the seasonality of the series. There is also the problem of determining the optimal number of periods.

Application of the SMA in the financial market

The moving average indicator is also known as a trend indicator or primary indicator. Calculating an n-period moving average is a simple mechanical arithmetic operation; besides its use as a trend indicator, the MA is also used as a support or resistance indicator for stock market fluctuations. For example, when the price level goes up and the MA curve also goes up, the market has an upward trend. But when the market drops suddenly while the MA continues to rise, the MA indicates that the market should have some support points at which the price may reverse. The MA thus provides an alarm signal that the market may turn upward or downward.
[Figure: price (HSPI) series and its MA curve over time, with buy and sell points marked where the price crosses the MA]
In general, the best MA(n) model is the one that provides the minimum SSE or RMSE, because the most important general objective in forecasting is to decrease the width of the confidence intervals used to make probability statements about future values:

Actual = Forecast ± Z · RMSE

A longer-period moving average yields the lowest RMSE when a series is very random, erratic or pattern-less (no high autocorrelation). However, if the series is random but moves smoothly up and down, i.e. a random walk (high autocorrelation), a short-period MA will yield a lower RMSE. Generally, moving average models work well with pattern-less time series that have no trend or seasonality.
Weighted moving average

It is normally true that the immediate past is most relevant in predicting the immediate future. For this reason, the weighted moving average (WMA) places more weight on the most recent observations.
F_t = WMA(4)_t = 0.4Y_{t-1} + 0.3Y_{t-2} + 0.2Y_{t-3} + 0.1Y_{t-4}
Advantages: The weights placed on past observations can be varied, although determining the optimal weights can be costly. This type of model is most useful when the historical data are characterized by period-to-period changes of approximately the same size.

Limitations of WMA models: They do not model seasonality or trend. It is very difficult to determine the optimal number of periods, the RMSE may not be the best criterion, and finding the optimal weights is costly. Therefore, the WMA is not frequently used.
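The four-period WMA above can be computed as follows (a sketch; the weights are those given in the formula, the data are invented):

```python
def wma_forecast(y, weights):
    """Weighted moving average forecast. weights[0] is applied to the most
    recent observation Y_{t-1}, weights[1] to Y_{t-2}, and so on."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to one"
    recent_first = y[::-1]  # reverse so the newest observation comes first
    return sum(w * v for w, v in zip(weights, recent_first))

# F_t = 0.4*14 + 0.3*13 + 0.2*11 + 0.1*10 = 12.7
print(wma_forecast([10.0, 11.0, 13.0, 14.0], [0.4, 0.3, 0.2, 0.1]))
```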
Single exponential smoothing

The forecast requires only three pieces of data: the most recent actual, the most recent forecast, and a smoothing constant. The smoothing constant (α) determines the weight given to the most recent past observations and therefore controls the rate of smoothing or averaging. (The lower the alpha, the greater the smoothing; a higher alpha reflects more of the recent past.)
F_t = αY_{t-1} + (1 − α)F_{t-1}

The current forecast equals a weighted average of the most recent actual value and the most recent forecast value. The first actual value is chosen as the forecast for the second period. In another form:

F_t = F_{t-1} + α(Y_{t-1} − F_{t-1})

That is, the current forecast equals the old forecast plus a fraction (α) of the error in the previous forecast, where α always lies between zero (full smoothing, i.e. a horizontal line) and one (no smoothing). The best α is the one that minimizes a chosen criterion, e.g. the RMSE. If greater weight is to be given to the most recent actual value, a high smoothing constant is chosen; this is referred to as low smoothing. α = 1 provides no smoothing, because the forecast equals the most recent actual value; this is zero smoothing, and the model becomes a one-period MA (the naive model).

Principles for determining α in SES:
(1) For random, pattern-less and erratic time series data, use a smaller value of α (more smoothing).
(2) For random walk data (which randomly and smoothly walks up and down without any repeating pattern), use a larger value of α.
(3) If a greater amount of smoothing is desired, use a longer-period MA, i.e. a smaller α in SES; if a smaller amount of smoothing is desired, use a shorter-period MA, i.e. a higher α in SES.
(4) Try different values of α when fitting the SES and choose the optimal α based on the minimum RMSE.

Derivation of the exponential weights for past actuals
F_t = αY_{t-1} + (1 − α)F_{t-1}
F_{t-1} = αY_{t-2} + (1 − α)F_{t-2}
F_{t-2} = αY_{t-3} + (1 − α)F_{t-3}
F_{t-3} = αY_{t-4} + (1 − α)F_{t-4}

After substitution:

F_t = αY_{t-1} + α(1 − α)Y_{t-2} + α(1 − α)²Y_{t-3} + α(1 − α)³Y_{t-4} + ... + (1 − α)^∞ F_{t-∞}

F_t = α Σ_{s=0}^{∞} (1 − α)^s Y_{t-s-1}

where (1 − α)^∞ F_{t-∞} = 0. This simply states that the forecast in period t is equal to a weighted average of all past actual values plus one initial forecast term (which goes to zero). All the weights sum to 1, since

α Σ_{s=0}^{∞} (1 − α)^s = α / (1 − (1 − α)) = 1
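The SES recursion translates directly to code; here is a sketch, including a crude grid search for α as suggested in principle (4) above (names and grid are my own choices):

```python
def ses_forecasts(y, alpha):
    """SES: F_{t+1} = alpha*Y_t + (1 - alpha)*F_t, seeded with F_2 = Y_1.
    Returns [F_2, F_3, ...]; the last entry is the next-period forecast."""
    f = [y[0]]
    for t in range(1, len(y)):
        f.append(alpha * y[t] + (1 - alpha) * f[-1])
    return f

def best_alpha(y, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick the alpha in the grid minimizing in-sample squared errors."""
    def sse(a):
        f = ses_forecasts(y, a)
        # f[t - 1] is the forecast of y[t]
        return sum((y[t] - f[t - 1]) ** 2 for t in range(1, len(y)))
    return min(grid, key=sse)
```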
Seasonal simple exponential smoothing

The SES model can easily be applied to seasonal data that do not possess a trend.
F_t = αY_{t-s} + (1 − α)F_{t-s}

where the lag s is 4 for quarterly data, 12 for monthly data and 7 for weekly data. A derivation similar to that for SES gives
F_t = αY_{t-s} + α(1 − α)Y_{t-2s} + α(1 − α)²Y_{t-3s} + α(1 − α)³Y_{t-4s} + ... + (1 − α)^n F_{t-ns}

Any good forecasting model should yield residuals with no significant patterns left, either in the residuals themselves or in their autocorrelation functions (ACFs).
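A sketch of the seasonal variant, assuming the first cycle of actuals seeds the first s forecasts (that seeding is my assumption; the text does not specify the initialization):

```python
def seasonal_ses(y, alpha, s):
    """Seasonal SES: F_t = alpha*Y_{t-s} + (1 - alpha)*F_{t-s}."""
    f = list(y[:s])  # seed: the first cycle of actuals
    for t in range(s, len(y)):
        # forecast for period t + s, from the actual and forecast s back
        f.append(alpha * y[t] + (1 - alpha) * f[t - s])
    return f
```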
Brown's Double Exponential Smoothing: Let S′ denote the single-smoothed and S″ the double-smoothed value:
S′_t = αY_t + (1 − α)S′_{t-1}

S″_t = αS′_t + (1 − α)S″_{t-1}

a_t = S′_t + (S′_t − S″_t) = 2S′_t − S″_t

where a_t is the smoothed value at the end of period t.

b_t = α(S′_t − S″_t) / (1 − α)

where b_t is the estimated trend at the end of period t. Therefore, the m-period-ahead forecast is

F_{t+m} = a_t + m·b_t

As with SES, double smoothing also requires starting values to initialize the formulas. The common initialization for Brown's exponential smoothing is

S′_1 = S″_1 = Y_1;  a_1 = Y_1;  b_1 = ((Y_2 − Y_1) + (Y_4 − Y_3)) / 2
The choice of the initialization values can greatly affect the fits and forecasts.

Advantages: It models the trend and level of a time series. It is computationally more efficient than the double moving average and requires less data. Because only one parameter is used, parameter optimization is simple.

Disadvantages: There is some loss of flexibility, because the best smoothing constants for the level and the trend may not be equal. It does not model the seasonality of a series.
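Brown's recursions can be sketched as follows, using the simplest initialization S′_1 = S″_1 = Y_1:

```python
def brown_forecast(y, alpha, m):
    """Brown's double exponential smoothing: returns the m-period-ahead
    forecast F_{t+m} = a_t + m*b_t from the end of the series."""
    s1 = s2 = y[0]  # S'_1 = S''_1 = Y_1
    for v in y[1:]:
        s1 = alpha * v + (1 - alpha) * s1   # single-smoothed value S'
        s2 = alpha * s1 + (1 - alpha) * s2  # double-smoothed value S''
    a = 2 * s1 - s2                         # level at the end of period t
    b = alpha * (s1 - s2) / (1 - alpha)     # trend at the end of period t
    return a + m * b
```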
Holt's two-parameter model (exponential smoothing adjusted for trend)

This model uses a second smoothing constant, β, to separately smooth the linear trend. The exponential smoothing value at the end of time period t is

S_t = αY_t + (1 − α)(S_{t-1} + b_{t-1})

where α is the level smoothing constant.
The trend estimate:

b_t = β(S_t − S_{t-1}) + (1 − β)b_{t-1}

where β is the trend smoothing constant. The m-period-ahead forecast is

F_{t+m} = S_t + m·b_t

where m is the forecast horizon. Initial values:

S_1 = Y_1;  b_1 = ((Y_2 − Y_1) + (Y_4 − Y_3)) / 2
Advantages: The smoothed level is adjusted by the trend from the same period; this adjustment eliminates the natural lag of single smoothing. α is used to smooth the new actual together with the trend-adjusted previous smoothed level, while β is used to smooth out, or average, the trend; this removes some of the random error that would be reflected in the unsmoothed trend (S_t − S_{t-1}). The model is also more flexible, in that the level and trend can be smoothed with different weights.

Limitations: It is difficult to decide which initial smoothing constants should be used and optimized, because various combinations of the two smoothing parameters α and β must be tried; the search for the best combination of parameters is therefore more complex. It also does not model seasonal series.
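Holt's model in code, a sketch using the initial values given above (S_1 = Y_1 and the averaged-difference b_1, which needs at least four observations):

```python
def holt_forecast(y, alpha, beta, m):
    """Holt's two-parameter smoothing: F_{t+m} = S_t + m*b_t."""
    s = y[0]                                 # S_1 = Y_1
    b = ((y[1] - y[0]) + (y[3] - y[2])) / 2  # b_1
    for v in y[1:]:
        s_prev = s
        s = alpha * v + (1 - alpha) * (s_prev + b)  # smoothed level
        b = beta * (s - s_prev) + (1 - beta) * b    # smoothed trend
    return s + m * b

# On an exactly linear series the model reproduces the line:
print(holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], 0.3, 0.2, 1))  # 6.0
```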
Winter's three-parameter model (exponential smoothing adjusted for trend and seasonal variation)

This model extends Holt's model to the seasonal case by including a third smoothing operation to adjust for seasonality. The exponential smoothing series is

S_t = α · Y_t / SI_{t-s} + (1 − α)(S_{t-1} + b_{t-1})

where α is the level smoothing constant and s is the length of the seasonal cycle (e.g. 12 months or 4 quarters). The trend estimate:

b_t = β(S_t − S_{t-1}) + (1 − β)b_{t-1}

where β is the trend smoothing constant.
The seasonality estimate:

SI_t = γ · Y_t / S_t + (1 − γ)SI_{t-s}

where γ is the seasonal smoothing constant. The m-period-ahead forecast is

F_{t+m} = (S_t + m·b_t) · SI_{t-s+m}

In this equation the trend is additive and the seasonal influence is multiplicative. Alternative models can also be applied:

F_{t+m} = (S_t + m·b_t) + SI_{t-s+m}   (both trend and seasonal additive)

F_{t+m} = (S_t · b_t^m) + SI_{t-s+m}   (trend multiplicative, seasonal additive)

F_{t+m} = (S_t · b_t^m) · SI_{t-s+m}   (both trend and seasonal multiplicative)
Initial values of b_1, S_1 and SI_1 to SI_12 can be estimated using the seasonal-index decomposition method and the trend line from the percent-moving-average method discussed previously. Alternatively, a regression decomposition method can be used to determine these initial values.

Data requirements: at least 36 months (three seasons) of monthly data, 16-20 quarters (four to five seasons) of quarterly data, or 156 weeks (three seasons) of weekly data.

Advantages: Winter's method provides an alternative way to adjust for randomness, trend and seasonality when the data have a seasonal pattern. In particular, the seasonal factor is calculated for the next cycle of forecasting and used to forecast values one or more seasonal cycles ahead, and the seasonal indexes are easily interpreted.

Disadvantages: It is difficult to determine the initial values of α, β and γ. The model is too complex for data without identifiable trend and seasonality. Outliers can have a very large effect on the forecast.
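A sketch of the additive-trend, multiplicative-seasonal variant. The crude initialization below (first-cycle mean for the level, first-cycle ratios for the seasonal indexes) is my simplification, not the decomposition-based method described above:

```python
def winters_forecast(y, alpha, beta, gamma, s, m):
    """Holt-Winters: F_{t+m} = (S_t + m*b_t) * SI_{t-s+m}."""
    level = sum(y[:s]) / s                            # mean of first cycle
    trend = (sum(y[s:2 * s]) - sum(y[:s])) / (s * s)  # avg per-period change
    season = [v / level for v in y[:s]]               # SI_1 ... SI_s
    for t in range(s, len(y)):
        prev = level
        level = alpha * y[t] / season[t % s] + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
        season[t % s] = gamma * y[t] / level + (1 - gamma) * season[t % s]
    # seasonal index matching the forecast period t + m
    return (level + m * trend) * season[(len(y) + m - 1) % s]
```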
Adaptive-Response-Rate Exponential Smoothing (ARRES) model

This model is similar to the simple exponential smoothing model. However, when the errors are not constant and vary over time, the constant alpha needs to be adjusted as the errors become high or low; the smoothing parameter therefore adapts to the data. The basic formula is:
F_t = α_t Y_{t-1} + (1 − α_t)F_{t-1}

or

F_t = F_{t-1} + α_t(Y_{t-1} − F_{t-1})
The basic equation is similar to SES except that α is replaced by α_t, the adaptive alpha, whose value lies between 0 and 1. The adaptive response rate α_t is the ratio of the absolute value of the smoothed forecast error to the smoothed absolute forecast error, and is called a "tracking signal" because it tracks errors over time.
α_t = |A_t / M_t|

A_t = β·E_t + (1 − β)A_{t-1}

M_t = β·|E_t| + (1 − β)M_{t-1}

where E_t = Y_t − F_t.
Here α_t is the tracking signal in period t, used as the alpha for forecasting period t + 1; A_t is the smoothed estimate of the forecast error, a weighted average of A_{t-1} and the last forecast error E_t; and M_t is the smoothed estimate of the absolute forecast error, a weighted average of M_{t-1} and the last absolute forecast error |E_t|. Initialization is a little more cumbersome for the ARRES method:
α_2 = α_3 = α_4 = β = 0.2;  F_2 = Y_1;  A_1 = M_1 = 0
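A sketch of ARRES in Python; the starting alpha follows the initialization above (α = β = 0.2, F_2 = Y_1, A_1 = M_1 = 0), and the exact timing of when the tracking signal replaces alpha is my simplification:

```python
def arres_forecasts(y, beta=0.2, alpha0=0.2):
    """Adaptive-response-rate SES: the weight alpha_t = |A_t / M_t| adapts
    as the smoothed error A_t and smoothed absolute error M_t evolve."""
    f = [y[0]]       # F_2 = Y_1
    a = m = 0.0      # A_1 = M_1 = 0
    alpha = alpha0
    for t in range(1, len(y)):
        e = y[t] - f[-1]                    # E_t = Y_t - F_t
        a = beta * e + (1 - beta) * a       # smoothed error
        m = beta * abs(e) + (1 - beta) * m  # smoothed absolute error
        f.append(f[-1] + alpha * e)         # next forecast uses current alpha
        if m > 0:
            alpha = abs(a / m)              # tracking signal for next period
    return f
```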
If the model is forecasting accurately, A will be nearly equal to zero and the ratio α_t will therefore be low. However, if the model consistently under- or over-forecasts, A will approach the value of M and α will approach one.

Advantages: It is capable of representing almost all data patterns; the value of α_t changes automatically whenever a change in the data pattern dictates that a change is desired.

Limitations: It does not work well for data that are very random with low autocorrelation. The process of recomputing the necessary statistics each time new observations become available is relatively cumbersome. The forecasts from this technique lag turning points by one period; that is, it does not anticipate turning points in the forecast series.

References
Makridakis, S., Wheelwright, S.C. and Hyndman, R.J. (1998). Forecasting: Methods and Applications, 3rd edition. Wiley.
Hamilton, J.D. (1994). Time Series Analysis. Princeton University Press.
Pankratz, A. (1983). Forecasting with Univariate Box-Jenkins Models: Concepts and Cases. Wiley.