
Exponential smoothing

Exponential smoothing theory
Timur, 2015-11-07 09:09:28

I was going to write an article about technical indicators and tell you about the exponential moving average. However, while studying the theory behind this indicator, I came across some interesting things that have more to do with statistics than with the stock market or forex.

Since statistics has already been mentioned on this site, I decided to write a separate article about it, namely an article about exponential smoothing in time series analysis.

This topic was raised in the article Seasonal fluctuations. Seasonal indices. Constant mean method. In particular, it said that calculating average seasonal indices with the constant mean method is applicable to time series that have no upward or downward trend, or only a negligible one. In other words, the observed value fluctuates around some constant value.

What does that mean? It means that the average is a constant, and because of that it cannot capture a trend.
Let's illustrate it with a graph.

[Graph: Constant average]

Generally speaking, all averaging methods are intended to eliminate "noise" from the random scatter of the data. This makes it easier to identify the trend, or the seasonal or cyclic changes, that is, the internal structure of seemingly random data, and to use it to build a model for analysis and forecasting of future values. But as we can see, the simple averaging method does not work if there is a pronounced trend, and we cannot predict anything with it.
We need to be able to obtain not just a single average but a series of averages. The most popular (and simplest) method of obtaining such a series is exponential smoothing.

It can be described as follows: when forecasting, newer observed values are given greater weight than older ones, while the weights of older values decrease exponentially.

Now let's express this definition with formulas.
Traditionally, the observed value is denoted by y and the smoothed average by S.
S_1 is undefined
S_2 = y_1
S_3 = \alpha y_2 + (1-\alpha)S_2

and, in general form,

S_t = \alpha y_{t-1} + (1-\alpha)S_{t-1}

where \alpha takes a value from the range (0;1]
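The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `exponential_smoothing` and the sample data are made up for this example and are not taken from the calculator on this page.

```python
def exponential_smoothing(y, alpha):
    """Return [S_2, S_3, ..., S_{n+1}] for observations y = [y_1, ..., y_n].

    S_1 is undefined, so it is not stored. The last element, S_{n+1},
    is computed from the last observation y_n.
    """
    if not y:
        return []
    s = [y[0]]                       # S_2 = y_1
    for value in y[1:]:              # y_2, y_3, ..., y_n
        # S_t = alpha * y_{t-1} + (1 - alpha) * S_{t-1}
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


series = [10.0, 12.0, 11.0, 15.0]    # made-up data for illustration
print(exponential_smoothing(series, 0.5))  # [10.0, 11.0, 11.0, 13.0]
```

Note that the result has as many elements as the input: it starts one step later (at S_2) and ends one step later (at S_{n+1}).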

Where does the exponent come from? Expand the previous average:

S_t = \alpha y_{t-1} + (1-\alpha)S_{t-1} = \alpha y_{t-1} + (1-\alpha)[\alpha y_{t-2} + (1-\alpha)S_{t-2}] = \alpha y_{t-1} + (1-\alpha)[\alpha y_{t-2} + (1-\alpha)[\alpha y_{t-3} + (1-\alpha)S_{t-3}]]
S_t = \alpha y_{t-1} + \alpha(1-\alpha)y_{t-2} + \alpha(1-\alpha)^{2}y_{t-3} + (1-\alpha)^{3}S_{t-3}

and, in general form,

S_t=\alpha\sum_{i=1}^{t-2}(1-\alpha)^{i-1}y_{t-i} + (1-\alpha)^{t-2}S_2, for t > 2

Thus, the weights of y form a decreasing geometric progression with common ratio 1-\alpha.
And the further along the series S is, the less it is affected by the initial values.
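As a quick numerical check, the expanded sum can be compared with the recursion. The sketch below assumes made-up data and hypothetical helper names (`smooth_recursive`, `smooth_closed_form`); the list `y` is zero-based, so y_k is stored at `y[k - 1]`.

```python
def smooth_recursive(y, alpha, t):
    """S_t computed by the recursion, for t >= 2."""
    s = y[0]                                    # S_2 = y_1
    for k in range(3, t + 1):                   # S_3, ..., S_t
        s = alpha * y[k - 2] + (1 - alpha) * s  # y_{k-1} is y[k - 2]
    return s


def smooth_closed_form(y, alpha, t):
    """S_t computed by the expanded sum, for t > 2."""
    total = sum(
        alpha * (1 - alpha) ** (i - 1) * y[t - i - 1]  # y_{t-i} is y[t - i - 1]
        for i in range(1, t - 1)                       # i = 1, ..., t-2
    )
    return total + (1 - alpha) ** (t - 2) * y[0]       # S_2 = y_1


series = [3.0, 5.0, 4.0, 8.0, 6.0]   # made-up data for illustration
alpha = 0.3
for t in range(3, len(series) + 2):
    difference = smooth_recursive(series, alpha, t) - smooth_closed_form(series, alpha, t)
    assert abs(difference) < 1e-12

# Weights of y_{t-1}, y_{t-2}, ...: each is (1 - alpha) times the previous one.
weights = [alpha * (1 - alpha) ** (i - 1) for i in range(1, 6)]
print(weights)
```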

Let's assume that y_1=1000 and see how its contribution changes for the various S.

[Graph: Weight values change for exponential smoothing]

In S_2, y_1 is taken as it is, but with a coefficient \alpha of 0.5 its contribution to S_3 is only 500, to S_4 250, to S_5 125, and so on.
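Following the generalized formula, the contribution of y_1 to S_t is (1-\alpha)^{t-2} y_1. A minimal sketch of that arithmetic (the function name `contribution_of_first` is made up for this example):

```python
def contribution_of_first(y1, alpha, t):
    """Contribution of y_1 to S_t (t >= 2), given S_2 = y_1."""
    return (1 - alpha) ** (t - 2) * y1


for t in range(2, 6):
    print(f"S_{t}: {contribution_of_first(1000, 0.5, t):g}")
# S_2: 1000
# S_3: 500
# S_4: 250
# S_5: 125
```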

The choice of the coefficient \alpha is also important. If you play around with the parameter "a" in the calculator above, it becomes clear that the higher the value, the faster a sample ceases to affect the smoothed average, and vice versa: the lower the value, the longer the sample retains its influence.

Accordingly, for small \alpha, the method of obtaining S_2 has a great influence on the result. The assignment S_2=y_1 is just one of the methods. As an alternative, the initial value may be, for example, a simple average of the first few values of y.
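To see how much the starting value matters for small \alpha, here is a rough sketch that runs the same recursion from two different values of S_2. The data, the choice of averaging the first three values, and the helper name `smooth_from` are assumptions made for this illustration only.

```python
def smooth_from(y, alpha, s2):
    """Run the recursion from a given starting value S_2; returns [S_2, ..., S_{n+1}]."""
    s = [s2]
    for value in y[1:]:              # y_2, ..., y_n
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


series = [20.0, 10.0, 12.0, 11.0, 13.0]   # made-up data with an untypical first value
alpha = 0.1

from_first = smooth_from(series, alpha, series[0])           # S_2 = y_1
from_mean = smooth_from(series, alpha, sum(series[:3]) / 3)  # S_2 = mean of y_1..y_3

# The gap between the two starting values shrinks by a factor of (1 - alpha) per step.
for a, b in zip(from_first, from_mean):
    print(round(a, 3), round(b, 3), round(a - b, 3))
```

With \alpha = 0.1 the gap between the two series is still about two thirds of its initial size after four steps, which is why the choice of S_2 is noticeable for small \alpha.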

But how do you choose \alpha? Which value is most suitable for modeling the dynamics of a given series? There is no mathematical formula for calculating the exact \alpha. It is most often chosen by trial and error.
The method is to take several values of \alpha and select the best one among them. What is the criterion of "best" in our case?

The criterion is to minimize the mean of squared errors. An error is the deviation of the actual value from the forecast. For each S value, the error is squared to get rid of the influence of the sign, and then the average of all the squares is calculated. The best \alpha among the candidates is the one for which this average is minimal.
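The selection procedure might look like the sketch below. It assumes that S_t is treated as the forecast for y_t (since S_t only uses values up to y_{t-1}), so errors are computed for t = 2..n. The function names and the candidate list are made up for this example.

```python
def smooth(y, alpha):
    """Return [S_2, ..., S_{n+1}] with S_2 = y_1."""
    s = [y[0]]
    for value in y[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


def mean_squared_error(y, alpha):
    """Mean of (y_t - S_t)^2 over t = 2..n; needs at least two observations."""
    s = smooth(y, alpha)
    # zip stops at the shorter list, pairing y_2..y_n with S_2..S_n
    squared = [(actual - forecast) ** 2 for actual, forecast in zip(y[1:], s)]
    return sum(squared) / len(squared)


def best_alpha(y, candidates):
    """Pick the candidate alpha with the smallest mean squared error."""
    return min(candidates, key=lambda a: mean_squared_error(y, a))


series = [1.0, 2.0, 3.0, 4.0, 5.0]        # made-up data with a steady trend
print(best_alpha(series, [0.1, 0.5, 1.0]))  # 1.0
```

For a series with a steady trend like this one, the largest candidate wins because the smoothed average lags behind the data less; on noisier data a smaller value may give a lower error.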

Now a few words about the prediction.

The next value of the series is predicted directly from the formula
S_{forecast} = \alpha y_{last} + (1-\alpha)S_{last}

If a forecast is needed for more than one step ahead, a technique called bootstrapping is used. The last known value of y is taken as a constant, y_{origin}, and is used in the recursive formula

S_{forecast+n} = \alpha y_{origin} + (1-\alpha)S_{forecast+n-1}
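A minimal sketch of both forecast formulas, assuming S_2 = y_1 as before; the function name `bootstrap_forecast` and the data are made up for this example.

```python
def bootstrap_forecast(y, alpha, steps):
    """Return `steps` forecasts beyond the last observation.

    The first forecast is alpha * y_last + (1 - alpha) * S_last; each
    further one reuses y_last (y_origin) in place of a new observation.
    """
    s = y[0]                              # S_2 = y_1
    for value in y[1:-1]:                 # y_2, ..., y_{n-1} give S_3, ..., S_n
        s = alpha * value + (1 - alpha) * s
    y_origin = y[-1]                      # last known observation, held constant
    forecasts = []
    for _ in range(steps):
        s = alpha * y_origin + (1 - alpha) * s
        forecasts.append(s)
    return forecasts


series = [10.0, 12.0, 11.0, 15.0]         # made-up data for illustration
print(bootstrap_forecast(series, 0.5, 3))  # [13.0, 14.0, 14.5]
```

Because the same y_{origin} is fed in at every step, each forecast moves a fraction \alpha of the remaining distance toward y_{origin}.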

Now let's apply this knowledge to calculate the smoothed average for the graph shown at the beginning of this article. To make it more interesting, we calculate the smoothed average for three values of \alpha at once, and at the same time calculate the mean squared error.

For reference, the graph also shows the next predicted value, i.e., the smoothed average is extended one sample beyond the actual data.

[Calculator and graph: Calculation of exponentially smoothed average]

By the way, I should note that for the default data in the calculator above, the best value of \alpha is 0.7.
With \alpha equal to 1, smoothing degenerates into repeating the previous observation (S_t = y_{t-1}), which, when neighboring values vary widely, does not always give the minimum mean squared error.
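The degenerate case is easy to confirm with a short sketch (made-up data and function name): with \alpha = 1 the term (1-\alpha)S_{t-1} vanishes, so the smoothed series is just the observations shifted by one step.

```python
def smooth(y, alpha):
    """Return [S_2, ..., S_{n+1}] with S_2 = y_1."""
    s = [y[0]]
    for value in y[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


series = [5.0, 9.0, 4.0, 10.0, 3.0]   # made-up data with large jumps
shifted = smooth(series, 1.0)          # S_t = y_{t-1}
print(shifted == series)               # True: same values, one step later
```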
