
Exponential smoothing

Exponential smoothing theory

I was going to write an article about technical indicators and tell you about the exponential moving average. However, it turned out that, while studying the theory behind this indicator, I came across some interesting things related more to statistics than to the stock market or forex.

Since statistics has already been mentioned on this site, I've decided to write a separate article about this topic, namely an article about exponential smoothing in time series analysis.

This topic was raised in the article Seasonal fluctuations. Seasonal indices. Constant mean method., which noted in particular that the calculation of average seasonal indices by the constant mean method can be applied only to time series with no upward or downward trend, or where the trend is negligible. In other words, the observed value fluctuates around some constant level.

What does that mean? It means that the constant mean is, by definition, constant, and because of that it cannot capture a trend.
Let's illustrate this with a graph.

[Chart: Constant average. A time series with a pronounced trend and its constant average]

Generally speaking, all averaging methods are intended to eliminate the "noise" of random scatter in the data. This allows us to identify more clearly the trend, or the seasonal or cyclic changes (that is, the internal structure of seemingly random data) and use them to build a model for analysis and forecasting of future values. But, as we can see, the simple averaging method does not work when there is a pronounced trend, and we cannot predict anything with its help.
We need to obtain not a single average but a series of averages. The most popular (and simplest) method of obtaining such a series is exponential smoothing.

It can be described as follows: when forecasting, more recent observations are given greater weight than older ones, and the weights of older observations decrease exponentially.

Now let's express this definition with formulas.
Traditionally, the observed value is denoted $y$, and the smoothed average $S$.
Then,
$S_1$ is undefined
$S_2 = y_1$
$S_3 = \alpha y_2 + (1-\alpha)S_2$

and, generalized

$S_t = \alpha y_{t-1} + (1-\alpha)S_{t-1}$

where $\alpha$ takes a value from the range [0;1)
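The recursion above can be sketched in a few lines of Python (the function name and sample data here are my own, for illustration):

```python
def exponential_smoothing(y, alpha):
    """Smoothed series: s[0] is S_1 (undefined), s[1] is S_2 = y_1,
    and every later value is alpha*y[t-1] + (1-alpha)*s[t-1]."""
    s = [None, y[0]]                      # S_1 undefined, S_2 = y_1
    for t in range(2, len(y)):
        s.append(alpha * y[t - 1] + (1 - alpha) * s[t - 1])
    return s

print(exponential_smoothing([3, 10, 12, 13], 0.5))
# [None, 3, 6.5, 9.25]: S_3 = 0.5*10 + 0.5*3, S_4 = 0.5*12 + 0.5*6.5
```

Note that each smoothed value uses only the previous observation and the previous smoothed value, so the whole series is computed in a single pass.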

Where does the exponent come from? Let's expand the previous average:

$S_t = \alpha y_{t-1} + (1-\alpha)S_{t-1} = \alpha y_{t-1} + (1-\alpha)[\alpha y_{t-2} + (1-\alpha)S_{t-2}] = \alpha y_{t-1} + (1-\alpha)[\alpha y_{t-2} + (1-\alpha)[\alpha y_{t-3} + (1-\alpha)S_{t-3}]]$

$S_t = \alpha y_{t-1} + \alpha(1-\alpha)y_{t-2} + \alpha(1-\alpha)^{2}y_{t-3} + (1-\alpha)^{3}S_{t-3}$

and, generalized

$S_t=\alpha\sum_{i=1}^{t-2}(1-\alpha)^{i-1}y_{t-i} + (1-\alpha)^{t-2}S_2$, for t > 2
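This expanded form can be checked against the recursion numerically. A minimal sketch (function names and sample data are my own):

```python
def smoothed_direct(y, alpha, t):
    """S_t (1-based index, t >= 3) from the expanded geometric-weight sum:
    alpha * sum_{i=1}^{t-2} (1-alpha)^(i-1) * y_{t-i} + (1-alpha)^(t-2) * S_2."""
    weighted = sum(alpha * (1 - alpha) ** (i - 1) * y[t - i - 1]
                   for i in range(1, t - 1))
    return weighted + (1 - alpha) ** (t - 2) * y[0]   # S_2 = y_1

def smoothed_recursive(y, alpha, t):
    """Same S_t via the recursion S_t = alpha*y_{t-1} + (1-alpha)*S_{t-1}."""
    s = y[0]                                          # S_2 = y_1
    for k in range(3, t + 1):
        s = alpha * y[k - 2] + (1 - alpha) * s
    return s

y = [3, 10, 12, 13, 12]
print(smoothed_direct(y, 0.5, 5), smoothed_recursive(y, 0.5, 5))
# both give 11.125
```

Both functions produce the same value for every $t$, confirming that unrolling the recursion yields geometrically decreasing weights.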

Thus, the weights on the $y$ values form an infinitely decreasing geometric progression with common ratio $1-\alpha$.
And the further along the series $S_t$ is, the less it is affected by the initial values.

Let's assume that $y_1=1000$ and see how its contribution changes across the various $S_t$.

Weight values change for exponential smoothing


For S2, it is taken as is (1000), but in S3, with a coefficient $\alpha$ of 0.5, the contribution of $y_1$ is only 500, in S4 it is 250, and so on.
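The decay of the weight can be tabulated directly: $y_1$ enters $S_2$ with weight 1 and $S_t$ (for $t \ge 3$) with weight $(1-\alpha)^{t-2}$. A short illustration with the values assumed above:

```python
# Weight of y1 in successive smoothed values: 1 for S_2, then
# (1 - alpha)**(t - 2) for S_t with t >= 3 (illustrative alpha and y1).
alpha, y1 = 0.5, 1000
contributions = {t: (1 if t == 2 else (1 - alpha) ** (t - 2)) * y1
                 for t in range(2, 7)}
print(contributions)
# {2: 1000, 3: 500.0, 4: 250.0, 5: 125.0, 6: 62.5}
```

Each step halves the contribution because the common ratio of the progression is $1-\alpha = 0.5$.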

At the same time, the choice of the coefficient $\alpha$ is important. If you play around with the $\alpha$ parameter in the calculator above, you can see that the higher its value, the faster an observation ceases to affect the smoothed average, and vice versa: the lower the value, the longer the observation retains its influence.

Accordingly, for small $\alpha$, the method of obtaining $S_2$ has a great influence on the result. Assigning $S_2=y_1$ is just one of the methods; as an alternative, the initial value may be taken as, for example, a simple average of the first few values of $y$.

But how do you choose $\alpha$? Which value is most suitable for modeling the dynamics of a given series? There is no mathematical formula for calculating the exact $\alpha$; it is most often chosen by trial and error.
The method consists in taking several values of $\alpha$ and selecting the best one among them. What is the criterion of "best" in our case?

The criterion is the minimum mean squared error. The error is the deviation of the forecast from the actual value. For each $S$ it is squared, to remove the influence of the sign, and then the average of all the squared values is computed. The $\alpha$ that gives the minimum average is the best of the candidates.
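The trial-and-error selection can be sketched as a grid search (function name, grid, and sample data are my own):

```python
def mse_for_alpha(y, alpha):
    """Mean of squared one-step errors: S_t forecasts y_t, error = y_t - S_t."""
    s = y[0]                                # S_2 = y_1, the forecast for y_2
    sq_errors = []
    for t in range(1, len(y)):
        sq_errors.append((y[t] - s) ** 2)   # squaring removes the sign
        s = alpha * y[t] + (1 - alpha) * s  # next smoothed value
    return sum(sq_errors) / len(sq_errors)

# "Trial and error": evaluate a coarse grid and keep the alpha with minimum MSE
y = [3, 10, 12, 13, 12, 10, 12]
best = min((a / 10 for a in range(1, 10)), key=lambda a: mse_for_alpha(y, a))
print(best, mse_for_alpha(y, best))
```

A finer grid (or a numerical minimizer) can then refine the chosen value.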

Now a few words about the prediction.

The next value of the series is predicted directly from the formula
$S_{forecast} = \alpha y_{last} + (1-\alpha)S_{last}$

If a forecast for more than one step ahead is needed, a technique called bootstrapping is used: the last known value of $y$ is taken as a constant and fed into the recursive formula

$S_{forecast+n} = \alpha y_{origin} + (1-\alpha)S_{forecast+n-1}$
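This bootstrapped forecast can be sketched as follows (function name and sample data are my own):

```python
def bootstrap_forecast(y, alpha, n):
    """n-step forecast: the last observation y[-1] is held constant and
    fed into the recursion S = alpha*y_last + (1-alpha)*S repeatedly."""
    s = y[0]                                # S_2 = y_1
    for t in range(1, len(y) - 1):          # smooth up to the last in-sample S
        s = alpha * y[t] + (1 - alpha) * s
    forecasts = []
    for _ in range(n):                      # then iterate with y[-1] fixed
        s = alpha * y[-1] + (1 - alpha) * s
        forecasts.append(s)
    return forecasts

print(bootstrap_forecast([3, 10, 12], 0.5, 3))
# [9.25, 10.625, 11.3125]
```

Each step moves the forecast a fraction $\alpha$ closer to the last observation, so repeated bootstrapping converges to $y_{last}$.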

Now let's apply this knowledge to calculate the smoothed average for the graph shown at the beginning of this article. To make it more interesting, we calculate the smoothed average for three values of $\alpha$ at once, and at the same time compute the mean squared error.

For reference, the graph also shows the next predicted value, i.e., the moving average extended one step beyond the actual data.

[Calculator: Exponential smoothing. The time series smoothed with three values of $\alpha$, with the root mean squared error shown for each]

By the way, I should note that the best value of $\alpha$ for the default data in the calculator above is 0.7.
With $\alpha$ equal to 1, smoothing degenerates into repeating the previous value, which, when neighboring values vary widely, does not always give the minimum mean squared error.

PLANETCALC, Exponential smoothing