Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals.
In statistics, the autocorrelation of a random process is the Pearson correlation between values of the process at different times, as a function of the two times or of the time lag. 1
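The sample Pearson correlation coefficient between two series $x$ and $y$ of $N$ paired values is:

$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2 \, \sum_{i=1}^{N}(y_i-\bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of $x$ and $y$.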
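For autocorrelation, this coefficient is computed between the time series and the same time series lagged by a specified number of periods. For example, for a 1-period time lag, the correlation coefficient is computed between the first N-1 values, i.e. $x_1, x_2, \ldots, x_{N-1}$, and the next N-1 values (the values shifted by one), i.e. $x_2, x_3, \ldots, x_N$:

$$r_1 = \frac{\sum_{i=1}^{N-1}(x_i-\bar{x}_{(1)})(x_{i+1}-\bar{x}_{(2)})}{\sqrt{\sum_{i=1}^{N-1}(x_i-\bar{x}_{(1)})^2 \, \sum_{i=1}^{N-1}(x_{i+1}-\bar{x}_{(2)})^2}}$$

where $\bar{x}_{(1)}$ is the mean of the first N-1 values, and $\bar{x}_{(2)}$ is the mean of the last N-1 values.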
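As a rough illustration of the lag-1 computation described above, the following Python sketch splits a series into its first and last N-1 values and applies the ordinary Pearson formula to the two sub-series. The function name `lag1_pearson` is a hypothetical helper chosen for this example, not part of any library.

```python
import math


def lag1_pearson(x):
    """Exact lag-1 sample correlation: Pearson r between x[:-1] and x[1:]."""
    first, second = x[:-1], x[1:]          # x_1..x_{N-1} and x_2..x_N
    n = len(first)
    if n < 2:
        raise ValueError("need at least 3 observations")
    mean1 = sum(first) / n                 # mean of the first N-1 values
    mean2 = sum(second) / n                # mean of the last N-1 values
    num = sum((a - mean1) * (b - mean2) for a, b in zip(first, second))
    den = math.sqrt(sum((a - mean1) ** 2 for a in first)
                    * sum((b - mean2) ** 2 for b in second))
    if den == 0:
        raise ValueError("correlation is undefined for a constant sub-series")
    return num / den
```

For a perfectly linear series such as `[1, 2, 3, 4, 5]`, the two sub-series move in lockstep and the function returns 1.0; for a strictly alternating series such as `[1, -1, 1, -1, 1]` it returns -1.0.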
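If we ignore the difference between the two sub-series means $\bar{x}_{(1)}$ and $\bar{x}_{(2)}$, replace both with the overall mean $\bar{x}$ of all $N$ values, and approximate both sums of squares in the denominator by the sum over the whole series, we can simplify the formula above to

$$r_1 = \frac{\sum_{i=1}^{N-1}(x_i-\bar{x})(x_{i+1}-\bar{x})}{\sum_{i=1}^{N}(x_i-\bar{x})^2}$$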
This can be generalized for values separated by $k$ periods as:

$$r_k = \frac{\sum_{i=1}^{N-k}(x_i-\bar{x})(x_{i+k}-\bar{x})}{\sum_{i=1}^{N}(x_i-\bar{x})^2}$$
The value of $r_k$ is called the autocorrelation coefficient at lag $k$. The plot of the sample autocorrelations $r_k$ versus $k$ (the time lags) is called the correlogram or autocorrelation plot.
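The lag-k coefficient and the list of values that a correlogram plots can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the simplified definition with the overall mean in both numerator and denominator; `autocorrelation` and `correlogram` are hypothetical names used only for this example.

```python
def autocorrelation(x, k):
    """Simplified sample autocorrelation r_k using the overall mean."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)       # sum over all N values
    if den == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Products of deviations for the N-k pairs that are k periods apart.
    num = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    return num / den


def correlogram(x, max_lag):
    """Return [r_0, r_1, ..., r_max_lag], the values plotted in a correlogram."""
    return [autocorrelation(x, k) for k in range(max_lag + 1)]


print(correlogram([1, 2, 3, 4, 5], 2))  # [1.0, 0.4, -0.1]
```

Note that for this five-point series the simplified lag-1 value is 0.4, while the exact Pearson coefficient between `[1, 2, 3, 4]` and `[2, 3, 4, 5]` is 1, so the two definitions can differ noticeably for very short series.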
The correlogram is a commonly used tool for checking randomness in a data set. This randomness is ascertained by computing autocorrelations for data values at varying time lags. If random, such autocorrelations should be near zero for any and all time-lag separations. If non-random, then one or more of the autocorrelations will be significantly non-zero.
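One way to make "significantly non-zero" concrete is sketched below. It assumes the common large-sample rule of thumb that, for a purely random series of N values, about 95% of the sample autocorrelations fall within ±1.96/√N. That bound is not stated in the text above and is an assumption of this example, as are the helper names `autocorrelation` and `significant_lags`.

```python
import math


def autocorrelation(x, k):
    """Simplified sample autocorrelation r_k using the overall mean."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    return num / den


def significant_lags(x, max_lag, z=1.96):
    """Lags 1..max_lag whose |r_k| exceeds the approximate bound z / sqrt(N)."""
    bound = z / math.sqrt(len(x))       # assumed white-noise bound
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(x, k)) > bound]


# A strictly periodic series (period 4) is clearly non-random.
periodic = [0, 1, 0, -1] * 25
print(significant_lags(periodic, 4))  # [2, 4]
```

Here the lag-2 and lag-4 coefficients are -0.98 and 0.96, far outside the bound of about 0.196, while the lag-1 and lag-3 coefficients are exactly zero for this particular series.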
In addition, correlograms are used in the model identification stage for Box–Jenkins autoregressive moving average time series models. Autocorrelations should be near-zero for randomness; if the analyst does not check for randomness, then the validity of many of the statistical conclusions becomes suspect. The correlogram is an excellent way of checking for such randomness.2
The default data for the calculator below was obtained by adding noise to a sine function with the Noisy function generator, and the resulting correlogram shows a clearly non-random pattern.
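A comparable data set can be generated offline. The sketch below is a stand-in for the generator mentioned above, not its actual implementation: the function `noisy_sine`, the period of 40 samples, the Gaussian noise with standard deviation 0.3, and the fixed seed are all assumptions made for this example.

```python
import math
import random


def noisy_sine(n, period, noise_sd, seed=0):
    """n samples of a sine wave with the given period plus Gaussian noise."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    return [math.sin(2 * math.pi * i / period) + rng.gauss(0, noise_sd)
            for i in range(n)]


def autocorrelation(x, k):
    """Simplified sample autocorrelation r_k using the overall mean."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    return num / den


series = noisy_sine(n=400, period=40, noise_sd=0.3)
for lag in (10, 20, 40):                # quarter, half and full period
    print(lag, round(autocorrelation(series, lag), 3))
```

With these settings the coefficient should be strongly negative at half a period (lag 20) and strongly positive at a full period (lag 40), which is the oscillating correlogram shape expected from a periodic signal hidden in noise.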