# Time series simulator (ARMA)

An important part of econometrics focuses on analyzing time series. We can apply time series analysis in almost any context; the only requirement is that we observe comparable data across different periods.
To analyze and understand time series, it is sometimes useful to simulate data. In their simplest form, time series processes can be classified into four categories: white noise, autoregressive, moving average, and a combination of the last two, called—unsurprisingly—autoregressive moving average.
Even though many programs can already perform these simulations out of the box (STATA, Matlab, Python modules, etc.), it is still interesting to develop them on our own. With this aim in mind, I developed a simple Python class that can simulate the series described.

# White noise process

A white noise process is as simple as it gets: it is just a sequence of random numbers (sometimes called “innovations”). As a curious side note, the term “white noise” did not originate in time series analysis; it was first used to describe a sound containing all frequencies at once[^1].
For our context, we will define a “white noise” as:

$y_{t} = \epsilon_{t}$

Where $\epsilon_{t}$ satisfies the following properties:

$\begin{gather} \tag{1}\label{eq:wn_1} E[\epsilon_{t}] = E[\epsilon_{t}|\epsilon_{t-1}, \epsilon_{t-2}, …] = 0\\ \tag{2}\label{eq:wn_2} V[\epsilon_{t}] = V[\epsilon_{t}|\epsilon_{t-1}, \epsilon_{t-2}, …] = \sigma^{2}\\ \tag{3}\label{eq:wn_3} E[\epsilon_{t}\epsilon_{t-j}] = E[\epsilon_{t} \epsilon_{t-j}|\epsilon_{t-1}, \epsilon_{t-2}, …] = 0 \quad \text{with } j \ne{0} \end{gather}$

Equation \eqref{eq:wn_1} guarantees that the process has a conditional mean of zero, equation \eqref{eq:wn_2} that the process has a constant variance (homoscedasticity), and equation \eqref{eq:wn_3} that the errors are uncorrelated. It is worth noting that no distribution is imposed on $\epsilon_{t}$, but to keep things simple, we will use a standard normal distribution (mean $0$, variance $1$). Here is that definition written in code:
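A minimal sketch of such a generator using NumPy (the function name and signature here are my own, not necessarily those of the original class):

```python
import numpy as np

def white_noise(n, sigma=1.0, seed=None):
    """Draw n independent innovations from a normal distribution
    with mean 0 and standard deviation sigma (a white noise series)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=sigma, size=n)
```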

# Moving average process

In simple words, a moving average process is just a weighted combination of values coming from a white noise process. Mathematically, we define a moving average process of $q$ lags (also called $MA(q)$) as:

$y_{t} = \mu + \epsilon_{t} + \theta_{1}\epsilon_{t-1} + \theta_{2}\epsilon_{t-2} + … + \theta_{q}\epsilon_{t-q}$

Here, each $\theta_{j}$ tells us how much the series “remembers” the innovation $\epsilon_{t-j}$. We add the parameter $\mu$ to allow the process to have a non-zero mean.
To simulate a moving average process, we need to:

1. Generate a white noise series.
2. Weight and sum the values of the white noise process.
3. Remove the first $q$ elements of the series ($q$ being the number of coefficients used).

Here are the steps listed above, now written in code:
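A minimal sketch of these three steps (the helper name `simulate_ma` and its signature are illustrative):

```python
import numpy as np

def simulate_ma(n, thetas, mu=0.0, seed=None):
    """Simulate an MA(q) process:
    y_t = mu + eps_t + theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}."""
    q = len(thetas)
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n + q)          # q extra draws to feed the lags
    y = np.empty(n + q)
    for t in range(q, n + q):
        # weight and sum the current and past innovations
        y[t] = mu + eps[t] + sum(th * eps[t - j - 1] for j, th in enumerate(thetas))
    return y[q:]                          # remove the first q elements
```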

# Autoregressive process

An autoregressive process starts with an arbitrary number and iteratively uses past results to generate future values. The parameters of this function determine how much the series “remembers” itself. Mathematically, we define an autoregressive process of $p$ lags (also called $AR(p)$) as:

$y_{t} = \phi_{1} y_{t-1} + \phi_{2} y_{t-2} + … + \phi_{p} y_{t-p} + \epsilon_{t}$

Here, each $\phi_{j}$ tells us how much the series weights its past value $y_{t-j}$, and $\epsilon_{t}$ is a simple random number. To simulate an autoregressive process, we need to:

1. Start with an arbitrary value for $y_{0}$.
2. Generate a white noise process.
3. Determine each new $y_{t}$ using $y_{t-1}, y_{t-2}, ...$ and $\epsilon_{t}$.
4. Remove the first $p$ elements of the series ($p$ being the number of coefficients used).

Here are the steps listed above, now written in code:
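A minimal sketch of these four steps (again, `simulate_ar` and its signature are illustrative):

```python
import numpy as np

def simulate_ar(n, phis, y0=0.0, seed=None):
    """Simulate an AR(p) process:
    y_t = phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t."""
    p = len(phis)
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n + p)          # step 2: a white noise process
    y = np.empty(n + p)
    y[:p] = y0                            # step 1: arbitrary starting values
    for t in range(p, n + p):
        # step 3: each new value from weighted past values plus an innovation
        y[t] = sum(ph * y[t - j - 1] for j, ph in enumerate(phis)) + eps[t]
    return y[p:]                          # step 4: remove the first p elements
```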

# Autoregressive moving average process

As the name suggests, this process is just a combination of the last two. This means that it has “memory” over its past values and also over the innovations. Mathematically, we define an autoregressive moving average process of $p$ lags in $y$, and $q$ lags in $\epsilon$ (noted as $ARMA(p, q)$) as:

$y_{t} = \mu + \phi_{1} y_{t-1} + \phi_{2} y_{t-2} + … + \phi_{p} y_{t-p} + \epsilon_{t} + \theta_{1}\epsilon_{t-1} + \theta_{2}\epsilon_{t-2} + … + \theta_{q}\epsilon_{t-q}$

To simulate this process, we just need to combine the two recursions, summing the autoregressive and moving average parts at each step:
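A sketch of the combined recursion (`simulate_arma` is illustrative; for simplicity, the first $\max(p, q)$ values are discarded as burn-in):

```python
import numpy as np

def simulate_arma(n, phis, thetas, mu=0.0, seed=None):
    """Simulate an ARMA(p, q) process by combining the AR and MA
    recursions at every step."""
    p, q = len(phis), len(thetas)
    burn = max(p, q)
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n + burn)
    y = np.zeros(n + burn)                # zeros double as starting values
    for t in range(burn, n + burn):
        ar_part = sum(ph * y[t - j - 1] for j, ph in enumerate(phis))
        ma_part = sum(th * eps[t - j - 1] for j, th in enumerate(thetas))
        y[t] = mu + ar_part + ma_part + eps[t]
    return y[burn:]                       # drop the burn-in values
```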

# Plotting the series

Finally, we can visualize the series created:
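One possible plotting sketch with Matplotlib (the coefficients are arbitrary illustrative choices, and the `Agg` backend is set only so the script runs headlessly):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display plots
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n = 200
eps = rng.normal(size=n)

# Build one example of each process with small, arbitrary coefficients
wn = eps                                   # white noise
ma = eps.copy()
ma[1:] = eps[1:] + 0.8 * eps[:-1]          # MA(1) with theta = 0.8
ar = np.zeros(n)
arma = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.5 * ar[t - 1] + eps[t]                         # AR(1), phi = 0.5
    arma[t] = 0.5 * arma[t - 1] + eps[t] + 0.8 * eps[t - 1]  # ARMA(1, 1)

fig, axes = plt.subplots(4, 1, figsize=(8, 10), sharex=True)
titles = ["White noise", "MA(1)", "AR(1)", "ARMA(1, 1)"]
for ax, series, title in zip(axes, [wn, ma, ar, arma], titles):
    ax.plot(series)
    ax.set_title(title)
fig.tight_layout()
fig.savefig("series.png")
```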

What is interesting is that it is difficult to distinguish each process just by looking at the plot. To tell the series apart visually, we can resort to a tool called the correlogram, a plot of the series' autocorrelation at successive lags.
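As a taste of what a correlogram is built from, here is a small sketch that computes the sample autocorrelation function (the `acf` helper is my own; libraries such as statsmodels provide ready-made correlogram plots):

```python
import numpy as np

def acf(series, max_lag=20):
    """Sample autocorrelation function: correlation of the series
    with itself at lags 0..max_lag (the values a correlogram plots)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)                  # variance times n
    return np.array([
        np.dot(x[: len(x) - k], x[k:]) / denom
        for k in range(max_lag + 1)
    ])
```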

You can find all the code here and play with an online Jupyter Notebook here.
You can also find a Matlab version of the code here.

[^1]: “Inside the plane … we hear all frequencies added together at once, producing a noise which is to sound what white light is to light.” (L. D. Carson, W. R. Miles & S. S. Stevens, “Vision, Hearing and Aeronautical Design,” Scientific Monthly, 56 (1943), 446–451.)