You can use visual inspection, global vs. local analysis, and statistical tests to analyze stationarity. The Augmented Dickey-Fuller (ADF) test is the most commonly used parametric test, and the Zivot-Andrews test is better than the ADF at detecting stationarity in series with structural breaks.

This post assumes you understand what stationarity is and why it’s important.

*Note: I use matplotlib plots for analysis and Plotly charts for presentation in this post. The complete code for this post is on the Analyzing Alpha GitHub Repository.*

## Visual Inspection

It’s often easy to see whether a process is creating a stationary time series: if the mean or the distribution changes over time, it’s non-stationary.

Let’s first create a random walk model using Python, and then we’ll plot it in Plotly.

```python
# Grab imports
import numpy as np
import pandas as pd

# Create index
periods = 2000
vol = .002
dti = pd.date_range("2021-01-01", periods=periods, freq="H")

# Generate normal distribution
random_walk = pd.DataFrame(index=dti, data=np.random.normal(size=periods) * vol)

# Plot it using a custom function
make_plot(random_walk, 'Stationary "Random Walk" Time Series').show()
```

We can identify a stochastic process by analyzing a time series plot and gain an understanding of whether there’s a shift in the mean or distribution through time. This is called global vs. local analysis, which we’ll discuss further below.

However, in practice, it’s trivial to use more advanced plot functions to identify a stationary process.

### Decomposition

Time series decomposition separates the signal, trend, seasonality, and error automatically. Let’s decompose Apple’s revenue. We can see the clear seasonality and trend in this non-stationary data.

```python
from statsmodels.tsa.seasonal import seasonal_decompose

sd = seasonal_decompose(apple_revenue_history)
sd.plot()
```

### Autocorrelation Function

The autocorrelation and partial-autocorrelation functions measure the statistical significance of the correlation between a data point and its lagged values.

A non-stationary time series will show significance between itself and its lagged values, and that significance will decay to zero slowly, as in the first plot.

The second plot, a stationary time series, will quickly drop to zero. The red lines are confidence intervals; values above or below the lines are significant at roughly two standard deviations.

```python
from statsmodels.graphics.tsaplots import plot_acf

plot_acf(df2, lags=40, alpha=0.05)
plot_acf(df2.diff().dropna(), lags=40, alpha=0.05)
```

Sometimes it’s evident after visual inspection that we have stationary or non-stationary data. But even in these circumstances, we want to test our hypothesis using statistical tests to understand our series better.

## Global vs. Local Analysis

We can break our time series into multiple segments and analyze the summary statistics of each against the time series or another partition to see if our time series data is changing through time.

```python
def get_quads(df):
    # Split the series into four equal segments and compare summary stats
    quadlen = int(len(df) * 0.25)
    ss = df[:quadlen].describe()
    ss[1] = df[quadlen:quadlen*2].describe()
    ss[2] = df[quadlen*2:quadlen*3].describe()
    ss[3] = df[quadlen*3:].describe()
    return ss
```

### Random Walk

Notice how the random walk time series has a constant mean and variance, and the distribution between the min, percentiles, and max is relatively consistent (no seasonality)?

```python
random_walk = pd.DataFrame(index=dti, data=np.random.normal(size=periods) * vol)
get_quads(random_walk)
```

```
                0           1           2           3
count  500.000000  500.000000  500.000000  500.000000
mean     0.000136   -0.000054    0.000070   -0.000216
std      0.002067    0.001909    0.002008    0.002020
min     -0.005851   -0.005614   -0.006003   -0.006325
25%     -0.001263   -0.001344   -0.001262   -0.001492
50%      0.000095   -0.000094    0.000045   -0.000229
75%      0.001503    0.001225    0.001331    0.001073
max      0.007002    0.006702    0.006338    0.005969
```

### Trend Stationary

You can see the mean is increasing in the trending time series.

```python
trending = pd.DataFrame(index=dti, data=np.random.random(size=periods) * vol).cumsum()
get_quads(trending)
```

```
                0           1           2           3
count  500.000000  500.000000  500.000000  500.000000
mean     0.253949    0.754164    1.245338    1.752112
std      0.148433    0.138083    0.153263    0.144376
min      0.001630    0.509963    0.989912    1.509835
25%      0.122746    0.637347    1.111457    1.629752
50%      0.261204    0.749095    1.243160    1.748031
75%      0.376252    0.878295    1.376870    1.879781
max      0.508350    0.989398    1.509197    2.000878
```

### Volatile Time Series

The standard deviation is changing in the volatile time series.

```python
varying = pd.DataFrame(index=dti, data=np.random.normal(size=periods) * vol
                       * np.logspace(1, 5, num=periods, dtype=int))
get_quads(varying)
```

```
                0           1           2           3
count  500.000000  500.000000  500.000000  500.000000
mean     0.000510    0.082075   -0.206946   -1.990338
std      0.094116    0.935861   10.263379   99.674838
min     -0.664262   -3.887168  -46.277549 -472.434713
25%     -0.039151   -0.280147   -4.025722  -45.672561
50%     -0.001081    0.067228   -0.068825   -4.059768
75%      0.037203    0.476481    3.295115   35.828933
max      0.424776    5.000719   61.889090  455.791059
```

### Seasonal Time Series

You can identify seasonality by analyzing the distribution through the min, max, and the percentiles in between. Notice how the 50% row is changing.

```python
def simulate_seasonal_term(periodicity, total_cycles, noise_std=1., harmonics=None):
    duration = periodicity * total_cycles
    assert duration == int(duration)
    duration = int(duration)
    harmonics = harmonics if harmonics else int(np.floor(periodicity / 2))

    lambda_p = 2 * np.pi / float(periodicity)

    gamma_jt = noise_std * np.random.randn(harmonics)
    gamma_star_jt = noise_std * np.random.randn(harmonics)

    total_timesteps = 100 * duration  # Pad for burn in
    series = np.zeros(total_timesteps)
    for t in range(total_timesteps):
        gamma_jtp1 = np.zeros_like(gamma_jt)
        gamma_star_jtp1 = np.zeros_like(gamma_star_jt)
        for j in range(1, harmonics + 1):
            cos_j = np.cos(lambda_p * j)
            sin_j = np.sin(lambda_p * j)
            gamma_jtp1[j - 1] = (gamma_jt[j - 1] * cos_j
                                 + gamma_star_jt[j - 1] * sin_j
                                 + noise_std * np.random.randn())
            gamma_star_jtp1[j - 1] = (- gamma_jt[j - 1] * sin_j
                                      + gamma_star_jt[j - 1] * cos_j
                                      + noise_std * np.random.randn())
        series[t] = np.sum(gamma_jtp1)
        gamma_jt = gamma_jtp1
        gamma_star_jt = gamma_star_jtp1
    wanted_series = series[-duration:]  # Discard burn in

    return wanted_series

duration = 100 * 3
periodicities = [10, 100]
num_harmonics = [3, 2]
std = np.array([2, 3])
np.random.seed(8678309)

terms = []
for ix, _ in enumerate(periodicities):
    s = simulate_seasonal_term(
        periodicities[ix],
        duration / periodicities[ix],
        harmonics=num_harmonics[ix],
        noise_std=std[ix])
    terms.append(s)
terms.append(np.ones_like(terms[0]) * 10.)
seasonal = pd.DataFrame(index=dti[:duration], data=np.sum(terms, axis=0))
get_quads(seasonal)
```

```
                 0            1            2            3
count    75.000000    75.000000    75.000000    75.000000
mean    328.055459  -280.450561  -149.840302   120.579444
std     733.650363   750.299351   958.572453  1007.596395
min    -863.464617 -1694.424423 -1815.786373 -1733.291362
25%    -239.600008  -770.199418  -828.906000  -694.550888
50%      72.425299  -346.127307  -405.201792    59.358360
75%     923.255524    59.875525   485.926284   945.035545
max    1813.015082  1642.090923  1916.794045  1832.084148
```

## Parametric Tests

Statistical tests identify specific types of stationarity. Most parametric tests analyze whether the time series has a unit root in the presence of serial correlation, which is a fancy way of saying correlation with a lagged version of itself. The name "unit root test" comes from the math of the process: an autoregressive process is non-stationary when a root of its characteristic equation equals one.

The null hypothesis, which is a fancy way of saying the commonly accepted belief, is that the time series has a unit root and is therefore not stationary. We want to reject the null hypothesis with a level of certainty to state that the time series is stationary, or more accurately, that the process generating the time series is stationary.
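To build intuition for what a unit root does, here's a minimal sketch (my addition, not from the original post) that simulates two AR(1) processes with the same shocks: one with coefficient 0.5, where shocks decay and the series stays stationary, and one with coefficient 1.0 (a unit root), where shocks accumulate forever:

```python
import numpy as np

rng = np.random.default_rng(42)
eps = rng.normal(size=2000)

def ar1(phi, shocks):
    # y[t] = phi * y[t-1] + shock[t]
    y = np.zeros_like(shocks)
    for t in range(1, len(shocks)):
        y[t] = phi * y[t - 1] + shocks[t]
    return y

stationary = ar1(0.5, eps)  # |phi| < 1: shocks die out, variance is bounded
unit_root = ar1(1.0, eps)   # phi = 1: shocks are permanent (a random walk)

# The unit-root series wanders far from zero; the stationary one does not
print(np.std(stationary), np.std(unit_root))
```

The stationary series keeps a roughly constant spread, while the unit-root series drifts, which is exactly the behavior the tests below are designed to detect.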

Let’s look at the most common unit root tests.

### Dickey-Fuller

The Dickey-Fuller test was the first statistical test to analyze whether a unit root exists in an autoregressive model of a time series.

It runs into issues with serial correlation, which is why there’s an Augmented Dickey-Fuller test.

### Augmented Dickey-Fuller (ADF)

The Augmented Dickey-Fuller test checks for a unit root in a univariate process in the presence of serial correlation. The ADF test handles more complex models than the original Dickey-Fuller test and is the typical go-to for most analysts.

#### Non-Stationary Process

Let’s use the ADF test on Apple’s revenue, which is non-stationary.

```python
from statsmodels.tsa.stattools import adfuller

t_stat, p_value, _, _, critical_values, _ = adfuller(df6['observed'].values, autolag='AIC')
print(f'ADF Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
```

```
ADF Statistic: -2.12
Critical Values:
  1%, -3.69
Critical Values:
  5%, -2.97
Critical Values:
  10%, -2.63
```

You can compare the ADF test statistic of -2.12 against the critical values. The test statistic is greater than all of the critical values (i.e., not negative enough), so we cannot reject the null hypothesis. In other words, as we already knew, Apple’s revenue is non-stationary; we see trends, and its mean and variance are changing.

#### Stationary Process

Now let’s do the same for the random walk data, which exhibits stationarity.

```python
from statsmodels.tsa.stattools import adfuller

t_stat, p_value, _, _, critical_values, _ = adfuller(random_walk[0].values, autolag='AIC')
print(f'ADF Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
print(f'\np-value: {p_value:.2f}')
print("Non-Stationary") if p_value > 0.05 else print("Stationary")
```

```
ADF Statistic: -43.83
Critical Values:
  1%, -3.43
Critical Values:
  5%, -2.86
Critical Values:
  10%, -2.57

p-value: 0.00
Stationary
```

Notice that this time I’ve included the p-value. When a p-value is greater than 5%, we fail to reject the null hypothesis. We also see an extreme ADF test statistic from this random walk model.

I prefer the critical value approach over the p-value because I can see to what degree I can reject the null hypothesis that the time series is not stationary.

### Kwiatkowski-Phillips-Schmidt-Shin (KPSS)

The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test's null hypothesis is that the time series is stationary, or stationary around a deterministic trend (trend stationary). In other words, the KPSS test's null hypothesis is the opposite of the ADF test's.
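Because the two tests have opposite null hypotheses, they are often run together. The sketch below (my addition; the `classify` helper is hypothetical, not part of the original post) encodes the common interpretation of the four possible outcomes:

```python
def classify(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into one verdict.

    ADF null hypothesis: the series has a unit root (non-stationary).
    KPSS null hypothesis: the series is (trend) stationary.
    """
    adf_rejects = adf_p < alpha    # evidence against a unit root
    kpss_rejects = kpss_p < alpha  # evidence against stationarity
    if adf_rejects and not kpss_rejects:
        return "stationary"
    if not adf_rejects and kpss_rejects:
        return "non-stationary"
    if not adf_rejects and not kpss_rejects:
        return "trend-stationary"       # remove the trend, then re-test
    return "difference-stationary"      # difference the series, then re-test

print(classify(0.00, 0.10))  # e.g. the random walk series above
print(classify(0.73, 0.01))  # e.g. Apple's revenue
```

When the two tests agree, the verdict is straightforward; when they disagree, the disagreement itself tells you whether detrending or differencing is the right transformation.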

#### Exhibits Non-Stationarity

```python
from statsmodels.tsa.stattools import kpss

t_stat, p_value, _, critical_values = kpss(df6['observed'].values, nlags='auto')
print(f'KPSS Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
print(f'\np-value: {p_value:.2f}')
print("Stationary") if p_value > 0.05 else print("Non-Stationary")
```

```
KPSS Statistic: 0.91
Critical Values:
  10%, 0.35
Critical Values:
  5%, 0.46
Critical Values:
  2.5%, 0.57
Critical Values:
  1%, 0.74

p-value: 0.01
Non-Stationary
```

#### Exhibits Stationarity

```python
t_stat, p_value, _, critical_values = kpss(random_walk[0].values, nlags='auto')
print(f'KPSS Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
print(f'\np-value: {p_value:.2f}')
print("Stationary") if p_value > 0.05 else print("Non-Stationary")
```


### Zivot and Andrews

The Zivot-Andrews test checks for a unit root like the ADF test, but it allows for a single structural break in the series.

Let’s add a break to our random walk model. Notice how the variance and cyclicality properties are relatively constant, but there’s a significant shift in the mean.
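The code that builds `stationary_with_break` isn't shown here; a minimal reconstruction consistent with the description might look like the following (the break location and size are my assumptions):

```python
import numpy as np
import pandas as pd

periods = 2000
vol = .002
dti = pd.date_range("2021-01-01", periods=periods, freq="H")

# Same white-noise process as before...
data = np.random.normal(size=periods) * vol
# ...with a one-time level shift (structural break) halfway through the sample
data[periods // 2:] += 0.01  # hypothetical break size

stationary_with_break = pd.DataFrame(index=dti, data=data)
```

Each half of the series is stationary on its own; only the level jumps at the break point.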

Now let’s run both an ADF test and a Zivot-Andrews to test for non-stationarity.

```python
t_stat, p_value, _, _, critical_values, _ = adfuller(stationary_with_break[0].values, autolag='AIC')
print(f'ADF Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
print(f'\np-value: {p_value:.2f}')
print("Non-Stationary") if p_value > 0.05 else print("Stationary")
```

```
ADF Statistic: -1.05
Critical Values:
  1%, -3.43
Critical Values:
  5%, -2.86
Critical Values:
  10%, -2.57

p-value: 0.73
Non-Stationary
```

```python
from statsmodels.tsa.stattools import zivot_andrews

t_stat, p_value, critical_values, _, _ = zivot_andrews(stationary_with_break[0].values)
print(f'Zivot-Andrews Statistic: {t_stat:.2f}')
for key, value in critical_values.items():
    print('Critical Values:')
    print(f'  {key}, {value:.2f}')
print(f'\np-value: {p_value:.2f}')
print("Non-Stationary") if p_value > 0.05 else print("Stationary")
```

```
Zivot-Andrews Statistic: -60.35
Critical Values:
  1%, -5.28
Critical Values:
  5%, -4.81
Critical Values:
  10%, -4.57

p-value: 0.00
Stationary
```

We see different results between the two stationarity tests: the ADF test fails to account for the structural break, while the Zivot-Andrews test correctly identifies the series as stationary.

## The Bottom Line

The process for testing for stationarity in time series data is relatively straightforward. You analyze the information visually, perform a decomposition, review the summary statistics, and then select a parametric test to gain confidence in your assumptions.

The Augmented Dickey-Fuller or ADF test is the most commonly used test for stationarity; however, it’s not always the best. To analyze stationarity in time series with structural breaks, you’ll want to use the Zivot-Andrews test. As always, it’s essential to have an intuitive understanding of the problem set and to have a deep knowledge of the tools at your disposal.