Time Series Analysis Using Python
Oct 2, 2023
Time series analysis is a method of studying how a variable changes over time. It can be used for forecasting, anomaly detection, trend analysis, and more. In this post, I will walk through a basic time series analysis workflow in Python.
The main steps are:
- Import the necessary libraries, such as pandas, numpy, matplotlib, statsmodels, and scikit-learn.
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
- Load the data into a pandas DataFrame and check its shape, columns, and summary statistics.
# Step 1: Load the data
data = pd.read_csv('time_series_data.csv', parse_dates=['Date'])  # parse dates so the Date column plots on a proper time axis
print("Data Shape:", data.shape)
print("Columns:", data.columns)
print("Summary Statistics:")
print(data.describe())
- Plot the data to visualize the time series and identify any patterns or outliers (a rolling mean/std check is sketched after the plotting code below).
# Step 2: Plot the data
plt.figure(figsize=(12, 6))
plt.plot(data['Date'], data['Value'])
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('Time Series Data')
plt.show()
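A quick visual companion to the plot above is a pair of rolling statistics: if the rolling mean or rolling standard deviation drifts noticeably, the series is probably not stationary. A minimal sketch, where the window of 12 is an assumption you would adjust to your data's frequency:
# Rolling mean and std as an informal stationarity check (window=12 is an assumption)
rolling_mean = data['Value'].rolling(window=12).mean()
rolling_std = data['Value'].rolling(window=12).std()
plt.figure(figsize=(12, 6))
plt.plot(data['Date'], data['Value'], label='Original')
plt.plot(data['Date'], rolling_mean, label='Rolling Mean')
plt.plot(data['Date'], rolling_std, label='Rolling Std')
plt.legend()
plt.show()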
- Test the stationarity of the data using the Augmented Dickey-Fuller test or the KPSS test (a KPSS example is sketched after the ADF snippet below). Stationarity means that the mean, variance, and autocorrelation of the data do not change over time.
# Step 3: Test for stationarity (ADF test)
result = adfuller(data['Value'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])
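The ADF and KPSS tests have opposite null hypotheses (ADF assumes a unit root, KPSS assumes stationarity), so running both gives a more reliable verdict. A minimal KPSS sketch, assuming the same Value column:
# KPSS test (null hypothesis: the series is stationary)
from statsmodels.tsa.stattools import kpss
kpss_stat, kpss_pvalue, _, _ = kpss(data['Value'], regression='c', nlags='auto')
print('KPSS Statistic:', kpss_stat)
print('p-value:', kpss_pvalue)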
- If the data is not stationary, apply a transformation such as differencing, a log transform, or seasonal adjustment to make it stationary (a log-transform and seasonal-decomposition sketch follows the differencing code below).
# Step 4: If not stationary, apply differencing
data['Differenced'] = data['Value'].diff()  # first value is NaN; drop it before modeling
# Step 5: Plot differenced data
plt.figure(figsize=(12, 6))
plt.plot(data['Date'], data['Differenced'])
plt.xlabel('Date')
plt.ylabel('Differenced Value')
plt.title('Differenced Time Series Data')
plt.show()
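Differencing is only one option. If the variance grows with the level of the series, a log transform can stabilize it, and a seasonal decomposition helps you see whether a seasonal adjustment is needed. A minimal sketch, assuming strictly positive values and a seasonal period of 12 (e.g. monthly data); adjust the period to your data:
# Optional: log transform (requires strictly positive values)
data['Log'] = np.log(data['Value'])

# Optional: decompose into trend, seasonal, and residual components (period=12 is an assumption)
from statsmodels.tsa.seasonal import seasonal_decompose
decomposition = seasonal_decompose(data['Value'], model='additive', period=12)
decomposition.plot()
plt.show()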
- Choose an appropriate model for the data, such as ARIMA, SARIMA, VAR, or LSTM. The model should capture the autocorrelation and seasonality of the data (a SARIMA sketch follows the ARIMA fit below).
# Step 6: Choose an appropriate model (ARIMA in this example)
plot_acf(data['Differenced'].dropna(), lags=30)
plot_pacf(data['Differenced'].dropna(), lags=30)
plt.show()
# Step 7: Fit ARIMA model
model = ARIMA(data['Value'], order=(1, 1, 1))
results = model.fit()
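If the ACF/PACF plots show spikes at seasonal lags, a seasonal ARIMA (SARIMA) usually fits better than a plain ARIMA. A minimal sketch using statsmodels' SARIMAX; the orders and the seasonal period of 12 are placeholders, not tuned values:
# Seasonal ARIMA via SARIMAX (orders and seasonal period are placeholders)
from statsmodels.tsa.statespace.sarimax import SARIMAX
sarima_model = SARIMAX(data['Value'], order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
sarima_results = sarima_model.fit(disp=False)
print(sarima_results.summary())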
- Fit the model to the data and check its performance using metrics such as AIC, BIC, RMSE, or MAPE (a MAPE calculation is sketched after the snippet below).
# Step 8: Model evaluation
aic = results.aic
bic = results.bic
rmse = np.sqrt(mean_squared_error(data['Value'].iloc[1:], results.fittedvalues.iloc[1:]))  # skip the first point, distorted by differencing
print('AIC:', aic)
print('BIC:', bic)
print('RMSE:', rmse)
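MAPE, mentioned above, can be computed directly from the fitted values; note that it is undefined when actual values are zero. A minimal sketch:
# MAPE over the in-sample fit (undefined if any actual value is zero)
actual = data['Value'].iloc[1:]
fitted = results.fittedvalues.iloc[1:]
mape = np.mean(np.abs((actual - fitted) / actual)) * 100
print('MAPE:', mape)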
- Use the model to make predictions for future values and plot them along with the actual data.
# Step 9: Use the model to make predictions
forecast_periods = 10
forecast_result = results.get_forecast(steps=forecast_periods)
forecast = forecast_result.predicted_mean
conf_int = forecast_result.conf_int()  # DataFrame with lower/upper bounds
# Step 10: Plot predictions along with actual data
plt.figure(figsize=(12, 6))
plt.plot(data['Date'], data['Value'], label='Actual')
# Build future dates for the forecast horizon (assumes a regular frequency pandas can infer)
future_dates = pd.date_range(start=data['Date'].iloc[-1],
                             periods=forecast_periods + 1,
                             freq=pd.infer_freq(pd.DatetimeIndex(data['Date'])))[1:]
plt.plot(future_dates, forecast, label='Forecast', color='red')
plt.fill_between(future_dates,
                 conf_int.iloc[:, 0], conf_int.iloc[:, 1],
                 color='pink', alpha=0.3, label='95% CI')
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('Model Predictions')
plt.legend()
plt.show()
- Evaluate the accuracy and reliability of the predictions using confidence intervals and error analysis (a simple hold-out backtest is sketched after the snippet below).
# Step 11: Evaluate prediction accuracy
# Gap between the last observed value and the first forecast (a rough sanity check, not a true error)
prediction_error = data['Value'].iloc[-1] - forecast.iloc[0]
# Width of the first forecast's 95% confidence interval
prediction_interval = conf_int.iloc[0, 1] - conf_int.iloc[0, 0]
print('Prediction Error:', prediction_error)
print('Prediction Interval:', prediction_interval)
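Comparing the first forecast with the last observed value is only a rough sanity check. A more honest evaluation holds out the final observations, refits on the rest, and measures error on the held-out portion. A minimal sketch, reusing the same (1, 1, 1) order:
# Simple hold-out backtest: fit on all but the last 10 points, forecast them, compare
holdout = 10
train, test = data['Value'].iloc[:-holdout], data['Value'].iloc[-holdout:]
backtest_results = ARIMA(train, order=(1, 1, 1)).fit()
backtest_forecast = backtest_results.forecast(steps=holdout)
backtest_rmse = np.sqrt(mean_squared_error(test, backtest_forecast))
print('Hold-out RMSE:', backtest_rmse)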