I am trying to fit an ARIMA model in Python. My data has two columns: date and confirmed orders. Here are the first few rows of the data file (daily confirmed orders from March 14, 2020 to April 14, 2020):

[image: first rows of the data file]

My code works as long as the number of differences (d) is 2 or less. When d > 2, I get the error: raise ValueError("d > 2 is not supported").

Here is the code that I am using:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima_model import ARIMA
from pandas.plotting import register_matplotlib_converters
from pandas import read_csv
from pandas import DatetimeIndex
from datetime import datetime
register_matplotlib_converters()

df = pd.read_csv('order.csv',parse_dates = ['date'], index_col = ['date'])
df.info()

#Declare that data are collected on daily basis
df.index.freq = 'd'

#ARIMA
model = ARIMA(df,order=[1,4,1], freq='D')
model_fit = model.fit(disp=0)
print(model_fit.summary())

A screenshot of the error is attached for details. Any help solving this would be much appreciated. Thanks in advance.

[image: screenshot of the error]

  • It means that whoever wrote the ARIMA code decided not to implement the code for d > 2. Commented Jun 7, 2020 at 13:31
  • Thanks @Han-KwangNienhuys. Are you sure? I don't think that python will have such a major gap in its modules. Commented Jun 7, 2020 at 13:49
  • The line raise ValueError("d > 2 ...") was put there by the author of the ARIMA code. Also note that statsmodels is not part of Python; rather, it is a package written in Python. Commented Jun 7, 2020 at 14:08
  • Thank you, @Han-KwangNienhuys Commented Jun 7, 2020 at 15:07
  • You can use sm.tsa.SARIMAX for d > 2. However, d > 2 implies explosive behavior, and there are very few time series that would be well-modeled that way. Commented Jun 7, 2020 at 20:42

1 Answer

The restriction on d > 2 suggests starting simple: difference the series once and check whether that makes it stationary. If it does, fit a simple ARIMA model and examine the ACF of the residuals to judge whether any further differencing is needed. Higher-order differencing also has a cost: each round of differencing drops one observation, so differencing d times loses d observations. One of the most common errors in ARIMA modeling is to "overdifference" the series and then add extra AR or MA terms to undo the damage to the forecasts, which is (I assume) why the author decided to raise this exception.
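
Both costs mentioned above can be seen in a small NumPy sketch. It uses a synthetic random walk (an assumption; it is not the asker's data), which needs exactly one difference to become stationary. The helper `lag1_autocorr` is a hypothetical name introduced only for this example.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D array."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


# Synthetic random walk (assumption): one difference makes it stationary.
rng = np.random.default_rng(42)
walk = np.cumsum(rng.normal(size=2000))

# Each round of differencing drops one observation.
lengths = {d: len(np.diff(walk, n=d)) for d in range(4)}
print("observations left after d differences:", lengths)

# After one difference the series is white noise, so its lag-1
# autocorrelation is close to zero. A second, unnecessary difference
# introduces strong negative lag-1 autocorrelation, a typical sign of
# overdifferencing that an extra MA term would then have to absorb.
r1 = lag1_autocorr(np.diff(walk, n=1))
r2 = lag1_autocorr(np.diff(walk, n=2))
print(f"lag-1 autocorrelation, d=1: {r1:.3f}")
print(f"lag-1 autocorrelation, d=2: {r2:.3f}")
```

On a series of only about 30 daily observations, as in the question, losing several points to differencing is a noticeable share of the sample.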

1 Comment

Example: ADF Statistic: -2.464240, p-value: 0.124419. The null hypothesis is that the series is non-stationary. Since 0.12 > 0.05, the result is not significant, so we cannot reject the null hypothesis. We therefore treat the series as non-stationary and difference it. The purpose of differencing is to make the time series stationary.
