I am trying to explain to myself the forecasting result from applying an ARIMA model to a time-series dataset. The data is from the M1-Competition, the series is MNB65. For quick reference, I have a google doc spreadsheet with the data. I am trying to fit the data to an ARIMA(1,0,0) model and get the forecasts. I am using R. Here are some output snippets:
> arima(x, order = c(1,0,0))
Series: x
ARIMA(1,0,0) with non-zero mean
Call: arima(x = x, order = c(1, 0, 0))
Coefficients:
ar1 intercept
0.9421 12260.298
s.e. 0.0474 202.717
> predict(arima(x, order = c(1,0,0)), n.ahead=12)
$pred
Time Series:
Start = 53
End = 64
Frequency = 1
[1] 11757.39 11786.50 11813.92 11839.75 11864.09 11887.02 11908.62 11928.97 11948.15 11966.21 11983.23 11999.27
I have a few questions:
(1) How do I explain that although the dataset shows a clear downward trend, the forecast from this model trends upward. This also happens for ARIMA(2,0,0), which is the best ARIMA fit for the data using auto.arima (forecast package) and for an ARIMA(1,0,1) model.
(2) The intercept value for the ARIMA(1,0,0) model is 12260.298. Shouldn't the intercept satisfy the equation: C = mean * (1 - sum(AR coeffs)), in which case, the value should be 715.52. I must be missing something basic here.
(3) This is clearly a series with non-stationary mean. Why is an AR(2) model still selected as the best model by auto.arima? Could there be an intuitive explanation?
Thanks.