The Pattern Sequence based Forecasting (PSF) algorithm was first proposed by Martínez-Álvarez et al. (2008) and later modified and improved by Martínez-Álvarez et al. (2011). The technical details are given in the referenced articles. The PSF algorithm combines several statistical operations, among them the selection of a cluster size (k) and a window size (w), both of which are reported alongside the forecasts below.
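The general idea described in the referenced articles (label each cycle of the series by clustering, find earlier occurrences of the most recent sequence of labels, and average what followed them) can be illustrated with a minimal base R sketch. This is a simplified illustration under stated assumptions, not the package's implementation: `psf_sketch()` is a hypothetical helper, normalization is skipped, and `k` and `w` are passed in by hand rather than selected automatically.

```r
# Illustrative sketch only: psf_sketch() is a hypothetical helper,
# not a function from the PSF package. It skips normalization and
# the automatic selection of k and w.
psf_sketch <- function(x, cycle, k, w) {
  stopifnot(length(x) %% cycle == 0)
  # Arrange the series as one row per complete cycle (e.g. one year).
  cycles <- matrix(as.numeric(x), ncol = cycle, byrow = TRUE)
  n <- nrow(cycles)
  stopifnot(n > w, n > k)

  # Step 1: label every cycle with its k-means cluster.
  labels <- kmeans(cycles, centers = k, nstart = 10)$cluster

  # Step 2: look for earlier occurrences of the last w labels,
  # shrinking the window when no match is found.
  hits <- integer(0)
  while (w >= 1 && length(hits) == 0) {
    pattern <- labels[(n - w + 1):n]
    starts <- seq_len(n - w)
    is_match <- vapply(
      starts,
      function(s) all(labels[s:(s + w - 1)] == pattern),
      logical(1)
    )
    hits <- starts[is_match] + w  # index of the cycle after each match
    w <- w - 1
  }

  # Step 3: average the cycles that followed the matches
  # (fall back to the overall mean if nothing matched).
  if (length(hits) == 0) return(colMeans(cycles))
  colMeans(cycles[hits, , drop = FALSE])
}

set.seed(1)
train <- nottem[1:228]  # first 19 years of monthly data
forecast_sketch <- psf_sketch(train, cycle = 12, k = 2, w = 1)
```

The result is one forecast value per position in the cycle, here twelve monthly values. The sketch uses `kmeans()` from base R for the labeling step; the published algorithm involves further steps that are described in the referenced articles.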
This section uses examples to introduce the PSF package and to compare it with the auto.arima() and ets() functions, which are well accepted in the R community for time series forecasting. The data used in these examples are 'nottem' and 'sunspots', two standard time series datasets available in R. The 'nottem' dataset contains the average air temperatures at Nottingham Castle in degrees Fahrenheit, recorded monthly over 20 years.
Similarly, the 'sunspots' dataset contains mean relative sunspot numbers from 1749 to 1983, measured monthly. First, the psf() function from the PSF package is used to forecast future values. For both datasets, the final part of the series is held out for testing and all earlier values are used as training data: the last year (12 months) for 'nottem' and, as the 48 predicted values and the comparison code below show, the last 48 months for 'sunspots'. The values predicted by psf() for the held-out periods of both datasets are shown next.
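The hold-out split described above can be sketched in base R as follows. The variable names (`train`, `test`, `n_test`) are illustrative choices, not names used by the package; 'nottem' itself ships with R in the datasets package.

```r
# 'nottem' holds 240 monthly values (January 1920 to December 1939).
n_test <- 12                       # hold out the final year
n <- length(nottem)
train <- nottem[1:(n - n_test)]    # plain numeric vector of training values
test <- nottem[(n - n_test + 1):n] # the 12 held-out values

# The same split with time-based indexing, which keeps the ts attributes.
train_ts <- window(nottem, end = c(1938, 12))
test_ts <- window(nottem, start = c(1939, 1))

# With the PSF package loaded, the forecast would then follow the call
# form used later in this document:
#   psf(data = train, n.ahead = n_test)$predictions
```

Plain indexing drops the time series attributes, which matches how the comparison code later in this section prepares 'sunspots'; `window()` is the alternative when the frequency and dates should be preserved.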
Forecast for 'nottem' (12 months ahead):
## $predictions
## [1] 38.97692 38.71538 42.49231 46.32308 52.91538 57.97692 61.87692 60.19231
## [9] 57.03846 49.42308 43.23846 40.21538
##
## $k
## [1] 2
##
## $w
## [1] 1

Forecast for 'sunspots' (48 months ahead):
## $predictions
## [1] 57.42000 54.33333 53.40000 56.07333 59.67333 59.22667 51.86000 50.89333
## [9] 51.44667 44.66667 39.63333 42.39333 38.15000 39.28333 39.77500 40.73333
## [17] 33.45000 31.14167 26.50833 26.31667 26.62500 27.39167 27.90000 22.20000
## [25] 19.80714 20.82857 24.86429 18.12143 19.95714 18.30000 18.64286 20.54286
## [33] 14.11429 18.12857 20.37143 16.19286 14.47222 13.83333 13.66111 11.93889
## [41] 15.02222 14.21111 10.82778 13.93333 13.39444 18.74444 19.18889 18.83333
##
## $k
## [1] 2
##
## $w
## [1] 5
To show the prediction performance graphically, the psf_plot() function can be used.
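As a rough stand-in for such a plot, the twelve predicted values printed above for 'nottem' can be drawn against the observed final year with base R graphics. This sketch deliberately does not call psf_plot(); it only assumes that the printed predictions correspond to the held-out final year of 'nottem', as described earlier.

```r
# Predictions copied from the 12-value forecast printed above.
pred <- c(38.97692, 38.71538, 42.49231, 46.32308, 52.91538, 57.97692,
          61.87692, 60.19231, 57.03846, 49.42308, 43.23846, 40.21538)

# Observed values for the final year of 'nottem' (1939).
actual <- as.numeric(window(nottem, start = c(1939, 1)))

# Observed versus predicted, month by month.
plot(actual, type = "b", pch = 16,
     xlab = "Month of final year", ylab = "Temperature (degrees F)",
     ylim = range(actual, pred))
lines(pred, type = "b", pch = 1, lty = 2)
legend("topleft", legend = c("Observed", "Predicted"),
       pch = c(16, 1), lty = c(1, 2))
```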
Comparison of psf() with auto.arima() and ets() functions

The example below compares psf(), auto.arima() and ets() on the 'sunspots' dataset, using the Root Mean Square Error (RMSE) as the metric. To make the comparison more robust, the error is calculated five times for each method, and the mean of the five error values is also shown. In this experiment, psf() attains the lowest RMSE of the three methods. Additionally, the reader may refer to the results published in the original work, Martínez-Álvarez et al. (2011), in which PSF outperformed many other methods when applied to electricity price and demand forecasting.
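The RMSE used in this comparison is the square root of the mean squared difference between observed and predicted values. A minimal helper makes the calculation explicit; `rmse()` is an illustrative name defined here, not a function assumed to exist in any package, and the numbers are made up for the example.

```r
# Root Mean Square Error between two equally long numeric vectors.
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Toy data: only the third value is off, by 3.
actual <- c(10, 12, 14)
predicted <- c(10, 12, 17)
rmse(actual, predicted)  # sqrt((0 + 0 + 9) / 3) = sqrt(3)
```

The comparison code in this section computes the same quantity inline for each method.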
library(PSF)
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
options(warn=-1)
## Consider data `sunspots` with the last 48 monthly readings (four years) held out
# Training Data
x <- sunspots[1:2772]
# Test Data
y <- sunspots[2773:2820]
PSF <- NULL
ARIMA <- NULL
ETS <- NULL
for (i in 1:5) {
  set.seed(i)
  # Forecasts from PSF
  pred_psf <- psf(data = x, n.ahead = 48)$predictions
  # Forecasts from ARIMA
  pred_arima <- forecast(auto.arima(x), 48)$mean
  # Forecasts from ETS
  pred_ets <- as.numeric(forecast(ets(x), 48)$mean)

  ## Error calculations (RMSE)
  # Error for PSF
  PSF[i] <- sqrt(mean((y - pred_psf)^2))
  # Error for ARIMA
  ARIMA[i] <- sqrt(mean((y - pred_arima)^2))
  # Error for ETS
  ETS[i] <- sqrt(mean((y - pred_ets)^2))
}
## Error values for PSF
PSF
## [1] 61.08512 61.08512 61.08512 61.08512 61.08512
mean(PSF)
## [1] 61.08512
## Error values for ARIMA
ARIMA
## [1] 103.0719 103.0719 103.0719 103.0719 103.0719
mean(ARIMA)
## [1] 103.0719
## Error values for ETS
ETS
## [1] 70.66647 70.66647 70.66647 70.66647 70.66647
mean(ETS)
## [1] 70.66647
Martínez-Álvarez, F., Troncoso, A., Riquelme, J.C. and Aguilar-Ruiz, J.S., 2008, December. LBF: A labeled-based forecasting algorithm and its application to electricity price time series. In Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on (pp. 453-461). IEEE.
Martínez-Álvarez, F., Troncoso, A., Riquelme, J.C. and Aguilar-Ruiz, J.S., 2011. Energy time series forecasting based on pattern sequence similarity. Knowledge and Data Engineering, IEEE Transactions on, 23(8), pp. 1230-1243.