This document demonstrates the R package ‘ForecastTB’, which is intended for comparing the performance of forecasting methods. The package assists in setting up the background, strategies, policies and environment needed for comparing forecasting methods, and produces a comparison report for the defined framework as output. Load the package as follows:
library(ForecastTB)
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
The basic function of the package is prediction_errors(). The following parameters are considered by this function:
data: input time series for testing
nval: an integer deciding the number of values to predict (default: 12)
ePara: type of error calculation (RMSE and MAE are default); an error parameter of your choice can be added as ePara = c("errorparametername"), where errorparametername should be a source/function that returns the desired error set (default: RMSE and MAE)
ePara_name: list of names of the error parameters, passed in order (default: RMSE and MAE)
Method: list of locations of the functions for the proposed prediction methods (should be recursive) (default: ARIMA)
MethodName: list of names of the functions for the proposed prediction methods, in order (default: ARIMA)
strats: list of forecasting strategies; available: recursive and dirRec (default: recursive)
append_: indicates whether the function is used to append to another instance (default: 1)
dval: last d values of the data to be used for forecasting (default: length of the data)
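As an illustration, the following is a minimal sketch of a call overriding a few of these defaults (the specific values are arbitrary; nottem is the sample dataset also used in the examples below):
# Sketch: predict 24 values, using only the last 120 observations of the series
prediction_errors(data = nottem, nval = 24, dval = 120)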
The prediction_errors() function returns two slots as output. The first slot is output, which provides Error_Parameters, indicating the error values for the forecasting methods and error parameters defined in the framework, and Predicted_Values, the values forecasted with the same forecasting methods. The second slot is parameters, which returns the parameters used by or provided to the prediction_errors() function.
a <- prediction_errors(data = nottem)  # nottem is a sample dataset from R's 'datasets' package
a
#> An object of class "prediction_errors"
#> Slot "output":
#> $Error_Parameters
#> RMSE MAE MAPE exec_time
#> ARIMA 2.3400915 1.9329816 4.2156087 0.1046858
#>
#> $Predicted_Values
#> 1 2 3 4 5 6 7
#> Test values 39.40000 40.90000 42.40000 47.80000 52.40000 58.00000 60.70000
#> ARIMA 37.41933 37.69716 41.18252 46.29926 52.24804 57.10696 59.71674
#> 8 9 10 11 12
#> Test values 61.80000 58.20000 46.7000 46.60000 37.80000
#> ARIMA 59.41173 56.38197 51.4756 46.04203 41.52592
#>
#>
#> Slot "parameters":
#> $data
#> Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#> 1920 40.6 40.8 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8
#> 1921 44.2 39.8 45.1 47.0 54.1 58.7 66.3 59.9 57.0 54.2 39.7 42.8
#> 1922 37.5 38.7 39.5 42.1 55.7 57.8 56.8 54.3 54.3 47.1 41.8 41.7
#> 1923 41.8 40.1 42.9 45.8 49.2 52.7 64.2 59.6 54.4 49.2 36.3 37.6
#> 1924 39.3 37.5 38.3 45.5 53.2 57.7 60.8 58.2 56.4 49.8 44.4 43.6
#> 1925 40.0 40.5 40.8 45.1 53.8 59.4 63.5 61.0 53.0 50.0 38.1 36.3
#> 1926 39.2 43.4 43.4 48.9 50.6 56.8 62.5 62.0 57.5 46.7 41.6 39.8
#> 1927 39.4 38.5 45.3 47.1 51.7 55.0 60.4 60.5 54.7 50.3 42.3 35.2
#> 1928 40.8 41.1 42.8 47.3 50.9 56.4 62.2 60.5 55.4 50.2 43.0 37.3
#> 1929 34.8 31.3 41.0 43.9 53.1 56.9 62.5 60.3 59.8 49.2 42.9 41.9
#> 1930 41.6 37.1 41.2 46.9 51.2 60.4 60.1 61.6 57.0 50.9 43.0 38.8
#> 1931 37.1 38.4 38.4 46.5 53.5 58.4 60.6 58.2 53.8 46.6 45.5 40.6
#> 1932 42.4 38.4 40.3 44.6 50.9 57.0 62.1 63.5 56.3 47.3 43.6 41.8
#> 1933 36.2 39.3 44.5 48.7 54.2 60.8 65.5 64.9 60.1 50.2 42.1 35.8
#> 1934 39.4 38.2 40.4 46.9 53.4 59.6 66.5 60.4 59.2 51.2 42.8 45.8
#> 1935 40.0 42.6 43.5 47.1 50.0 60.5 64.6 64.0 56.8 48.6 44.2 36.4
#> 1936 37.3 35.0 44.0 43.9 52.7 58.6 60.0 61.1 58.1 49.6 41.6 41.3
#> 1937 40.8 41.0 38.4 47.4 54.1 58.6 61.4 61.8 56.3 50.9 41.4 37.1
#> 1938 42.1 41.2 47.3 46.6 52.4 59.0 59.6 60.4 57.0 50.7 47.8 39.2
#> 1939 39.4 40.9 42.4 47.8 52.4 58.0 60.7 61.8 58.2 46.7 46.6 37.8
#>
#> $nval
#> [1] 12
#>
#> $ePara
#> [1] "RMSE" "MAE" "MAPE"
#>
#> $ePara_name
#> [1] "RMSE" "MAE" "MAPE"
#>
#> $Method
#> [1] "ARIMA"
#>
#> $MethodName
#> [1] "ARIMA"
#>
#> $Strategy
#> [1] "Recursive"
#>
#> $dval
#> [1] 240
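Individual components of the returned object can be accessed with the @ operator, for example:
a@output$Error_Parameters   # error values only
a@parameters$nval           # number of predicted values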
A quick visualization of the object returned by the prediction_errors() function can be produced with the plot() function, as below:
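Mirroring the b1 <- plot(a1) calls used later in this document, a sketch of this call is (the rendered plot is not reproduced here):
b <- plot(a)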
As discussed above, the prediction_errors() function evaluates the performance of the ARIMA method. In addition, it allows comparing the performance of distinct methods along with ARIMA. In the following example, two methods (LPSF and PSF) are compared along with ARIMA. These methods are formatted in the form of a function, which takes data and nval as input parameters and must return nval forecasted values as a vector. In the following code, the test1() and test2() functions are used for the LPSF and PSF methods, respectively.
library(decomposedPSF)
# Wrapper for the LPSF method: forecast nval values ahead
test1 <- function(data, nval){
  return(lpsf(data = data, n.ahead = nval))
}

library(PSF)
# Wrapper for the PSF method: fit with a 12-month cycle, then predict nval values
test2 <- function(data, nval){
  a <- psf(data = data, cycle = 12)
  b <- predict(object = a, n.ahead = nval)
  return(b)
}
The following code chunk shows how a user can attach various methods to the prediction_errors() function. In this chunk, the append_ parameter is assigned 1, to append the new methods (LPSF and PSF) in addition to the default ARIMA method. On the contrary, if the append_ parameter were assigned 0, only the newly added LPSF and PSF methods would be compared.
a1 <- prediction_errors(data = nottem, nval = 48,
Method = c("test1(data, nval)", "test2(data, nval)"),
MethodName = c("LPSF","PSF"), append_ = 1)
a1@output$Error_Parameters
#> RMSE MAE MAPE exec_time
#> ARIMA 2.52331560 2.12806408 4.51353777 0.06366229
#> LPSF 2.4422070 1.9250000 4.2138243 0.1590395
#> PSF 2.19415390 1.70462240 3.62539036 0.08694172
b1 <- plot(a1)
Consider another function, test3(), which is to be added to an already existing prediction_errors object, e.g. a1.
library(forecast)
# Wrapper for the ETS method from the 'forecast' package
test3 <- function(data, nval){
  b <- as.numeric(forecast(ets(data), h = nval)$mean)
  return(b)
}
For this purpose, the append_() function can be used as follows. The append_() function has object, Method, MethodName, ePara and ePara_name parameters, with meanings similar to those in the prediction_errors() function. Other hidden parameters of the append_() function automatically get synced with the prediction_errors() function.
c1 <- append_(object = a1, Method = c("test3(data,nval)"), MethodName = c('ETS'))
c1@output$Error_Parameters
#> RMSE MAE MAPE exec_time
#> ARIMA 2.52331560 2.12806408 4.51353777 0.06366229
#> LPSF 2.4422070 1.9250000 4.2138243 0.1590395
#> PSF 2.19415390 1.70462240 3.62539036 0.08694172
#> ETS 38.29743056 36.85216463 73.47667823 0.02627611
d1 <- plot(c1)
When more than one method is established in the environment and the user wishes to remove one or more of them, the choose_() function can be used. This function takes a prediction_errors object as input, shows all methods established in the environment, and asks for the indices of the methods the user wants to remove. In the following example, the user supplied 4 as input, which reflects Method 4: ETS, and in response the choose_() function provides a new object with updated method lists.
# > e1 <- choose_(object = c1)
# Following are the methods attached with the object:
# [,1] [,2] [,3] [,4]
# Indices "1" "2" "3" "4"
# Methods "ARIMA" "LPSF" "PSF" "ETS"
#
# Enter the indices of methods to remove:4
#
# > e1@output$Error_Parameters
# RMSE MAE exec_time
# ARIMA 2.5233156 2.1280641 0.1963789
# LPSF 2.3915796 1.9361111 0.2990961
# PSF 2.2748736 1.8301389 0.1226711
By default, the prediction_errors() function compares forecasting methods in terms of RMSE, MAE and MAPE. In addition, it allows appending multiple new error metrics. The percent change in variance (PCV) is another error metric, defined as:
$PCV = \frac{\mid var(Predicted) - var(Observed) \mid}{var(Observed)} \times 100$
where var(Predicted) and var(Observed) are the variances of the predicted and observed values, respectively (the factor of 100 expresses the change as a percentage). The following code chunk defines the function for the PCV error metric:
# PCV: absolute percent difference between observed and predicted variances
pcv <- function(obs, pred){
  d <- (var(obs) - var(pred)) * 100 / var(obs)
  d <- abs(as.numeric(d))
  return(d)
}
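As a quick illustrative check on toy vectors (not part of the package):
obs  <- c(10, 12, 14, 16)   # observed values
pred <- c(11, 12, 13, 15)   # predicted values
pcv(obs, pred)              # absolute percent difference between the variances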
The following code chunk is used to append PCV as a new error metric in the existing prediction_errors object.
a1 <- prediction_errors(data = nottem, nval = 48,
Method = c("test1(data, nval)", "test2(data, nval)"),
MethodName = c("LPSF","PSF"),
ePara = "pcv(obs, pred)", ePara_name = 'PCV',
append_ = 1)
a1@output$Error_Parameters
#> RMSE MAE MAPE PCV exec_time
#> ARIMA 2.52331560 2.12806408 4.51353777 13.75707258 0.05308485
#> LPSF 2.4422070 1.9250000 4.2138243 12.6285740 0.2081342
#> PSF 2.35646643 1.95798611 4.28284019 2.78559366 0.08323026
b1 <- plot(a1)
A unique way of showing forecasted values, especially if these are seasonal values, is provided by the following function. The resulting plot shows how forecasted observations behave over an increasing number of seasonal time horizons.
Monte Carlo is a popular strategy for comparing the performance of forecasting methods: it randomly selects multiple patches of the dataset, tests the performance of the forecasting methods on each, and returns the average error values. The Monte Carlo strategy thus ensures an accurate comparison of forecasting methods and avoids biased results obtained by chance.
This package provides the monte_carlo() function for this purpose.
The parameters used in this function are:
object: output of the prediction_errors() function
size: volume of the time series used in the Monte Carlo strategy
iteration: number of iterations for which the models are to be applied
fval: a flag to view forecasted values in each iteration (default: 0, don't view values)
figs: a flag to view plots for each iteration (default: 0, don't view plots)
This function returns the error values for each method in every iteration, along with their mean:
a1 <- prediction_errors(data = nottem, nval = 48,
Method = c("test1(data, nval)"),
MethodName = c("LPSF"), append_ = 1)
monte_carlo(object = a1, size = 180, iteration = 10)
#> ARIMA LPSF
#> 21 2.931270 4.907598
#> 3 4.974178 5.665410
#> 23 3.190408 4.867142
#> 29 6.042312 4.480362
#> 56 2.484979 5.272920
#> 43 2.974569 5.066408
#> 31 2.472649 4.755019
#> 19 3.112520 5.253590
#> 10 3.829513 5.350396
#> 46 2.790409 5.076289
#> Mean 3.480281 5.069514
When the monte_carlo() function is run with the fval and figs flags switched on, it additionally returns the forecasted values for each iteration:
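The exact call is not shown here; a sketch, with size carried over from the previous example and iteration = 2 inferred from the two rows in the output below, would be:
monte_carlo(object = a1, size = 180, iteration = 2, fval = 1, figs = 1)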
#> $Error_Parameters
#> ARIMA LPSF
#> 59 2.979238 5.525509
#> 32 3.730970 5.654111
#> Mean 3.355104 5.589810
#>
#> $Predicted_Values
#> $Predicted_Values[[1]]
#> 1 2 3 4 5 6 7
#> Test values 45.80000 40.00000 42.60000 43.50000 47.10000 50.00000 60.50000
#> ARIMA 36.85791 34.08208 35.37813 40.21406 47.23467 54.50775 60.09563
#> LPSF 43.60000 41.80000 36.20000 39.30000 44.50000 48.70000 54.20000
#> 8 9 10 11 12 13 14
#> Test values 64.60000 64.00000 56.80000 48.60000 44.20000 36.40000 37.30000
#> ARIMA 62.55707 61.32137 56.81549 50.32443 43.62658 38.50791 36.28698
#> LPSF 60.80000 65.50000 64.90000 60.10000 50.20000 42.10000 35.80000
#> 15 16 17 18 19 20 21
#> Test values 35.00000 44.00000 43.90000 52.70000 58.60000 60.00000 61.10000
#> ARIMA 37.47692 41.67046 47.67279 53.84011 58.52861 60.53138 59.38766
#> LPSF 39.40000 38.20000 40.40000 46.90000 53.40000 59.60000 66.50000
#> 22 23 24 25 26 27 28
#> Test values 58.10000 49.60000 41.60000 41.30000 40.80000 41.00000 38.40000
#> ARIMA 55.48536 49.93535 44.25689 39.96289 38.15787 39.25529 42.88606
#> LPSF 60.40000 59.20000 51.20000 43.60000 41.80000 36.20000 39.30000
#> 29 30 31 32 33 34 35
#> Test values 47.40000 54.10000 58.60000 61.40000 61.80000 56.30000 50.90000
#> ARIMA 48.01748 53.24548 57.17777 58.80359 57.75224 54.37459 49.63051
#> LPSF 44.50000 48.70000 54.20000 60.80000 65.50000 64.90000 60.10000
#> 36 37 38 39 40 41 42
#> Test values 41.40000 37.1000 42.10000 41.20000 47.30000 46.60000 52.40000
#> ARIMA 44.81754 41.2169 39.75341 40.75914 43.90089 48.28656 52.71717
#> LPSF 50.20000 42.1000 35.80000 39.40000 38.20000 40.40000 46.90000
#> 43 44 45 46 47 48
#> Test values 59.00000 59.60000 60.40000 57.00000 50.70000 47.80000
#> ARIMA 56.01377 57.33026 56.36948 53.44756 49.39348 45.31512
#> LPSF 53.40000 59.60000 66.50000 60.40000 59.20000 51.20000
#>
#> $Predicted_Values[[2]]
#> 1 2 3 4 5 6 7
#> Test values 56.30000 47.30000 43.60000 41.80000 36.20000 39.30000 44.50000
#> ARIMA 58.74398 52.09439 44.54066 38.33014 34.98517 35.31626 39.11816
#> LPSF 61.60000 57.00000 50.90000 43.00000 38.80000 37.10000 38.40000
#> 8 9 10 11 12 13 14
#> Test values 48.70000 54.20000 60.80000 65.50000 64.90000 60.10000 50.20000
#> ARIMA 45.28746 52.13055 57.82934 60.92182 60.67549 57.25735 51.66246
#> LPSF 38.40000 46.50000 53.50000 58.40000 60.60000 58.20000 53.80000
#> 15 16 17 18 19 20 21
#> Test values 42.10000 35.8000 39.40000 38.20000 40.40000 46.90000 53.40000
#> ARIMA 45.42723 40.2092 37.34793 37.52283 40.59496 45.66853 51.34948
#> LPSF 46.60000 45.5000 40.60000 42.40000 38.40000 40.30000 44.60000
#> 22 23 24 25 26 27 28
#> Test values 59.60000 66.5000 60.40000 59.20000 51.20000 42.80000 45.80000
#> ARIMA 56.12686 58.7734 58.65915 55.89871 51.29828 46.12271 41.74918
#> LPSF 50.90000 57.0000 62.10000 60.57500 56.57500 49.40000 43.00000
#> 29 30 31 32 33 34 35
#> Test values 40.00000 42.60000 43.50000 47.10000 50.0000 60.50000 64.60000
#> ARIMA 39.30199 39.36499 41.84473 46.01576 50.7306 54.73404 56.99623
#> LPSF 39.87500 37.82500 37.17500 41.37500 45.6500 53.02500 57.42500
#> 36 37 38 39 40 41 42
#> Test values 64.00000 56.80000 48.6000 44.20000 36.40000 37.30000 35.00000
#> ARIMA 56.97625 54.74927 50.9679 46.67308 43.00875 40.91816 40.90229
#> LPSF 61.75000 60.00000 55.3250 49.02500 42.20000 38.50000 41.00000
#> 43 44 45 46 47 48
#> Test values 44.00000 43.90000 52.70000 58.60000 60.00000 61.10000
#> ARIMA 42.90173 46.32953 50.24147 53.59511 55.52658 55.57208
#> LPSF 40.00000 41.92500 46.92500 50.90000 57.65000 61.72500
The upcoming versions of the package will address the following features:
plot.MC()
bollinger_plot()
New simulation strategies in prediction_errors()