Newer
Older
---
title: "Online updating of onlineforecast models"
author: "Peder Bacher"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
toc: true
toc_debth: 3
vignette: >
%\VignetteIndexEntry{Online updating of onlineforecast models}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r external-code, cache=FALSE, include=FALSE, purl = FALSE}
# Have to load the knitr to use hooks
library(knitr)
# This vignettes name
vignettename <- "online-updating"
# REMEMBER: IF CHANGING IN THE shared-init (next block), then copy to the others!
<!--shared-init-start-->
# Width will scale all
figwidth <- 12
# Scale the wide figures (100% out.width)
figheight <- 4
# Heights for stacked time series plots
figheight1 <- 5
figheight2 <- 6.5
figheight3 <- 8
figheight4 <- 9.5
figheight5 <- 11
# Set the size of squared figures (same height as full: figheight/figwidth)
owsval <- 0.35
ows <- paste0(owsval*100,"%")
ows2 <- paste0(2*owsval*100,"%")
#
fhs <- figwidth * owsval
# Set for square fig: fig.width=fhs, fig.height=fhs, out.width=ows}
# If two squared the: fig.width=2*fhs, fig.height=fhs, out.width=ows2
# Check this: https://bookdown.org/yihui/rmarkdown-cookbook/chunk-styling.html
# Set the knitr options
knitr::opts_chunk$set(
collapse = TRUE,
comment = "## ",
prompt = FALSE,
cache = TRUE,
cache.path = paste0("../tmp/vignettes/tmp-",vignettename,"/"),
fig.align="center",
fig.path = paste0("../tmp/vignettes/tmp-",vignettename,"/"),
fig.height = figheight,
fig.width = figwidth,
out.width = "100%"
)
options(digits=3)
hook_output <- knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
lines <- options$output.lines
if (is.null(lines)) {
return(hook_output(x, options)) # pass to default hook
}
x <- unlist(strsplit(x, "\n"))
more <- "## ...output cropped"
if (length(lines)==1) { # first n lines
if (length(x) > lines) {
# truncate the output, but add ....
x <- c(head(x, lines), more)
}
} else {
x <- c(more, x[lines], more)
}
# paste these lines together
x <- paste(c(x, ""), collapse = "\n")
hook_output(x, options)
})
[onlineforecasting]: https://onlineforecasting.org/articles/onlineforecasting.pdf
[building heat load forecasting]: https://onlineforecasting.org/examples/building-heat-load-forecasting.html
[onlineforecasting.org]: https://onlineforecasting.org
<!--shared-init-end-->
## Intro
This vignette explains how to
Load the package:
```{r, echo=1:2, purl=1:2}
library(onlineforecast)
#library(devtools)
#load_all(as.package("../../onlineforecast"))
```
Load data, setup and define a model:
```{r, output.lines=10}
# Keep the data in D to simplify notation
# Set the score period
D$scoreperiod <- in_range("2010-12-20", D$t)
# Set the training period
D$trainperiod <- in_range(D$t[1], D$t, "2011-02-01")
# Define a new model with low-pass filtering of the Ta input
model <- forecastmodel$new()
model$output = "heatload"
model$add_inputs(Ta = "lp(Ta, a1=0.9)",
model$add_regprm("rls_prm(lambda=0.9)")
model$add_prmbounds(Ta__a1 = c(0.5, 0.9, 0.9999),
lambda = c(0.9, 0.99, 0.9999))
rls_optim(model, subset(D,D$trainperiod), kseq = c(3,18))
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
```
## Recursive update and prediction
How to get new data and update and predict.
First fit on a period
```{r}
iseq <- which(in_range("2010-12-15",D$t,"2011-01-01"))
Dfit <- subset(D, iseq)
rls_fit(model$prm, model, Dfit)
```
Now the fits are saved in the model object (its an R6 object, hence passed by reference to the functions and can be changed inside the functions). A list of fits with an entry for each horizon is in Lfits, see the two first
```{r}
str(model$Lfits[1:2])
```
Now new data arrives, take the point right after the fit period
```{r}
(i <- iseq[length(iseq)] + 1)
Dnew <- subset(D, i)
```
First we need to transform the new data (This must only be done once for each
new data, since some transform functions, e.g. lp(), actually keep states, see
the detailed description in )
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
```{r}
Dnew_transformed <- model$transform_data(Dnew)
```
Then we can update the parameters using the transformed data
```{r}
rls_update(model, Dnew_transformed, Dnew[[model$output]])
```
Calculate predictions using the new data and the updated fits (rls coefficient estimates in model$Lfits[[k]]$theta)
```{r}
yhat <- rls_predict(model, Dnew_transformed)
```
Plot to see that it fits the observations
```{r}
iseq <- i+model$kseq
plot(D$t[iseq], D$heatload[iseq], type = "b", xlab = "t", ylab = "y")
lines(D$t[iseq], yhat, type = "b", col = 2)
legend("topright", c("observations",pst("predictions (",min(model$kseq)," to ",max(model$kseq)," steps ahead)")), lty = 1, col = 1:2)
```
Run this for a longer period to verify that the same forecasts are obtained (in one go vs. iteratively)
First in one go
```{r}
val <- rls_fit(model$prm, model, D, returnanalysis = TRUE)
D$Yhat1 <- val$Yhat
```
and then iteratively
```{r}
itrain <- which(in_range("2010-12-15",D$t,"2011-01-01"))
itest <- which(in_range("2011-01-01",D$t,"2011-01-04"))
rls_fit(model$prm, model, subset(D, itrain))
D$Yhat2 <- data.frame(matrix(NA, nrow(D$Yhat1), ncol(D$Yhat1)))
names(D$Yhat2) <- names(D$Yhat1)
for(i in itest){
Dnew <- subset(D, i)
Dnewtr <- model$transform_data(Dnew)
rls_update(model, Dnewtr, Dnew[[model$output]])
D$Yhat2[i, ] <- as.numeric(rls_predict(model, Dnewtr))
}
```
Compare to see the difference between the one step forecasts
```{r}
D$Yhat1$k1[itest] - D$Yhat2$k1[itest]
```
Note about model$reset_states()