vignettes/custom_epiworkflows.Rmd
44 additions & 21 deletions
@@ -19,14 +19,17 @@ library(recipes)
19
19
library(epipredict)
20
20
library(epiprocess)
21
21
library(ggplot2)
22
+
library(rlang) # for %@%
22
23
forecast_date <- as.Date("2021-08-01")
23
24
used_locations <- c("ca", "ma", "ny", "tx")
24
25
library(epidatr)
25
26
```
26
27
27
28
If you want to do custom data preprocessing or fit a model that isn't included in the canned workflows, you'll need to write a custom `epi_workflow()`.
29
+
An `epi_workflow()` is a sub-class of a `workflows::workflow()` from the
30
+
`{workflows}` package designed to handle panel data specifically.
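
One hedged way to see the sub-class relationship is to inspect the class vector of an empty workflow. The sketch below assumes `epi_workflow()` can be called with no arguments to give an empty workflow; `empty_wf` is only an illustrative name.

```r
library(epipredict)

# Create an empty workflow (no preprocessor, model, or postprocessor yet).
empty_wf <- epi_workflow()

# Inspect the class vector; "epi_workflow" should sit ahead of "workflow".
class(empty_wf)

# Code written for the parent {workflows} class can check for it like this.
inherits(empty_wf, "workflow")
```

Because the object still inherits from `workflow`, much of what is documented for `{workflows}` should carry over, with the `epi_workflow()` layer adding the panel-data handling.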
28
31
29
-
To get understand how to work with custom `epi_workflow()`s, let's recreate and then
32
+
To understand how to work with custom `epi_workflow()`s, let's recreate and then
30
33
modify the `four_week_ahead` example from the [landing
31
34
page](../index.html#motivating-example).
32
35
Let's first remind ourselves how to use a simple canned workflow:
@@ -133,9 +136,11 @@ parameters have already been calculated based on the training data set.
133
136
Let's create an `epi_recipe()` to hold the 6 steps:
We can inspect newly-created columns by running `bake()` on the
188
-
recipe so far:
199
+
`bake()` applies a prepared recipe to a (potentially new) data set, producing the data in the form that is handed to the `epi_workflow()`.
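
As a minimal sketch of the `prep()`/`bake()` split, here is a plain `{recipes}` example on the built-in `mtcars` data rather than an `epi_recipe()`. The names `rec`, `prepped`, `new_rows`, and `baked` are illustrative and not part of the vignette.

```r
library(recipes)

# A plain {recipes} recipe on a built-in data set (not an epi_recipe).
rec <- recipe(mpg ~ wt + hp, data = mtcars) |>
  step_normalize(all_numeric_predictors())

# prep() estimates the step parameters (here, means and standard
# deviations of the predictors) from the training data.
prepped <- prep(rec, training = mtcars)

# bake() applies those stored parameters to a (potentially new) data set.
new_rows <- head(mtcars, 3)
baked <- bake(prepped, new_data = new_rows)
baked
```

The baked predictors are centered and scaled with the statistics learned in `prep()`, not recomputed from `new_rows`, which is why the two stages are kept separate.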
200
+
We can inspect newly-created columns by running `bake()` on the recipe so far:
189
201
190
202
```{r bake_recipe}
191
203
four_week_recipe |>
@@ -243,7 +255,7 @@ On the other hand, the layers that are only supported by quantile estimating
243
255
engines (such as `quantile_reg()`) are
244
256
245
257
- `layer_quantile_distn()`: adds the specified quantiles.
246
-
If they differ from the ones actually fit, they will be interpolated and/or
258
+
If the quantile levels specified differ from the ones actually fit, they will be interpolated and/or
247
259
extrapolated.
248
260
- `layer_point_from_distn()`: this adds the median quantile as a point estimate,
249
261
and, if called, should be included after `layer_quantile_distn()`.
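
The ordering of these two layers can be sketched as follows. This is a hedged example that only builds the postprocessor and does not fit anything. It assumes a version of `{epipredict}` in which the levels argument of `layer_quantile_distn()` is named `quantile_levels`; the name `quantile_frosting` and the chosen levels are illustrative.

```r
library(epipredict)

# Build a postprocessor: predictions first, then the quantile layers.
quantile_frosting <- frosting() |>
  layer_predict() |>
  # Request these quantile levels (assumed argument name: quantile_levels).
  layer_quantile_distn(quantile_levels = c(0.1, 0.5, 0.9)) |>
  # Add the median as a point estimate; placed after layer_quantile_distn().
  layer_point_from_distn()

quantile_frosting
```

A frosting like this only makes sense when the workflow's engine estimates quantiles, such as `quantile_reg()`.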
@@ -272,8 +284,7 @@ However, it does not generate any predictions; predictions need to be created in
272
284
273
285
## Predicting
274
286
275
-
To make a prediction, we need to first narrow a data set down to the relevant observations.
276
-
This process removes observations that will not be used for training because, for example, they contain missing values or <!-- TODO other reasons?-->.
287
+
To make a prediction, it helps to narrow the data set down to the relevant observations using `get_test_data()`. Skipping this step still works, but the workflow will then predict for every day in the data set rather than just for the `reference_date`.
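
The narrowing step can be sketched on a small made-up panel. Everything below is an assumption for illustration: `toy_data`, `toy_recipe`, and `toy_test_data` are hypothetical names, the rates are invented, and the recipe is far simpler than the one built in this vignette.

```r
library(epipredict)
library(epiprocess)

# Hypothetical toy panel: two locations, 30 days of made-up rates.
toy_data <- tibble::tibble(
  geo_value = rep(c("ca", "ny"), each = 30),
  time_value = rep(as.Date("2021-07-03") + 0:29, times = 2),
  case_rate = rep(seq(10, 39), times = 2)
) |>
  as_epi_df()

# A small recipe whose longest lag is 7 days.
toy_recipe <- epi_recipe(toy_data) |>
  step_epi_lag(case_rate, lag = c(0, 7)) |>
  step_epi_ahead(case_rate, ahead = 7) |>
  step_epi_naomit()

# Keep only the recent rows the recipe needs to predict from the latest date.
toy_test_data <- get_test_data(toy_recipe, toy_data)

# Compare the date ranges before and after narrowing.
range(toy_data$time_value)
range(toy_test_data$time_value)
```

Passing the narrowed data to `predict()` should then yield forecasts anchored at the latest date only, instead of one per day in the full data set.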
COVID-19 forecasting and hotspot prediction?.” Proceedings of the National
598
619
Academy of Sciences 118.51 (2021): e2111453118. doi:10.1073/pnas.2111453118
620
+
621
+
[^4]: Note that `prep()` and `bake()` are standard `{recipes}` functions, so any discussion of them there applies just as well here; see, for example, the [guide to creating a new step](https://www.tidymodels.org/learn/develop/recipes/#create-the-prep-method).