Description
suppressPackageStartupMessages({
  library(dplyr)
  library(epiprocess)
})
vctrs::vec_rbind(
  tibble::tibble(geo_value = 1, time_value = 1:4 + 0, value = 1:4),
  tibble::tibble(geo_value = 2, time_value = 3:5 + 0, value = 11:13)
) %>%
  as_epi_df() %>%
  epi_slide(~ sum(.x$value), .window_size = Inf)
#> An `epi_df` object, 7 x 4 with metadata:
#> * geo_type = hhs
#> * time_type = integer
#> * as_of = 2025-04-08 16:57:38.919515
#>
#> # A tibble: 7 × 4
#>   geo_value time_value value slide_value
#>       <dbl>      <dbl> <int>       <int>
#> 1         1          1     1           1
#> 2         1          2     2           3
#> 3         1          3     3           6
#> 4         1          4     4          10
#> 5         2          3    11          NA
#> 6         2          4    12          NA
#> 7         2          5    13          NA
# (We get the same result with epi_slide_sum; something like this is in our test suite.)
vctrs::vec_rbind(
  tibble::tibble(geo_value = 1, time_value = 1:4 + 0, value = 1:4),
  tibble::tibble(geo_value = 2, time_value = 3:5 + 0, value = 11:13)
) %>%
  as_epi_df() %>%
  epi_slide_sum(value, .window_size = Inf)
#> An `epi_df` object, 7 x 4 with metadata:
#> * geo_type = hhs
#> * time_type = integer
#> * as_of = 2025-04-08 16:57:39.021215
#>
#> # A tibble: 7 × 4
#>   geo_value time_value value value_running_sum
#>       <dbl>      <dbl> <int>             <dbl>
#> 1         1          1     1                 1
#> 2         1          2     2                 3
#> 3         1          3     3                 6
#> 4         1          4     4                10
#> 5         2          3    11                NA
#> 6         2          4    12                NA
#> 7         2          5    13                NA
Created on 2025-04-08 with reprex v2.1.1
The NAs in the second group presumably come from completing time values 1 and 2 with NAs. Is this what we want? On one hand, it makes the set of input `time_value`s contributing to each output `time_value` the same for every `geo_value`. On the other hand, it makes the result inconsistent with what one might expect from explicitly spelling out `edf %>% group_by(geo_value) %>% epi_slide(....) %>% ungroup()`, i.e., that it would be the same as group-splitting, mapping the same operation over each group, and recombining. (We might have some other, lesser violations of this expectation with period inference somewhere, maybe in `epix_slide`, but in general I think we've been following it. Another violation is in the handling of explicit `.ref_time_values`: if we split out into geos with partial ref time availability, then we would raise an error.)
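
To make the two candidate semantics concrete, here is a rough sketch in plain dplyr/tidyr that deliberately avoids epiprocess. It is not how `epi_slide` is implemented; the names `tbl`, `per_group`, and `completed` are made up for illustration, and `cumsum` stands in for a sum over an infinite trailing window. The first pipeline is the "split by geo, map, recombine" expectation; the second first completes every geo to the shared set of time values, which is my guess at what produces the NAs above.

```r
suppressPackageStartupMessages(library(dplyr))

# Same toy data as in the reprex, but as a plain tibble (no epi_df).
tbl <- bind_rows(
  tibble(geo_value = 1, time_value = 1:4 + 0, value = 1:4),
  tibble(geo_value = 2, time_value = 3:5 + 0, value = 11:13)
)

# Semantics A: treat each geo independently (split / map / recombine).
per_group <- tbl %>%
  group_by(geo_value) %>%
  arrange(time_value, .by_group = TRUE) %>%
  mutate(value_running_sum = cumsum(value)) %>%
  ungroup()
per_group$value_running_sum
#> [1]  1  3  6 10 11 23 36

# Semantics B (assumed): pad every geo to the shared time grid with NA
# values, compute the running sum, then keep only the original rows.
completed <- tbl %>%
  tidyr::complete(geo_value, time_value = tidyr::full_seq(time_value, 1)) %>%
  group_by(geo_value) %>%
  arrange(time_value, .by_group = TRUE) %>%
  mutate(value_running_sum = cumsum(value)) %>%
  ungroup() %>%
  semi_join(tbl, by = c("geo_value", "time_value"))
completed$value_running_sum
#> [1]  1  3  6 10 NA NA NA
```

Semantics B reproduces the values in the reprex output, while semantics A gives 11, 23, 36 for the second geo, which is what I would naively expect from the explicit `group_by(geo_value)` spelling.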