-
Notifications
You must be signed in to change notification settings - Fork 0
DRYify Map And Forecast Functions #71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I am going to update the |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #71 +/- ##
========================================
+ Coverage 8.32% 8.74% +0.42%
========================================
Files 10 10
Lines 769 743 -26
========================================
+ Hits 64 65 +1
+ Misses 705 678 -27 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…a-and-get_forecast_data
R/summarize_ref_date_forecasts.R
Outdated
| checkmate::assert_character(excluded_locations) | ||
| checkmate::assert_character(targets, null.ok = TRUE) | ||
| checkmate::assert_character(model_ids, null.ok = TRUE) | ||
| checkmate::assert_data_frame(population_data, null.ok = TRUE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these type checks can all be omitted. AFAICT, everything we do with them below is can rely on duck typing principles
R/summarize_ref_date_forecasts.R
Outdated
| forecasts_data <- forecasts_data |> | ||
| dplyr::arrange(.data$location_sort_order, .data$location_name) | ||
|
|
||
| if (include_metadata && !is.null(model_metadata)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you filter to desired columns when producing the output tables, you can always fetch and include the metadata, which will avoid extra switches
R/summarize_ref_date_forecasts.R
Outdated
| checkmate::assert_data_frame(population_data, null.ok = TRUE) | ||
| checkmate::assert_logical(include_metadata, len = 1) | ||
|
|
||
| if (!is.null(population_data)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to https://github.com/CDCgov/hubhelpr/pull/71/files#r2593041996, let's just make the population data table required for this function (even if some downstream tables we may choose to produce don't end up including the per 100k columns(). We'll always have the population data available and that way the output data frame for this function has a predictable set of columns
R/summarize_ref_date_forecasts.R
Outdated
| if (!is.null(model_ids)) { | ||
| current_forecasts <- current_forecasts |> | ||
| dplyr::filter(.data$model_id %in% !!model_ids) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this down to alongside the target filter and use nullable_comparison
R/write_ref_date_summary.R
Outdated
| #' model metadata (team_name, model_name). Default: TRUE. | ||
| #' @param column_selection named character vector | ||
| #' specifying which columns to select and rename. If NULL, | ||
| #' includes all columns. Default: NULL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly doesn't have to be named; if it isn't you just wont get renamings. You can even mix and match a bit, though there are quirks. See last example
Compare
df <- tibble::tibble(x=1:5,y=10:14)
dplyr::select(df, tidyselect::all_of(c(a="x", b="y", c="x")))
dplyr::select(df, tidyselect::all_of(c("x", "y", "x")))
dplyr::select(df, tidyselect::all_of(c("y", "x", c="x")))
R/write_ref_date_summary.R
Outdated
|
|
||
| if (!is.null(column_selection)) { | ||
| summary_data <- summary_data |> | ||
| dplyr::select(!!!column_selection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dplyr::select(!!!column_selection) | |
| dplyr::select(tidyselect::all_of(column_selection)) |
R/write_ref_date_summary.R
Outdated
| model_ids = NULL, | ||
| population_data = NULL, | ||
| include_metadata = TRUE, | ||
| column_selection = NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have this default to tidyselect::everything() you can avoid the if is NULL check
| column_selection = NULL | |
| column_selection = tidyselect::everything() |
R/write_ref_date_summary.R
Outdated
| checkmate::assert_string(file_suffix) | ||
| checkmate::assert_character(column_selection, null.ok = TRUE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above
| checkmate::assert_string(file_suffix) | |
| checkmate::assert_character(column_selection, null.ok = TRUE) |
R/write_ref_date_summary.R
Outdated
| checkmate::assert_data_frame(population_data) | ||
| checkmate::assert_names( | ||
| colnames(population_data), | ||
| must.include = c("location", "population") | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defer checking the population data to summarize_ref_date_forecasts
| checkmate::assert_data_frame(population_data) | |
| checkmate::assert_names( | |
| colnames(population_data), | |
| must.include = c("location", "population") | |
| ) |
R/get_forecast_data.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the thought to retain this for backward compatibility? I think it is not used enough yet to make this worthwhile. I would just delete and make sure we port over the (currently one) hub that uses it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally wrote a deprecation comment on both get_*.R files, but I left them with minimal comment, in hopes that you might afford some remarks. I think deleting them makes the most sense (given there is not really a user base who've become accustomed to them) but didn't want to make this decision myself.
R/get_map_data.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as get_forecast_data.R
dylanhmorris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. Thanks, @O957. A few changes needed.
R/summarize_ref_date_forecasts.R
Outdated
| @@ -0,0 +1,141 @@ | |||
| #' Summarize forecast hub data for a specific reference | |||
| #' date. This function generates a tibble of forecast data | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first line becomes the header in R function descriptions, so it is worth having a line break.
| #' date. This function generates a tibble of forecast data | |
| #' date. | |
| #' | |
| #' This function generates a tibble of forecast data |
R/summarize_ref_date_forecasts.R
Outdated
| excluded_locations = character(0), | ||
| targets = NULL, | ||
| model_ids = NULL, | ||
| population_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this before the keyword args
R/summarize_ref_date_forecasts.R
Outdated
|
|
||
| model_metadata <- hubData::load_model_metadata( | ||
| base_hub_path, | ||
| model_ids = NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the user-passed model_ids here?
R/summarize_ref_date_forecasts.R
Outdated
| forecast_due_date_formatted = format(.data$forecast_due_date, "%B %d, %Y") | ||
| ) | ||
|
|
||
| forecasts_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer explicit returns. Some R linters disagree on the default setting, but this can be adjusted.
| forecasts_data | |
| return(forecasts_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added explicit returns elsewhere across the codebase where they didn't occur.
R/write_ref_date_summary.R
Outdated
| #' @param column_selection character vector specifying | ||
| #' which columns to select. Uses tidyselect semantics. | ||
| #' Default: tidyselect::everything(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #' @param column_selection character vector specifying | |
| #' which columns to select. Uses tidyselect semantics. | |
| #' Default: tidyselect::everything(). | |
| #' @param column_selection Columns to include in the output table. | |
| #' Uses [tidy selection](https://dplyr.tidyverse.org/articles/programming.html). | |
| #' Default: [tidyselect::everything()]. |
dylanhmorris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. A few remaining questions.
I think I got everything, including the renamings. Thank you thus far for the review comments. There were also a few more lines breaks that I failed to add for other functions; I forgot about the fact that these first lines become the R function descriptions. |
…a-and-get_forecast_data
…a-and-get_forecast_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @O957!
|
@O957 I think you can merge. The CI failure looks to be a runner issue, not an issue with the codebase |
Thank you, I thought so, but got distracted after I re-ran the failed job. |
This PR:
write_ref_date_summary.Randsummarize_ref_date_forecasts.R, which DRYify processes acrossget_forecast_data.Randget_map_data.R.