Handle repeated target data updates with the same as-of date

These lines append to the existing table every time the function runs. If the function runs more than once on the same day, the result is duplicate or conflicting rows that represent the same datapoint with the same as-of date:

https://github.com/CDCgov/hubhelpr/blob/b63736300d6e476c83d13fa0db8a0e06759689af/R/update_hub_target_data.R#L143-L154

The function should instead:

- Take an `overwrite_existing` keyword argument, default `FALSE`.
- When `overwrite_existing` is `TRUE`, overwrite all and only the rows with the given `as_of` date that appear in both the new data and the old. That is, if there are datapoints with the current `as_of` date that are _not_ contained in the update (say the update only contains NSSP, and we've previously updated NHSN), they are _not_ deleted. In the case of conflicts, we defer to the update and overwrite the old as-of-today entries.
- When `overwrite_existing` is `FALSE`, error if there are any conflicts.
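
The behavior in the list above could be sketched roughly as follows. This is a minimal sketch, not the hubhelpr implementation: the helper name `merge_target_data` and the key columns (`as_of`, `target`, `location`, `date`) are assumptions made for illustration, and the sketch treats any existing row that shares a key with the update as a conflict, even if its values are identical.

```r
library(dplyr)

# Hypothetical helper: the name and the default key columns are assumptions,
# not part of hubhelpr.
merge_target_data <- function(existing_data,
                              new_data,
                              overwrite_existing = FALSE,
                              key_cols = c("as_of", "target", "location", "date")) {
  # Nothing stored yet: the update is the whole table.
  if (is.null(existing_data)) {
    return(new_data)
  }

  # Existing rows whose key (including as_of) also appears in the update.
  shared <- dplyr::semi_join(existing_data, new_data, by = key_cols)

  if (nrow(shared) > 0 && !overwrite_existing) {
    stop(
      nrow(shared),
      " existing row(s) share a key with the update; ",
      "set `overwrite_existing = TRUE` to replace them."
    )
  }

  # Drop only the shared rows from the old data, keep everything else,
  # then append the update so its values win.
  existing_data |>
    dplyr::anti_join(new_data, by = key_cols) |>
    dplyr::bind_rows(new_data)
}
```

Because `as_of` is part of the key, rows from earlier as-of dates and rows for sources missing from the update (for example NHSN rows when only NSSP is updated) are left untouched.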

A complete PR should include unit tests verifying that all of the above occurs. This would be a good opportunity for test-driven development (i.e. write the tests first and confirm they fail).
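
A test-first starting point might look like the sketch below. It assumes `testthat` and `dplyr` are available. The fixture columns, `make_fixture`, `append_only`, and `check_overwrite_contract` are all hypothetical names introduced for illustration; `append_only` only imitates the current append behavior so there is something to run the expectations against before the real fix exists.

```r
library(testthat)

# Hypothetical fixture: column names and values are assumptions.
make_fixture <- function() {
  as_of <- as.Date("2025-01-08")
  list(
    existing = data.frame(
      as_of = as_of, target = c("nhsn", "nssp"), location = "US",
      date = as.Date("2025-01-04"), observation = c(10, 0.5)
    ),
    update = data.frame(
      as_of = as_of, target = "nssp", location = "US",
      date = as.Date("2025-01-04"), observation = 0.7
    )
  )
}

# Stand-in for the current behavior: always append, ignore the flag.
append_only <- function(existing_data, new_data, overwrite_existing = FALSE) {
  dplyr::bind_rows(existing_data, new_data)
}

# Runs the expectations from the issue against any function with the
# signature (existing_data, new_data, overwrite_existing).
check_overwrite_contract <- function(merge_fn) {
  fx <- make_fixture()

  test_that("conflicts error when overwrite_existing is FALSE", {
    expect_error(merge_fn(fx$existing, fx$update, overwrite_existing = FALSE))
  })

  test_that("overwrite replaces shared rows and keeps the others", {
    result <- merge_fn(fx$existing, fx$update, overwrite_existing = TRUE)
    expect_equal(nrow(result), 2)
    expect_equal(result$observation[result$target == "nssp"], 0.7)
    expect_equal(result$observation[result$target == "nhsn"], 10)
  })
}
```

Calling `check_overwrite_contract(append_only)` should report failures, which is the "confirm it fails" step; the same check can then be pointed at the fixed logic.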

	output_file <- fs::path(output_dirpath, "time-series", ext = "parquet")
	if (fs::file_exists(output_file)) {
	  existing_data <- forecasttools::read_tabular_file(output_file)
	} else {
	  existing_data <- NULL
	}
	dplyr::bind_rows(
	  existing_data,
	  hubverse_format_nhsn_data,
	  hubverse_format_nssp_data
	) |>
	  forecasttools::write_tabular_file(output_file)
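
The effect of the append step can be reproduced in isolation. The sketch below uses an in-memory data frame instead of the parquet file, and its column names are assumptions rather than the actual hubhelpr schema:

```r
library(dplyr)

# Illustrative stand-in for one day's formatted update (column names assumed).
update <- data.frame(
  as_of = as.Date("2025-01-08"),
  target = "nssp",
  location = "US",
  date = as.Date("2025-01-04"),
  observation = 0.5
)

# Mimic two runs on the same day: each run appends to what is already stored.
stored <- NULL
for (run in 1:2) {
  stored <- dplyr::bind_rows(stored, update)
}

nrow(stored)                  # 2: the same datapoint is stored twice
nrow(dplyr::distinct(stored)) # 1: only one unique row exists
```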

Handle repeated target data updates with the same as-of date #66
