Conversation
| filter_recent_ts <- function(df, threshold = 202310) { | ||
| touchstone_year <- unique(df$touchstone) | ||
|
|
||
| # TODO: check that touchstone year is 6 digit - can there be more digits? |
There was a problem hiding this comment.
It should always be 6 numbers (year followed by month) in the dataset but sometimes report parameters have character additions for disease too e.g. 202602yf. I think fine to leave as 6 numbers for this filtering purpose
| #' @name constants | ||
| EXCLUDED_DISEASES <- c("Hib", "PCV", "Rota", "JE") | ||
|
|
||
| #' @name constants | ||
| TOUCHSTONE_OLD <- "201910" | ||
|
|
||
| #' @name constants | ||
| TOUCHSTONE_NEW <- "202310" | ||
|
|
||
| #' @name constants | ||
| TOUCHSTONE_OLD_OLD <- "202110" |
There was a problem hiding this comment.
I agree it's much cleaner have these separate to the functions, the only thing is, they need to be flexible as each time a report is run, users may want to change these, specifically the TOUCHSTONE_OLD etc. in order to compare different reports/ years
There was a problem hiding this comment.
I'll rename these to DEF_* for default X - that way they shouldn't interfere with users' analyses but it also allows us to have some sensible defaults.
| # TODO: replace `old` and `new` with defined objs --- see scratch.R | ||
| # unsure why this syntax was used |
There was a problem hiding this comment.
I can't see scratch.R? Agree the syntax could be more streamlined, I thought I had a reason at the time but I'm not sure now.
There was a problem hiding this comment.
Ah thanks - that comment is aimed at me - the scratch.R file is where I try out the functions.
| #' @export | ||
| gen_combined_df <- function(prev_dat, df2, interest_cols, key_cols) { | ||
| # TODO: input checks | ||
| # TODO: df2 needs a better name |
There was a problem hiding this comment.
df2 is essentially a cleaned version of of the touchstone_new latest report on packit where vaccine names are cleaned, subregions appended etc. happy to take suggestions on a better way to do it
There was a problem hiding this comment.
Thanks - could you suggest an informative name for the argument that somebody new to the package and the workflow would understand? data_current? data_cleaned? If you could pop a couple of lines in this thread about any expectations around it, I can add those to the function docs.
There was a problem hiding this comment.
df_clean is fine, thanks. df_clean is the imported dependency report (currently named touchstone_new) with the following cleaning steps applied:
- Malaria vaccine names amended
- Subregions appended
- Fixes for missing country names
- Removal of now-redundant columns
It may even be sufficient to just apply these steps todfitself, I think I just kept them separate when developing the report so I didn't have to keep pulling a clean dependency when testing things out.
| results | ||
| } | ||
|
|
||
| # TODO: reconsider function name, add explicit arguments |
There was a problem hiding this comment.
Yes - I think a more informative function name would be great - any suggestions? I'll also be adding some explicit arguments, as currently the function relies on having the objects it saves in its environment.
WIP as PR develops.