Rob edited this page Dec 16, 2022 · 3 revisions

Welcome to the naomi wiki!

debug

When a user submits a "Troubleshooting request" in the web app, this posts to a Microsoft Power Automate flow, which posts a message to the Naomi Support General channel. The report includes the IDs of the model fit, the calibration, and the spectrum, summary and coarse output downloads.

The science team have OneDrive synced on their machines. When they receive a new debug request they run the following (from a machine on the DIDE network):

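# Work inside the synced OneDrive debug folder
setwd("~/Imperial College London/Naomi Support - WP - 2022_debug/")
# ID of the model fit, taken from the troubleshooting request
job_id <- "97de55e96b20dc3e260a3d5863b5bde8"
# Ticket number, used as the name of the destination folder
ticket_id <- "65"
dir.create(as.character(ticket_id))
hintr::download_debug(job_id, dest = as.character(ticket_id))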

This runs download_debug with the model fit ID and saves the debug output into the specified folder in OneDrive.

The science team then run the model step by step locally to identify any data issues or model issues. Issues are communicated back to country teams/UNAIDS via email and described in comments on the debug request in the Teams channel.

When an issue is resolved, we document all steps and mark the post as resolved in the Teams channel.

What does download_debug return?

With a model fit ID

├── data.rds
└── files
    ├── 1B8D1D8DD8DADA83555D35E002B4328C.geojson
    ├── 1F2A5DE8E88025B28F591A20BACFB810.csv
    ├── 2BAB68CA6AE33B6A0EA62FB24A6E8AD9.csv
    ├── 4FF2DCBBC19CF3CA6A06429DEC878498.zip
    ├── 9C24170BC99B1FADDD0AB9BA26ECCDAF.csv
    └── F538D0FFB17720AA66EECEB5C909830A.csv
  • The files are the input files. Each is named with the hash of the file, keeping the extension of the original file
  • data.rds is an RDS file which has 3 objects at the top level: expr, objects and sessionInfo
    • expr is the R expression which was run by rrq
    • sessionInfo is the session info of the API which generated this debug output (note this might be different from the sessionInfo which ran the model)
    • objects contains the bulk of the interesting stuff. These are the inputs to the expr above. At the time of writing they are
      • data - info about all input data including path, hash and filename. The path is the path relative to the files dir, and the filename is the name of the uploaded file. Use this to find which file in the files dir is the input of a particular type.
      • options - the options set by the user for their model fit
      • results_dir - path to the results dir (on the server, not interesting for local debugging)
      • prerun_dir - path to the prerun dir (on the server, not very interesting for local debugging)
      • language - language set by the user

Note that this is the input data to hintr:::run_model. At the moment it does not include any output data.
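As a rough sketch of how the data object can be used to map input types to the hashed files, the helper below builds a lookup table from a downloaded debug folder. The function name debug_input_files is made up for illustration. It assumes objects$data is a named list keyed by input type, where each element has path, hash and filename fields, which matches the description above but has not been checked against every hintr version.

```r
# Hypothetical helper: map each input type to its local file in a debug folder.
# Assumes `objects$data` is a named list keyed by input type, and each element
# is a list with `path`, `hash` and `filename` (an assumption, see above).
debug_input_files <- function(debug_dir) {
  debug <- readRDS(file.path(debug_dir, "data.rds"))
  # Drop any optional inputs which were not uploaded
  inputs <- Filter(Negate(is.null), debug$objects$data)
  data.frame(
    type = names(inputs),
    filename = vapply(inputs, function(x) x$filename, character(1),
                      USE.NAMES = FALSE),
    # basename() guards against the stored path carrying a directory prefix
    local_path = vapply(inputs, function(x) {
      file.path(debug_dir, "files", basename(x$path))
    }, character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}
```

For the ticket above this would be called as debug_input_files("65") from inside the debug folder, giving one row per uploaded input.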

With a calibration ID

├── data.rds
└── files
    └── model_output1241143435e5.rds
  • files contains the model output file only. This is the output from hintr_run_model saved by naomi: https://github.com/mrc-ide/naomi/blob/master/R/run-model.R#L59
  • data.rds has the same 3 values as for a model fit: expr, objects and sessionInfo
    • expr is the R expression which was run by rrq
    • objects are the inputs passed to expr. They contain
      • model_output - the list of model output, including the path to model_output, the version info that the model output is from, and any warnings generated by the model fit
      • calibration_options - the calibration options set by the user
      • results_dir - path to the results dir (on the server, not interesting for local debugging)
      • language - language set by the user
    • sessionInfo is the session info of the API which generated this debug output (note this might be different from the sessionInfo which ran the calibration)

Note again that this contains no calibration outputs, only the inputs to the calibration.
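Because the model_output object stores a path on the server, it needs pointing at the local copy in files before it can be reused. A minimal sketch of this is below. The helper name local_model_output is made up for illustration, and the field name model_output_path is an assumption about how naomi names the path inside model_output, so check names(debug$objects$model_output) first.

```r
# Hypothetical helper: return the debug `model_output` object with its path
# pointing at the local copy of the file instead of the server path.
# `path_field` is an assumed field name; inspect the object to confirm it.
local_model_output <- function(debug_dir, path_field = "model_output_path") {
  debug <- readRDS(file.path(debug_dir, "data.rds"))
  local_file <- list.files(file.path(debug_dir, "files"),
                           pattern = "^model_output.*\\.rds$",
                           full.names = TRUE)
  # A calibration debug should contain exactly one model output file
  stopifnot(length(local_file) == 1)
  model_output <- debug$objects$model_output
  model_output[[path_field]] <- local_file
  model_output
}
```

The result, together with the calibration_options from objects, could then be passed to naomi::hintr_calibrate to rerun the calibration locally.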

With an output ID, e.g. spectrum download

├── data.rds
└── files
    ├── model_output756b7b80e4d9.rds
    └── plot_data756b3b12a377.rds
  • files contains the output from calibration and the plot data, i.e. the output from calibrate: https://github.com/mrc-ide/naomi/blob/3ffd73bc9dd5a01dae5ed61a2251c5363c86a968/R/run-model.R#L170
  • data.rds has the same 3 values as for a model fit: expr, objects and sessionInfo
    • expr is the R expression which was run by rrq
    • objects are the inputs passed to expr. They contain
      • model_output - the list of model output, including the path to model_output, the path to plot_data, the version info that the model calibration output is from, and any warnings generated by the model calibration or fit
      • type - the type of download being run
      • results_dir - path to the results dir (on the server, not interesting for local debugging)
      • notes - any notes set by the user on their project
      • state - state info for the front end web app
      • language - language set by the user
    • sessionInfo is the session info of the API which generated this debug output (note this might be different from the sessionInfo which ran the download generation)

Note again it does not contain the download outputs.
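Since all three kinds of debug share the same layout, it is easy to lose track of which ID a folder came from. The sketch below guesses the kind from the names in objects, using only the fields listed above. The function name debug_type and the labels it returns are made up for illustration, and the rules will need updating if the objects change.

```r
# Hypothetical helper: guess which kind of debug a folder holds from the
# names of the objects in data.rds, based on the lists in this page.
debug_type <- function(debug_dir) {
  debug <- readRDS(file.path(debug_dir, "data.rds"))
  stopifnot(all(c("expr", "objects", "sessionInfo") %in% names(debug)))
  object_names <- names(debug$objects)
  if ("calibration_options" %in% object_names) {
    "calibration"
  } else if ("type" %in% object_names) {
    "download"
  } else if (all(c("data", "options") %in% object_names)) {
    "model fit"
  } else {
    "unknown"
  }
}
```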

Improvements

When there are many debugs coming in, it can be tricky to keep track of what is resolved and what state each request is in. We have the Excel file, which the MS flow also writes to when new requests come in https://imperiallondon.sharepoint.com/:x:/r/sites/NaomiSupport-WP/Shared%20Documents/General/Naomi%20troubleshooting%20requests.xlsx?d=wa5e0b236b4b94cbfb1b2f6f8c4eabeb7&csf=1&web=1&e=uYB0ZS but this is not always in sync with the comments which are put on a ticket. Commenting is also annoying, as it moves the post which was commented on to the end of the conversation. Could we make this better in some way?

  • When comments are posted on a ticket we could put these into the spreadsheet automatically?

  • Have a button on a post to mark it as resolved, which hides or deletes it? And then just put a link to the discussion into the spreadsheet?

  • Include the comparison report download ID in the troubleshooting request teams post

  • Include the output in hintr_debug, at the moment it only includes the input data

prerun

naomi has a way to import prerun model results for a user. This works as follows:

  1. Start a project on the destination web app and upload all the input files. (This is a temporary measure. Ideally it would not be needed, but hint maintains a list of known files in the database, so any files not known to hint will cause errors.)
  2. Run and calibrate a model locally using naomi::hintr_run_model and naomi::hintr_calibrate
  3. Push a prerun to the server using hintr::hintr_submit_prerun. This must be run on a machine on the DIDE network or on the VPN. This will create an output zip which can then be uploaded into the web app or uploaded onto the ADR and can be used to recover a project.

Example

data <- list(shape = "~/projects/hintr/tests/testthat/testdata/malawi.geojson",
             pjnz = "~/projects/hintr/tests/testthat/testdata/Malawi2019.PJNZ",
             art_number = "~/projects/hintr/tests/testthat/testdata/programme.csv",
             population = "~/projects/hintr/tests/testthat/testdata/population.csv",
             survey = "~/projects/hintr/tests/testthat/testdata/survey.csv",
             anc_testing = "~/projects/hintr/tests/testthat/testdata/anc.csv")
options <- list(
    area_scope = "MWI",
    area_level = 4,
    calendar_quarter_t1 = "CY2016Q1",
    calendar_quarter_t2 = "CY2018Q3",
    calendar_quarter_t3 = "CY2019Q2",
    survey_prevalence = c("DEMO2016PHIA", "DEMO2015DHS"),
    survey_art_coverage = "DEMO2016PHIA",
    survey_recently_infected = "DEMO2016PHIA",
    include_art_t1 = "true",
    include_art_t2 = "true",
    anc_clients_year2 = 2018,
    anc_clients_year2_num_months = "9",
    anc_prevalence_year1 = 2016,
    anc_prevalence_year2 = 2018,
    anc_art_coverage_year1 = 2016,
    anc_art_coverage_year2 = 2018,
    spectrum_population_calibration = "none",
    spectrum_plhiv_calibration_level = "none",
    spectrum_plhiv_calibration_strat = "sex_age_coarse",
    spectrum_artnum_calibration_level = "none",
    spectrum_artnum_calibration_strat = "sex_age_coarse",
    spectrum_infections_calibration_level = "none",
    spectrum_infections_calibration_strat = "sex_age_coarse",
    spectrum_aware_calibration_level = "none",
    spectrum_aware_calibration_strat = "sex_age_coarse",
    calibrate_method = "logistic",
    artattend_log_gamma_offset = -4,
    artattend = "false",
    output_aware_plhiv = "true",
    rng_seed = 17,
    no_of_samples = 500,
    max_iter = 250)
model_output <- naomi::hintr_run_model(data, options)

calibration_options <- list(
  spectrum_plhiv_calibration_level = "subnational",
  spectrum_plhiv_calibration_strat = "sex_age_group",
  spectrum_artnum_calibration_level = "subnational",
  spectrum_artnum_calibration_strat = "sex_age_coarse",
  spectrum_aware_calibration_level = "national",
  spectrum_aware_calibration_strat = "age_coarse",
  spectrum_infections_calibration_level = "none",
  spectrum_infections_calibration_strat = "age_coarse",
  calibrate_method = "logistic"
)
calibrate_output <- naomi::hintr_calibrate(model_output, calibration_options)

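# Rename the inputs to the keys used when submitting the prerun
names(data)[names(data) == "anc_testing"] <- "anc"
names(data)[names(data) == "art_number"] <- "programme"
# Must be run from a machine on the DIDE network or on the VPN
hintr::hintr_submit_prerun(data, model_output, calibrate_output, server = "http://naomi-staging.dide.ic.ac.uk")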
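The example above assumes every path in data exists on the local machine. A small pre-flight check, run before naomi::hintr_run_model, can save a failed fit. This is only a sketch, the helper name missing_inputs is made up for illustration, and it assumes data is a named list of file paths as in the example.

```r
# Hypothetical helper: return the names of any inputs whose file is missing.
# Assumes `data` is a named list with one file path per input type.
missing_inputs <- function(data) {
  # Expand "~" so the check matches what R would open
  paths <- vapply(data, path.expand, character(1))
  names(paths)[!file.exists(paths)]
}
```

An empty result, character(0), means all input files were found.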