This page is intended to provide teams with all the information they need to submit forecasts.
Update: All forecasts should be emailed to [email protected].
These instructions describe the data format and the validation that you can run before opening a pull request. In addition, we describe the metadata that each model should provide, either by email or in a pull request to data-forecasts/.
See the guidelines and data-surveillance/ folder for the ESSENCE CLI case definition and an example data file.
Table of Contents
- Data formatting for submission
- Forecast file format
- Making a submission
- Forecast data validation
- Policy on late submissions
Forecast files submitted to this collaboration should adhere to the formatting instructions below
to ensure that each file can be used in the visualization and in ensemble forecasting.
Each subdirectory within the data-forecasts/ directory has the format
team-model
where team is the name of your team and model is the name of your model. Both team and model should be less than 15 characters and should not include hyphens. The model name should be unique among all models in the project.
Within each subdirectory, there should be a metadata file, a license file (optional), and a set of forecasts.
The metadata file should have the following format
metadata-team-model.txt
and here is the structure of the metadata file.
By default, forecasts are released under a CC-BY 4.0 license. If you would like to release your forecasts under a different license, please specify a standard license in the license field of your metadata file. Alternatively, if you wish to use a license that is not in the list of standard licenses, you may include a LICENSE.txt file in your model directory.
Each forecast file should have the following format
YYYY-MM-DD-team-model.csv
where YYYY is the 4 digit year, MM is the 2 digit month, DD is the 2 digit day, team is the name of your team, and model is the name of your model.
The date YYYY-MM-DD is the forecast_date. For this project, the forecast_date should always be the date that the submission is due.
The team and model in the file name must match the team and model of the directory the file is in. Both team and model should be less than 15 characters, alphanumeric characters and underscores only, with no spaces or hyphens.
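The naming rules above can be checked before submitting. The following is a minimal sketch, not the project's validation code: the helper name parse_forecast_filename and the regular expression are assumptions that encode the stated rules (a date prefix, then team and model names of fewer than 15 alphanumeric or underscore characters).

```python
import re
from datetime import date

# Hypothetical pattern for one name: 1 to 14 letters, digits, or underscores.
NAME = r"[A-Za-z0-9_]{1,14}"
FILENAME_RE = re.compile(rf"(\d{{4}})-(\d{{2}})-(\d{{2}})-({NAME})-({NAME})\.csv")


def parse_forecast_filename(filename, directory):
    """Return (forecast_date, team, model), or raise ValueError if a rule is broken."""
    match = FILENAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not match YYYY-MM-DD-team-model.csv")
    year, month, day, team, model = match.groups()
    # date() raises ValueError for impossible dates such as 2021-02-30.
    forecast_date = date(int(year), int(month), int(day))
    if directory != f"{team}-{model}":
        raise ValueError(f"{filename!r} does not belong in directory {directory!r}")
    return forecast_date, team, model
```

For example, parse_forecast_filename("2021-01-06-teamA-model_1.csv", "teamA-model_1") succeeds, while a hyphenated team name is rejected because the file name no longer splits into exactly one team and one model.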
The file must be a comma-separated value (csv) file with the following columns (in any order):
forecast_date
target
target_end_date
location
type
quantile
value
No additional columns are allowed.
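A quick header check can catch misspelled or extra columns before a file is sent. This is an illustrative sketch using only the Python standard library; the function name check_columns is hypothetical, and the required column set is copied from the list above.

```python
import csv
import io

REQUIRED_COLUMNS = {
    "forecast_date", "target", "target_end_date",
    "location", "type", "quantile", "value",
}


def check_columns(csv_text):
    """Return a list of problems found in the header row (empty if it is valid)."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    problems = []
    missing = REQUIRED_COLUMNS - set(header)
    extra = set(header) - REQUIRED_COLUMNS
    duplicated = {name for name in header if header.count(name) > 1}
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected columns: {sorted(extra)}")
    if duplicated:
        problems.append(f"duplicated columns: {sorted(duplicated)}")
    return problems
```

Because the columns may appear in any order, the check compares sets rather than positions.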
Each row in the file is a single quantile forecast for a specific location. See the template for an example.
Values in the forecast_date column must be a date in the format
YYYY-MM-DD
This is the date on which the forecasts were due to be submitted. Note that all due dates fall on Wednesdays. The forecast_date should match the date in the filename; it is redundant with the filename and is included here for internal completeness.
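The three forecast_date requirements (YYYY-MM-DD format, a Wednesday, and agreement with the filename) can be sketched as a small check. The function name check_forecast_date is hypothetical, and the sketch assumes the filename begins with the ten-character date. It also catches dates that a spreadsheet program has rewritten, such as 1/6/2021.

```python
from datetime import datetime


def check_forecast_date(value, filename):
    """Return a list of problems with one forecast_date value (empty if valid)."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return [f"{value!r} is not a date in YYYY-MM-DD format"]
    # strptime tolerates unpadded fields such as "2021-1-6", so compare
    # against the canonical zero-padded form as well.
    if parsed.isoformat() != value:
        return [f"{value!r} is not zero-padded as YYYY-MM-DD"]
    problems = []
    if parsed.weekday() != 2:  # Monday is 0, so Wednesday is 2
        problems.append(f"{value} is not a Wednesday")
    if filename[:10] != value:
        problems.append(f"{value} does not match the date in {filename!r}")
    return problems
```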
Values in the target column must be a character string of the form
"N wk ahead CLI pct" where N is a whole number from 1 to 4.
Values in the target_end_date column should be a date in the format
YYYY-MM-DD
This is the date corresponding to the Saturday of the MMWR week of the forecasted value.
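As an illustration, the target string and the target_end_date can be checked together. This is a sketch under the stated rules only: it confirms that the target has the expected form and that the end date is a Saturday (MMWR weeks run Sunday through Saturday), but it does not verify that the date is the correct number of weeks ahead. The function name check_target_row is hypothetical.

```python
import re
from datetime import datetime

TARGET_RE = re.compile(r"[1-4] wk ahead CLI pct")


def check_target_row(target, target_end_date):
    """Return a list of problems with one (target, target_end_date) pair."""
    problems = []
    if TARGET_RE.fullmatch(target) is None:
        problems.append(f"{target!r} is not of the form 'N wk ahead CLI pct'")
    try:
        end = datetime.strptime(target_end_date, "%Y-%m-%d").date()
    except ValueError:
        problems.append(f"{target_end_date!r} is not a date in YYYY-MM-DD format")
    else:
        if end.isoformat() != target_end_date:
            problems.append(f"{target_end_date!r} is not zero-padded as YYYY-MM-DD")
        elif end.weekday() != 5:  # Monday is 0, so Saturday is 5
            problems.append(f"{target_end_date} is not a Saturday")
    return problems
```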
Values in the location column are MHS market names. Each name should appear exactly as it does in the MARKET NAME (For Files) column of Table A.2 in the guidelines. The required location name format can also be found in the location file.
(Examples: National_Capital_Region, Lejeune_Cherry_Point)
Values in the type column should all be either the character string "point" or "quantile".
Values in the quantile column are either "NA" if type="point", or a quantile in the format 0.### if type="quantile". This value indicates the quantile for the value in this row.
Teams must provide the following 23 quantiles:
c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)
## [1] 0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500
## [13] 0.550 0.600 0.650 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990
Values in the value column are non-negative real numbers indicating the "point" or "quantile" prediction for this row.
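The type, quantile, and value rules can be combined into one check. The sketch below is illustrative only: check_quantile_rows is a hypothetical helper, and it assumes that the full set of 23 quantiles is expected for each location and target, so it should be called once per such group. REQUIRED_QUANTILES reproduces the R expression above in Python.

```python
import math

# Python equivalent of c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)
REQUIRED_QUANTILES = (
    [0.01, 0.025] + [round(0.05 * k, 2) for k in range(1, 20)] + [0.975, 0.99]
)


def check_quantile_rows(rows):
    """Check (type, quantile, value) string triples for one location and target."""
    required = {round(q, 3) for q in REQUIRED_QUANTILES}
    seen = set()
    problems = []
    for row_type, quantile, value in rows:
        if row_type == "point":
            if quantile != "NA":
                problems.append(f"point row has quantile {quantile!r}, expected 'NA'")
        elif row_type == "quantile":
            try:
                q = round(float(quantile), 3)
            except ValueError:
                problems.append(f"quantile {quantile!r} is not a number")
            else:
                if q in required:
                    seen.add(q)
                else:
                    problems.append(f"unexpected quantile {quantile!r}")
        else:
            problems.append(f"type {row_type!r} is neither 'point' nor 'quantile'")
        try:
            v = float(value)
        except ValueError:
            v = math.nan
        if not (math.isfinite(v) and v >= 0):
            problems.append(f"value {value!r} is not a non-negative real number")
    missing = sorted(required - seen)
    if missing:
        problems.append(f"missing quantiles: {missing}")
    return problems
```

Quantiles are rounded to three decimal places before comparison so that values such as 0.1 and 0.100 are treated as the same level.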
Contact [email protected] to make your initial forecast file submission. Metadata and optional license files can be emailed directly or added to this repository using a pull request (see instructions below).
To prepare for the initial submission, fork this repository and clone it to your computer or workstation. In the forked repository you created, make a subdirectory for your team in the data-forecasts/ folder following the subdirectory naming convention. This is where you will place your metadata and optional license files.
Use a pull request to create your submission. Open a pull request from your forked repository to the original repo. This will initiate merging your changes into the main repo. With the pull request, automatic validation checks on file format and content are run. More information on making a pull request can be found here.
When a pull request is open, you can add/modify files in the pull request by pushing changes from your forked repo. This will allow you to address any problems found during the validation checks. Automatic checks run after each push so you can check if you were able to resolve the problems listed.
Common reasons for a failed pull request include Excel changing the date format upon saving the .csv file and misspelled column headers or metadata keys. While pull requests through this repository will not be required for this collaboration, ensuring that these formatting features are correct will aid in forecast interpretation and ensembling.
Pull requests including metadata or license files will be merged as submitted. However, forecast files should be emailed directly to [email protected].
To ensure proper data formatting, automatic validations are run on all pull requests to data-forecasts/.
When a pull request is submitted, the data are validated through GitHub Actions, which runs the tests present in the validations repository. The intent of these tests is to validate the requirements above. Due to differences between the DOD-CLI forecasting collaboration and other forecasting challenges, forecast files for this collaboration will now be submitted by email to [email protected]. Questions and concerns can also be emailed directly.
In order to ensure that forecasting is done in real-time, all forecasts are requested to be emailed by the listed deadlines.