Add leaderboard component to ClimaLand's long runs #890
Draft: ph-kev wants to merge 3 commits into main from kp/leaderboard
@@ -0,0 +1,123 @@
# Leaderboard

## Long run

### Add a new variable to compare against observations
The code for computing errors against observations is contained in the `leaderboard` folder,
which holds two files: `data_sources.jl` and `leaderboard.jl`. Variables of interest are
loaded and preprocessed in `data_sources.jl`; errors are computed and plotted in
`leaderboard.jl`. To add a new variable, you ideally only need to modify `data_sources.jl`.

### Computation
As of now, the leaderboard produces bias plots annotated with the global bias and the global
root mean squared error (RMSE). These quantities are computed for each month. The first year
of the simulation is excluded because it is treated as spin-up time. The simulation starts in
2012, so only the year 2013 is compared against observational data. See the plots below for
what this looks like.

![bias_with_custom_mask_plot](./leaderboard/images/global_rmse_and_bias_graphs.png)
![gpp_bias_plot](./leaderboard/images/gpp_bias_plot.png)

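
As a rough illustration of what a global bias and a global RMSE mean on a longitude-latitude
grid, here is a minimal sketch in plain Julia. It assumes cosine-of-latitude weighting and
skips cells holding `NaN`. The helper name `global_bias_and_rmse` and the toy data are
hypothetical, and this is not the ClimaAnalysis implementation.

```julia
# Hypothetical helper for illustration; not the ClimaAnalysis implementation.
# `sim` and `obs` are (longitude x latitude) matrices on the same grid and
# `lats` holds the latitudes in degrees.
function global_bias_and_rmse(sim, obs, lats)
    # Weight each cell by cos(latitude) so that polar cells do not dominate
    weights = [cosd(lat) for _ in axes(sim, 1), lat in lats]
    # Skip cells where either field is NaN
    valid = .!(isnan.(sim) .| isnan.(obs))
    diff = sim .- obs
    total_weight = sum(weights[valid])
    bias = sum(weights[valid] .* diff[valid]) / total_weight
    rmse = sqrt(sum(weights[valid] .* diff[valid] .^ 2) / total_weight)
    return bias, rmse
end

# Toy data: two longitudes, two latitudes, one cell without data
lats = [0.0, 60.0]
sim = [2.0 4.0; 2.0 NaN]
obs = [1.0 1.0; 1.0 1.0]
bias, rmse = global_bias_and_rmse(sim, obs, lats)
```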
### Add a new variable to the bias plots
To add a new variable, you need to modify four dictionaries: `sim_var_dict`, `obs_var_dict`,
`mask_dict`, and `compare_vars_biases_plot_extrema`.

To add a variable to the leaderboard, add a key-value pair to the dictionary `sim_var_dict`
whose key is the short name of the variable and whose value is a function that returns an
[`OutputVar`](https://clima.github.io/ClimaAnalysis.jl/dev/var/). Any preprocessing, including
unit conversion and shifting the dates, is done in this function.

```julia
sim_var_dict["et"] =
    () -> begin
        # Load in variable
        sim_var = get(
            ClimaAnalysis.SimDir(diagnostics_folder_path),
            short_name = "et",
        )
        # Shift to the first day and subtract one month as preprocessing
        sim_var =
            ClimaAnalysis.shift_to_start_of_previous_month(sim_var)
        return sim_var
    end
```

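The date shift applied in the example above can be illustrated on a single date with the
`Dates` standard library. This is only a sketch of the convention (first day of the month,
minus one month); the helper name `shift_to_previous_month_start` is hypothetical and stands
in for what ClimaAnalysis does to every date in the time dimension.

```julia
using Dates

# Hypothetical stand-in for the date shift: move a date to the first day of
# its month, then go back one month.
shift_to_previous_month_start(date) =
    Dates.firstdayofmonth(date) - Dates.Month(1)

# The monthly average for January 2010 is saved on 2010-02-01
saved_date = DateTime(2010, 2, 1)
shifted_date = shift_to_previous_month_start(saved_date)

# The shift also crosses year boundaries
shifted_across_year = shift_to_previous_month_start(DateTime(2010, 1, 1))
```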
Then, add a key-value pair to the dictionary `obs_var_dict` whose key is the same short name
as before and whose value is a function that takes in a start date and returns an `OutputVar`.
Any preprocessing is done in this function.

```julia
obs_var_dict["et"] =
    (start_date) -> begin
        # We use ClimaArtifacts to use a dataset from ILAMB
        obs_var = ClimaAnalysis.OutputVar(
            ClimaLand.Artifacts.ilamb_dataset_path(;
                context = "evspsbl_MODIS_et_0.5x0.5.nc",
            ),
            "et",
            # start_date is used to align the dates in the observational data
            # with the simulation data
            new_start_date = start_date,
            # Shift dates to the first day of the month before aligning the dates
            shift_by = Dates.firstdayofmonth,
        )
        # Relabel the units so the string matches the simulation data. No values
        # are converted: "kg/m2/s" and "kg m^-2 s^-1" are the same units.
        ClimaAnalysis.units(obs_var) == "kg/m2/s" &&
            (obs_var = ClimaAnalysis.set_units(obs_var, "kg m^-2 s^-1"))
        # ClimaAnalysis cannot handle `missing` values, but does support handling NaNs
        obs_var = ClimaAnalysis.replace(obs_var, missing => NaN)
        return obs_var
    end
```

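To make the date alignment concrete, here is a minimal sketch of re-expressing times (stored
as seconds since one start date) relative to a new start date, after first shifting each date
to the first day of its month. The helper `rebase_times` and the toy dates are hypothetical;
the sketch only mirrors the idea described in the comments on `new_start_date` and `shift_by`
and is not the ClimaAnalysis implementation.

```julia
using Dates

# Hypothetical helper, not part of ClimaAnalysis: re-express `times` (seconds
# since `old_start`) as seconds since `new_start`, optionally shifting each
# date first.
function rebase_times(times, old_start, new_start; shift_by = identity)
    dates = [shift_by(old_start + Dates.Second(round(Int, t))) for t in times]
    # Subtracting two DateTimes gives milliseconds; convert to seconds
    return [Dates.value(date - new_start) / 1000 for date in dates]
end

# Toy observational times: 2000-01-01 and 2000-01-16
obs_times = [0.0, 15 * 86400.0]
new_times = rebase_times(
    obs_times,
    DateTime(2000, 1, 1),
    DateTime(1999, 12, 1);
    shift_by = Dates.firstdayofmonth,
)
```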
!!! tip "Preprocessing"
    Observational and simulation data should be preprocessed for dates and units. For
    simulation data, monthly averages are stored on the first day following the month.
    For instance, the monthly average for January 2010 is stored on the date 2/1/2010
    (February 1, 2010). Preprocessing shifts this date to 1/1/2010. When preprocessing
    data, we follow the convention that the first day of a month corresponds to the
    monthly average for that month. For observational data, you should check the
    convention being followed and preprocess the dates if necessary.

    For `obs_var_dict`, the anonymous function must take in a start date. The start date is
    used in `leaderboard.jl` to adjust the seconds in the `OutputVar` so that they are
    relative to the start date of the simulation data.

    Units should be the same between the simulation and observational data.

Next, add a key-value pair to the dictionary `mask_dict` whose key is the same short name
as before and whose value is a function that takes in an `OutputVar` representing simulation
data and an `OutputVar` representing observational data and returns a masking function, or
`nothing` if no masking function is needed. The masking function is used to correctly
normalize the global bias and global RMSE. See the example below, where a mask is made using
the observational data.

```julia
mask_dict["et"] =
    (sim_var, obs_var) -> begin
        return ClimaAnalysis.make_lonlat_mask(
            # We do this to get a `OutputVar` with only two dimensions:
            # longitude and latitude
            ClimaAnalysis.slice(
                obs_var,
                time = ClimaAnalysis.times(obs_var) |> first,
            );
            # Any values that are NaN should be 0.0
            set_to_val = isnan,
            true_val = 0.0
        )
    end
```

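Why the mask matters for normalization can be seen in a small sketch in plain Julia. It
assumes a mask of 0.0 where the observations are `NaN` and 1.0 elsewhere, and it averages
only over unmasked cells. The names `make_mask` and `masked_mean` are hypothetical and do not
come from ClimaAnalysis.

```julia
# Hypothetical sketch; not the ClimaAnalysis implementation.
# Build a 0/1 mask from one longitude-latitude slice of observational data.
make_mask(obs_slice) = [isnan(x) ? 0.0 : 1.0 for x in obs_slice]

# Average over unmasked cells only: divide by the sum of the mask rather than
# by the total number of cells.
function masked_mean(field, mask)
    total = 0.0
    for (value, m) in zip(field, mask)
        m == 0.0 && continue  # skip masked cells, which may hold NaN
        total += m * value
    end
    return total / sum(mask)
end

obs_slice = [1.0 NaN; 2.0 3.0]
bias_field = [0.5 NaN; 1.0 1.5]
mask = make_mask(obs_slice)
mean_bias = masked_mean(bias_field, mask)
```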
Finally, add a key-value pair to the dictionary `compare_vars_biases_plot_extrema` whose
key is the same short name as before and whose value is a tuple of two floats giving the
lower and upper limits of the range shown in the bias plots.

```julia
compare_vars_biases_plot_extrema = Dict(
    "et" => (-0.00001, 0.00001),
    "gpp" => (-8.0, 8.0),
    "lwu" => (-40.0, 40.0),
)
```
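Because a new variable needs an entry in all four dictionaries, a quick consistency check can
catch a forgotten one. The sketch below uses toy dictionaries with placeholder values; the
helper `missing_entries` and the `toy_` names are hypothetical and are not part of
`data_sources.jl`.

```julia
# Hypothetical helper: list the dictionaries that lack an entry for `short_name`.
# `dicts` is a vector of `name => dictionary` pairs.
function missing_entries(short_name, dicts)
    return [name for (name, dict) in dicts if !haskey(dict, short_name)]
end

# Toy stand-ins for the four dictionaries (values are placeholders)
toy_sim = Dict("et" => true, "gpp" => true)
toy_obs = Dict("et" => true, "gpp" => true)
toy_mask = Dict("et" => true)
toy_extrema = Dict("et" => (-0.00001, 0.00001), "gpp" => (-8.0, 8.0))

dicts = [
    "sim_var_dict" => toy_sim,
    "obs_var_dict" => toy_obs,
    "mask_dict" => toy_mask,
    "compare_vars_biases_plot_extrema" => toy_extrema,
]
missing_for_gpp = missing_entries("gpp", dicts)
missing_for_et = missing_entries("et", dicts)
```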
Maybe we could link to the ClimaAnalysis documentation explaining the formatting for units? If I understand correctly, this will carry out unit conversion, but that means the string must be in the format expected by ClimaAnalysis.

That part of the code only checks whether the units string is "kg/m2/s" and, if it is, sets the units to "kg m^-2 s^-1". The check is just there to make sure the units are set correctly.
There is no convention for the formatting of units unless one wants to use automatic unit conversion. We only do this because ClimaAnalysis can't tell that "kg/m2/s" (in the observational data) is the same as "kg m^-2 s^-1" (in the simulation data).