Create global time series Viewers #616
9b6ee2d
to
f6a7567
Compare
@chengzhuzhang @thorntonpe @BunnyVon Here's my initial implementation for #601. Thoughts/suggestions? The HTML is sparse, but effective. If the HTML needs to look more like the E3SM Diagnostics viewer, that would involve more significant, time-intense front-end work. The PDFs are multi-page, with one variable per page. The PNGs are one variable per PNG. The implementation method is for the user to set |
the visualization looks great to me and fits the purpose. Could you clarify a few questions I have below?
In theory, yes. I haven't tested with all of them. I know at our last meeting @thorntonpe mentioned he could provide a full list of 400+ 2D variables to work with.
I'm not sure. @chengzhuzhang do you know more about this?
That might be beyond the scope of |
yeah, I think you are right. the user should know what they want to plot :) please ignore that question. @forsyth2 @chengzhuzhang |
@BunnyVon I was reviewing my notes from our June 14 meeting. It looks like the highest priority item was just setting up a basic HTML page with links to single plots. That is accomplished by this pull request. The second highest priority item was to categorize those plots (@thorntonpe suggested Energy, Water, Carbon). But it looks like we never found a good way to automatically determine a plot's category (one suggestion was to try basing it on the variables' units). @chengzhuzhang recently suggested at least categorizing the plots so that the global, Northern hemisphere, and Southern hemisphere plots are grouped together per-variable. So, at this point, the question is what level of categorization we want for this pull request:
As for the front-end code, I'm going to try using the |
@forsyth2 I think the code change at this point effectively generates one file per variable. It is the first step toward creating the HTML page. As I mentioned in the EZ meeting yesterday, we can adopt For now, each variable has 3 plots: global mean, NH mean, and SH mean. Each component can be a group. For each group, the number of rows is the number of variables, and the time-series plots can have 3 columns (Global Average, NH Average, and SH Average), each linking to a corresponding plot. I'm thinking of something like the following:
Let me know if this makes sense, and happy to clarify. |
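The layout described above (one group per component, one row per variable, three region columns) could be sketched roughly as follows. This is only an illustration: `build_group_table`, the region keys, and the `<variable>_<region>.png` file naming are hypothetical and are not taken from the pull request.

```python
from html import escape
from typing import List

# Hypothetical region keys paired with the column titles suggested above.
REGION_COLUMNS = [("glb", "Global Average"), ("n", "NH Average"), ("s", "SH Average")]


def build_group_table(group: str, variables: List[str]) -> str:
    """Return an HTML table with one row per variable and one link per region."""
    header = "".join(f"<th>{escape(title)}</th>" for _, title in REGION_COLUMNS)
    rows = []
    for var in variables:
        # Assumed file naming: <variable>_<region>.png next to the HTML page.
        cells = "".join(
            f'<td><a href="{escape(var)}_{region}.png">plot</a></td>'
            for region, _ in REGION_COLUMNS
        )
        rows.append(f"<tr><td>{escape(var)}</td>{cells}</tr>")
    return (
        f"<h2>{escape(group)}</h2>\n"
        f"<table>\n<tr><th>Variable</th>{header}</tr>\n"
        + "\n".join(rows)
        + "\n</table>"
    )


# Placeholder variable names, for illustration only.
table_html = build_group_table("lnd", ["VAR_ONE", "VAR_TWO"])
```

Each component would get its own call, so the number of rows follows the number of variables requested for that component.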
@chengzhuzhang Yes, your notes here will help with implementing option 2 in my comment above. Thanks, I will get started on that. To fully implement the categories (option 3), though, I think we'll need the extra input I described from the Land team. |
@chengzhuzhang @tomvothecoder Does What was the solution for the |
I'm not able to glean much info from E3SM-Project/e3sm_diags#628. Is the EDIT: It's definitely a CDAT dependency: https://anaconda.org/cdat/output_viewer shows https://github.com/ESGF/output_viewer listed as a CDAT package. |
I'd also like to note this is another reason I realize for tech debt purposes this is unlikely to change, but I think we should still note how that tech debt negatively impacts the modularity of our codebase. |
e3sm_diags does not use the CDAT version of the Before: Now: |
The ESGF |
@forsyth2 , thank you for working on this and having it completed. |
@tomvothecoder Ok, so even though anaconda.org/cdat/output_viewer shows ESGF/output_viewer listed as a CDAT package, it does NOT introduce a CDAT dependency by including it? |
@BunnyVon Ah good idea to match up with the ILAMB categories. I'll try incorporating that. Thanks! |
@tomvothecoder Looking at E3SM-Project/e3sm_diags#628 > fixed by E3SM-Project/e3sm_diags#773 > |
anaconda.org/cdat is the Anaconda channel for CDAT. This does not necessarily mean all of the packages in that channel are "CDAT packages". The ESGF Note, the ESGF
It makes more sense to write an Alternatively, |
Note to self, mostly: using the 1a.
In particular:
1a-2a.
If we use 1a-2a-3a.
1a-2a-3a-4a. 1a-2a-3a.
1a-2a-3a-4b. 1a-2a-3a.
1a-2a-3a-4c. 1a-2a-3a.
1a-2a-3a-4d. 1a-2a.
1a-2a-3b.
So, we'd need some different path because clearly 1a-2a-3b.
which was described previously (1a-2a-3a-4c). I'll try out copying some of the |
27dd400
to
9249d1b
Compare
Visuals
Alright, the second commit (9249d1b) has produced a very basic viewer (with Accomplished:
Need to add/fix:
Directory structure
The produced directory structure:
Compare that with example output from
The directory structures actually appear to match up ok, however there is no |
9249d1b
to
cd5b100
Compare
Ok with the third commit (cd5b100), I am able to categorize the variables. I dislike the hard-coded variable categorization, but I suppose it will have to do if we have no other method. @BunnyVon How does this look? It doesn't have all the bells and whistles of E3SM Diags, but it is the same general display. I tried to categorize variables the best I could using ilamb.cfg, but I don't think I did it totally correctly. |
I think it looks great. what do you think, @thorntonpe? @forsyth2 , in terms of the variables, I am finalizing my preprocessing script with Nate Collier, which has the info for the conversion between ilamb variables and E3SM variables for the ilamb scorecard. It's soon to be completed, and I can share afterward. |
Awesome, thanks @BunnyVon! |
@forsyth2 and @BunnyVon - After a hectic few weeks I am now able to give this thread the attention it deserves. Thanks for your patience. The overall timeseries plotting capability looks excellent - the plots themselves have good graphic quality and having them all accessible through a single index page is exactly what we need. Thank you!
Thanks @thorntonpe! To your points:
Next steps with this pull request:
After this is merged: with the viewers looking how we want for one simulation, we'll follow an approach similar to Diags model-vs-model (add equivalent parameters for |
@czender We had some questions come up about how NCO calculates the global averages. The Land team would like to process some variables as "total" and others as "average". We're currently only getting the global averages.
@forsyth2 The averages that NCO produces from EAM simulations are true global averages. The "global" averages that NCO produces from ELM simulations are by definition global land-only averages. They are computed by summing each field (e.g., Responses to your questions:
FYI the latest NCO snapshot includes a new option to control the global/regional statistics that ncclimo computes for timeseries:
Feedback welcome. |
@czender thank you for having this implemented so quickly. I'm tagging land model developers @thorntonpe @BunnyVon and @darincomeau about this new |
Ok I have the Viewer categorized by the groupings in the csv:
Remaining To Do items are listed in my comments here.
@thorntonpe also noted that once a few “A” and “T” variables are working, the Land team would like to do some checks against outputs from their prior workflow.
zppy/templates/coupled_global.py
Outdated
# TODO: make viewer home page to point to multiple viewers
url = create_viewer(figstr, regions, "lnd", plots_lnd)
TODO: Right now, the viewer only exists for land variables. We'll need to create a viewer for each component (e.g., atm, ice, ocn).
- Design decision: In the absence of corresponding variable definition csv files for those components, I can just put all the component's variables under one group.
- To make a home page of multiple viewers, we'll have to more closely emulate the E3SM Diags home page (e.g., https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/viewer/main.py, https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/viewer/index_template.html)
(Again, a note on design and tech debt: this is very much outside the scope of zppy as a post-processing workflow coordinator. At this point, Global Time Series is for all intents and purposes a plotting package fully contained within `zppy` due to legacy reasons.)
zppy/templates/coupled_global.py
Outdated
# the name of the ELM variable on the monthly h0 history file
self.variable_name = csv_row[0]
# “A” or “T” for global average over land area or global total, respectively
self.metric = csv_row[1]
TODO: Actually use `A` or `T`.
Going by @czender's comments #616 (comment) and #616 (comment), it seems like the logic change would belong more in `ts.py` than `coupled_global.py`. That would mean `ts.py` would also have to parse the csv.
In any case, at some point, a decision on what to plot (`A` or `T`) must be made based on this csv. Currently, the csv parsing is done very late, after the plots themselves have been generated. This parsing will need to be moved much earlier in order to propagate the `A/T` information before it is needed.
zppy/templates/coupled_global.py
Outdated
# “A” or “T” for global average over land area or global total, respectively
self.metric = csv_row[1]
# the factor that should convert from original units to final units, after standard processing with nco
self.scale_factor = csv_row[2]
TODO: design decision -- should this be done at the `ts` task, or at the `global_time_series` task? I.e., is it only relevant to plotting?
zppy/templates/coupled_global.py
Outdated
# test string for the units as given on the history file (included here for possible testing)
self.original_units = csv_row[3]
# the units that should be reported in time series plots, based on A/T and Scale Factor
self.final_units = csv_row[4]
TODO: Again, the csv parsing will need to be moved earlier on to before the plots are generated (rather than right before they're organized into a viewer)
zppy/templates/coupled_global.py
Outdated
# a name used to cluster variables together, to be separated in groups within the output web pages
self.group = csv_row[5]
# Descriptive text to add to the plot page to help users identify the variable
self.long_name = csv_row[6]
TODO: Again, the csv parsing will need to be moved earlier on to before the plots are generated (rather than right before they're organized into a viewer)
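One way to move the csv parsing earlier is to read the whole file into typed records up front and group them before any plotting happens. The sketch below assumes the seven-column order shown in the review snippets above (variable name, `A`/`T`, scale factor, original units, final units, group, long name) and assumes there is no header row. `VariableDefinition` and `parse_variable_definitions` are hypothetical names, and the sample rows are placeholders rather than real ELM variables.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Metric(Enum):
    AVERAGE = "A"  # global average over land area
    TOTAL = "T"  # global total


@dataclass
class VariableDefinition:
    variable_name: str
    metric: Metric
    scale_factor: float
    original_units: str
    final_units: str
    group: str
    long_name: str


def parse_variable_definitions(csv_text: str) -> Dict[str, List[VariableDefinition]]:
    """Parse csv text and return the variable definitions keyed by group."""
    groups: Dict[str, List[VariableDefinition]] = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and comment lines (assumed convention).
        if not row or row[0].startswith("#"):
            continue
        row = [cell.strip() for cell in row]
        definition = VariableDefinition(
            variable_name=row[0],
            metric=Metric(row[1]),  # raises ValueError for anything but A or T
            scale_factor=float(row[2]),
            original_units=row[3],
            final_units=row[4],
            group=row[5],
            long_name=row[6],
        )
        groups[definition.group].append(definition)
    return dict(groups)


# Placeholder rows; the real csv from the Land team may differ.
example = """\
VAR_ONE,A,1.0,W/m^2,W/m^2,Energy,Example averaged variable
VAR_TWO,T,2.0,gC/m^2/s,PgC/yr,Carbon,Example totaled variable
"""
definitions = parse_variable_definitions(example)
```

With the records available before plotting, both the `A`/`T` choice and the viewer grouping can read from the same parsed structure.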
9353deb
to
3416d4c
Compare
75c9757
to
8ef71d5
Compare
8ef71d5
to
8fdb945
Compare
Latest status update
# Run this test suite in the environment the global_time_series task runs in.
# I.e., whatever `environment_commands` is set to for `[global_time_series]`
# NOT the zppy dev environment.
# Run: python -u -m unittest tests/global_time_series/test_coupled_global.py
These unit tests are specific to `coupled_global.py`. I tried to add as many unit tests as I could, but the many functions that rely on IO (notably reading data or plotting data) can't be well tested outside of integration testing/the "min-case" tests.
Importantly, these unit tests can't be run like the other unit tests. They need to be run from the environment in which `zppy` runs the `global_time_series` task. That is, these tests likely need to be run from E3SM Unified rather than the `zppy` dev environment. To facilitate testing, I also just moved the contents of `readTS.py` into `coupled_global.py`.
(Note: this testing setup again highlights how we need to be wary of scope creep in `zppy`. Here we're really testing the de facto Global Time Series package, not so much `zppy` itself. See #398)
@@ -0,0 +1,46 @@
[default]
This is the min-case `cfg` template I'm using for testing this PR.
@@ -49,13 +50,18 @@ def global_time_series(config, script_dir, existing_bundles, job_ids_file):
 c["global_time_series_dir"] = os.path.join(script_dir, f"{prefix}_dir")
 if not os.path.exists(c["global_time_series_dir"]):
     os.mkdir(c["global_time_series_dir"])
-scripts = ["coupled_global.py", "readTS.py", "ocean_month.py"]
+scripts = ["coupled_global.py", "ocean_month.py"]
Contents of `readTS.py` are now in `coupled_global.py`, as mentioned in an earlier comment.
if (metric == Metric.AVERAGE) or (metric == Metric.TOTAL):
    annual_average_dataset_for_var: xarray.core.dataset.Dataset = (
        self.f.temporal.group_average(var, "year")
    )
    data_array = annual_average_dataset_for_var.data_vars[var]
# elif metric == Metric.TOTAL:
#     # TODO: Implement this!
#     raise NotImplementedError()
@tomvothecoder @chengzhuzhang I will look into this further, but in case one of you has an immediate answer:
We currently use `group_average` from xCDAT to calculate the global annual average for variables. The Land team, however, needs to compute averages of some variables and totals for others. It looks like we don't need to modify the `ts` task then; we just need to modify the calculation here. What I'm not sure of is what calculation to use. Multiply each year's average by a constant (e.g., `landfrac * surface area`)? Use a different xCDAT function?
@forsyth2 I believe the formula to compute the global sum from the global average is to multiply the global average by a scalar: the sum of area*landfrac. Maybe you could check whether `area` and `landfrac` are available in the `.nc` file generated by the `ts` global task.
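A minimal sketch of that conversion, using plain NumPy arrays in place of the real `area` and `landfrac` fields. The function name and the sample numbers are hypothetical, and unit handling (for example, whether `area` is in km^2 or m^2, plus any scale factor from the csv) is left out and would need to be checked against the actual files.

```python
import numpy as np


def global_total_from_average(global_average, area, landfrac):
    """Convert land-area-weighted global averages into global totals.

    total = average * sum(area * landfrac), where the sum runs over grid cells.
    """
    land_area = np.sum(np.asarray(area) * np.asarray(landfrac))
    return np.asarray(global_average) * land_area


# Made-up three-cell grid and two annual averages, for illustration only.
area = np.array([100.0, 200.0, 300.0])
landfrac = np.array([1.0, 0.5, 0.0])
annual_averages = np.array([2.0, 3.0])

totals = global_total_from_average(annual_averages, area, landfrac)
```

Because the land area is a single scalar, the same multiplication applies to every year in the annual-average series.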
# new_url = f"viewer_{component}"
# # shutil.move("viewer", new_url)
# distutils.dir_util.copy_tree("viewer", new_url)
Any time I try to move/copy the land viewer over to a different dir (e.g., `viewer_lnd`), the formatting goes bad. I'm assuming some formatting file it's referencing isn't also getting copied over.
What this means though is that I can't seem to properly make different viewer directories for different components.
url = create_viewer(parameters, vars, component)
print(url)
title_and_url_list.append((component, url))
# index_url: str = create_viewer_index(parameters.case_dir, title_and_url_list)
@chengzhuzhang I will work on this more, but I'm having some difficulties with setting up multiple viewers (see earlier comment) and then getting them to show up on an index page together. I think I just need to match up file paths and URLs correctly.
The problem is `viewer` seems to be used everywhere, so it's difficult to differentiate between viewers. For instance, in https://github.com/E3SM-Project/e3sm_diags/blob/7b054251e54f223e44ec1e15908548e8e1797744/e3sm_diags/viewer/index_template.html, the code block
<link href="viewer/css/bootstrap.min.css" rel="stylesheet" type="text/css"/>
<link href="viewer/css/viewer.css" rel="stylesheet" type="text/css"/>
<script src="viewer/js/jquery-2.2.3.min.js" type="text/javascript">
</script>
<script src="viewer/js/bootstrap.min.js" type="text/javascript">
</script>
<script src="viewer/js/viewer.js" type="text/javascript">
</script>
implies the existence of a `viewer` directory above all the other viewers.
Again, I'll look into this more myself, but I am wondering if we could get the multi-viewer index working quite quickly if we just pair programmed it together.
I'm not sure I understand the issues here. Is the idea to create a main viewer and have each component viewer link to it? As a first step, does it work to create each component viewer under its own directory?
What I'm aiming for:
Something like e3sm_diags where there's an index page linking to multiple viewers:
I want a viewer for each component.
What I currently have:
My latest results are just the land plots:
The parent directory has a subdir `viewer/`.
If I try to create multiple viewers by having two subdirs `viewer_lnd` and `viewer_atm`, then the formatting goes off, as described in #616 (comment). This can be seen here:
and here:
Actually, `atm` is even worse; the plots don't seem to get linked from the `atm` viewer.
@tomvothecoder The issues I mentioned with the Viewer generation are described in detail above.
It seems like you're on the right track by generating one viewer sub-directory per component. And as you mentioned, there might be issues with relative links. It looks like each viewer's HTML is not referencing the `/viewer/css` and `/viewer/js` files, resulting in messed up formatting. I'm guessing the same issue is happening with the links to the `atm` plots.
For context, this is how the E3SM diagnostics viewer is structured and operates.
Directory structure
/viewer -- root dir
  /viewer -- main viewer dir
    /css -- CSS dir, shared across each set viewer
    /js -- JavaScript dir, shared across each set viewer
  index.html -- main index page that links to each set viewer (e.g., `lat_lon`, `polar`)
  /lat_lon
    index.html -- lat lon viewer index page
    /aod_550 -- variable
      /aoddust-global-macv2
        ann.html
        jja.html
  /polar
    ...
- Generate base files:
  - `/viewer/viewer` dir that contains `/css` and `/js` for CSS and JavaScript files
  - `index.html`, which is the index page you linked above. This file provides links to each set's viewer `index.html`, example below.
<!--
A template used to create the index.
This index has links to all of the separate viewers.
-->
<!DOCTYPE html>
<html>
<head>
<title>
E3SM Diagnostics
</title>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<meta charset="utf-8"/>
<link href="viewer/css/bootstrap.min.css" rel="stylesheet" type="text/css"/>
<link href="viewer/css/viewer.css" rel="stylesheet" type="text/css"/>
<script src="viewer/js/jquery-2.2.3.min.js" type="text/javascript">
</script>
<script src="viewer/js/bootstrap.min.js" type="text/javascript">
</script>
<script src="viewer/js/viewer.js" type="text/javascript">
</script>
</head>
<body>
<div id="e3sm-header" style="background-color:#dbe6c5; float:left; width:45%">
<p style="margin-left:5em">
<b>
E3SM Diagnostics Package v2.12.0
<br/>
</b>
Test: e3sm_v2
<br/>
Reference: Observation and Reanalysis
<br/>
Created: 2024-09-26 16:58:04
</p>
</div>
<div id="e3sm-header2" style="background-color:#dbe6c5; float:right; width:55%">
<img alt="logo" src="./viewer/e3sm_logo.png" style="width:201px; height:91px; background-color:#dbe6c5"/>
</div>
<div class="container">
<div class="row">
<div class="col-sm-5 col-sm-offset-1">
<table class="table">
<!--The rows get inserted here.-->
<tr>
<td>
<a href="lat_lon/index.html">
Latitude-Longitude contour maps
</a>
</td>
<td>
<a href="table/index.html">
Table
</a>
</td>
<td>
<a href="taylor/index.html">
Taylor Diagram
</a>
</td>
<td>
<a href="cmip6/index.html">
CMIP6 Comparison
</a>
</td>
</tr>
<tr>
<td>
<a href="zonal_mean_xy/index.html">
Zonal mean line plots
</a>
</td>
</tr>
<tr>
<td>
<a href="zonal_mean_2d/index.html">
Pressure-Latitude zonal mean contour plots
</a>
</td>
</tr>
<tr>
<td>
<a href="zonal_mean_2d_stratosphere/index.html">
Pressure-Latitude zonal mean contour plots (Stratosphere)
</a>
</td>
</tr>
<tr>
<td>
<a href="polar/index.html">
Polar contour maps
</a>
</td>
</tr>
<tr>
<td>
<a href="cosp_histogram/index.html">
CloudTopHeight-Tau joint histograms
</a>
</td>
</tr>
<tr>
<td>
<a href="meridional_mean_2d/index.html">
Pressure-Longitude meridional mean contour plots
</a>
</td>
</tr>
<tr>
<td>
<a href="annual_cycle_zonal_mean/index.html">
Annual Cycle Zonal Mean Contour Plots
</a>
</td>
</tr>
<tr>
<td>
<a href="enso_diags/index.html">
ENSO Diagnostics
</a>
</td>
</tr>
<tr>
<td>
<a href="qbo/index.html">
Quasi-biennial Oscillation
</a>
</td>
</tr>
<tr>
<td>
<a href="area_mean_time_series/index.html">
Area Mean Time Series
</a>
</td>
</tr>
<tr>
<td>
<a href="diurnal_cycle/index.html">
Diurnal cycle phase maps
</a>
</td>
</tr>
<tr>
<td>
<a href="streamflow/index.html">
Streamflow
</a>
</td>
</tr>
<tr>
<td>
<a href="arm_diags/index.html">
Diagnostics at ARM stations
</a>
</td>
</tr>
<tr>
<td>
<a href="tc_analysis/index.html">
Diagnostics for Tropical Cyclones
</a>
</td>
</tr>
<tr>
<td>
<a href="aerosol_aeronet/index.html">
Aerosol Diags at AERONET sites
</a>
</td>
</tr>
<tr>
<td>
<a href="aerosol/index.html">
Aerosol Budget Tables
</a>
</td>
</tr>
<tr>
<td>
<a href="mp_partition/index.html">
Mixed-phase Partition
</a>
</td>
</tr>
<tr>
<td>
<a href="../prov">
Provenance
</a>
</td>
</tr>
</table>
</div>
<div class="col-sm-5">
<div class="img_links">
</div>
</div>
</div>
</div>
</body>
</html>
- Generate a viewer for each set and store it in a sub-directory (e.g., `/viewer/lat_lon`, `/viewer/polar`)
  - Contains an `index.html`
  - Contains sub-directories with `.html` for each page
  - These HTML files reference the base `/viewer/viewer` CSS and JS files
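The formatting problem with per-component viewers comes down to how each page's relative links resolve against the shared CSS and JS directory. A small sketch of computing those links with `os.path.relpath` follows. `relative_asset_href` is a hypothetical helper, and the `lnd` sub-directory is an assumed per-component layout mirroring the structure above, not something the e3sm_diags viewer provides.

```python
import os


def relative_asset_href(page_path: str, asset_path: str) -> str:
    """Return the href a page should use to reach a shared asset."""
    page_dir = os.path.dirname(page_path)
    # Use forward slashes so the result is a valid URL path on any platform.
    return os.path.relpath(asset_path, start=page_dir).replace(os.sep, "/")


root = "viewer"
# Shared stylesheet in the main viewer dir, as in the structure above.
shared_css = os.path.join(root, "viewer", "css", "viewer.css")
main_index = os.path.join(root, "index.html")
# Assumed per-component viewer page for the land component.
lnd_index = os.path.join(root, "lnd", "index.html")

main_href = relative_asset_href(main_index, shared_css)
lnd_href = relative_asset_href(lnd_index, shared_css)
```

Here `main_href` matches the `viewer/css/viewer.css` link in the index template, while the component page needs a leading `../` to reach the same file. That difference is one plausible reason a copied viewer directory loses its styling.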
for rgn in parameters.regions:
    run(parameters, requested_variables, rgn)
plots_per_page = parameters.nrows * parameters.ncols
# TODO: Is this how we want to determine when to make a viewer or should we have a `make_viewer` parameter in the cfg?
Design decision question: how do we want the user to specify they want a Viewer?
- Always make a viewer
- Make a viewer if they say they want 1 row, 1 column summary pages (i.e., single plot pages) and otherwise construct the `nrows x ncols` summary PDFs as before?
- Have an explicit parameter in the `cfg`: `make_viewer = True`
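The second and third options can be combined, as in the sketch below: an explicit `make_viewer` setting wins when present, and otherwise the single-plot-per-page rule applies. `should_make_viewer` is a hypothetical helper, not part of zppy, and treating an unset `make_viewer` as `None` is an assumption about how the cfg value would be passed in.

```python
from typing import Optional


def should_make_viewer(nrows: int, ncols: int, make_viewer: Optional[bool] = None) -> bool:
    """Decide whether to build a viewer for the requested page layout."""
    if make_viewer is not None:
        # An explicit cfg setting overrides the layout-based default.
        return make_viewer
    # Default: build a viewer only for single-plot pages (1 row x 1 column).
    plots_per_page = nrows * ncols
    return plots_per_page == 1
```

For example, `should_make_viewer(1, 1)` returns `True`, `should_make_viewer(2, 2)` returns `False`, and `should_make_viewer(2, 2, make_viewer=True)` returns `True`.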
Closing in favor of E3SM-Project/zppy-interfaces#9 and #648 |
Allow single variable global time series plots. Resolves #601.