Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workflow for ZPPY- PCMDI for E3SM diagnostics (with fixes on issues found in #624) #640

Closed
wants to merge 30 commits into from

Conversation

zhangshixuan1987
Copy link

Issue resolution

  • Closes #<ISSUE_NUMBER_HERE>

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

1. Does this do what we want it to do?

Objectives:

  • Objective 1
  • Objective 2
  • ...
  • Objective n

Required:

  • Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
  • Testing: I have added or modified at least one "min-case" configuration file to test this change. Every objective above is represented in at least one cfg.
  • Testing: I have considered likely and/or severe edge cases and have included them in testing.

If applicable:

  • Testing: this pull request introduces an important feature or bug fix that we must test often. I have updated the weekly-test configuration files, not just a "min-case" one.
  • Testing: this pull request adds at least one new possible parameter to the cfg. I have tested using this parameter with and without any other parameter that may interact with it.

2. Are the implementation details accurate & efficient?

Required:

  • Logic: I have visually inspected the entire pull request myself.
  • Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

  • Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zppy/conda, not just an import statement.

3. Is this well documented?

Required:

  • Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.

4. Is this code clean?

Required:

  • Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
  • Pre-commit checks: All the pre-commits checks have passed.

If applicable:

  • Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

zhangshixuan1987 and others added 26 commits September 16, 2024 17:55
The modification attemts to interpolate the 3D model level data to
standard pressure level specified by vrt_remap_plev19.nc. This is
needed by e3sm_to_cmip to convert these variables into cmip type
Develop a script pcmdi_diags.py along with pcmdi_diags.bash to
run the pcmdi diagnostics using the E3SM model simulations
* Update mypy python version

* Update conda python version
* Update pre-commit dependencies

* Update pre-commit deps in `conda/dev.yml`

* Fix `black` formatting errors

* Fix `flake8` programming errors
@zhangshixuan1987
Copy link
Author

@chengzhuzhang @forsyth2 : Hi Jill and Ryan, I closed the #624 as some of the fixes are made on the issues pointed our by you. I open this new pull request for the continuing discussion.

@chengzhuzhang
Copy link
Collaborator

@zhangshixuan1987

From the original PR#624, I ran into an env problem that suggests I can't access your build of pcmdi_metrics_master conda environment. All other tasks like ts ran.

EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master

However In the new PR I ran into an error much ealier, right after issuing the zppy command line, as below:

(zppy_dev_24) [ac.zhang40@chrlogin2 test_zppy]$ zppy -c post.v3.LR.historical_pmp.cfg 
Traceback (most recent call last):
  File "/home/ac.zhang40/y/envs/zppy_dev_24/bin/zppy", line 5, in <module>
    from zppy.__main__ import main
  File "/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/__main__.py", line 18, in <module>
    from zppy.pcmdi_diags import pcmdi_diags
  File "/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/pcmdi_diags.py", line 8, in <module>
    from zppy.utils import (
ImportError: cannot import name 'checkStatus' from 'zppy.utils' (/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/utils.py)

I think the error is caused by the conflicting zppy/__main__.py

In any case, I think i now have a better understand of your implementation, after testing and reviewing the results. And perhaps we can have a chat sometime next week about design details.

@zhangshixuan1987
Copy link
Author

@chengzhuzhang: Hi Jill, when I submitted this new PR, I did a code rebase to make sure that my code is consistent with the zppy master branch. However, I just found that a certain amount of the changes have been made in zppy-master branch. Due to these changes, I need to make modifications to my pcmdi workflow. Overall, please hold, and I may need to resubmit a PR again. I apologize for the inconvenience.

@chengzhuzhang
Copy link
Collaborator

@zhangshixuan1987 Oh, no worries! I didn't realize it is still a draft. Thanks for the heads-up. I'm happy to test again when it is ready. No hurry.

@forsyth2
Copy link
Collaborator

forsyth2 commented Nov 1, 2024

we can have a chat sometime next week about design details.

Due to these changes, I need to make modifications to my pcmdi workflow. Overall, please hold, and I may need to resubmit a PR again. I apologize for the inconvenience.

I've unfortunately been busy with other priorities, but I've been trying to follow this work at a high level. I would like to again raise my concern about repeating design decision mistakes that have caused issues in the global_time_series task. It may be a good idea to discuss design further before proceeding with more work.

However, I just found that a certain amount of the changes have been made in zppy-master branch.

Yes, I mentioned in #624 (comment) that #628 made some changes to clean up the zppy code base. That's probably the biggest change.

@forsyth2
Copy link
Collaborator

forsyth2 commented Nov 1, 2024

I'm hesitant to make too many comments without fully reviewing and running the latest code myself, but my specific design decision concerns at the moment are the following. Perhaps I just need more context though, which we can discuss in a meeting.

I agree strongly that it would be much easier to navigate the results with a viewer page!

It appears we are once again adding a de facto analysis package inside zppy. #616 is really turning the global_time_series task into that -- a piece of software with scope and environment distinct from zppy that is however fundamentally integrated into zppy. This has caused two major issues:

  1. It is hard to test global_time_series. Create global time series Viewers #616 is adding unit tests for the global_time_series. I have to run those tests from a different environment than the zppy dev environment because that code runs in the environment that the global_time_series task is running in, not the environment zppy is running in. This convolutes an already complicated testing scheme -- e.g., see zppy weekly testing log #634 for the process required to do weekly integration testing.
  2. The environment issue segues into the second major issue: the global_time_series code has different dependencies than the zppy code. This complicated @xylar's question of what code exactly had numpy dependencies (global_time_series needs numpy, but zppy does not). Furthermore, this means there isn't really a dev environment for global_time_series. And I suspect a similar issue here is responsible for the EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master error mentioned in Workflow for ZPPY- PCMDI for E3SM diagnostics (with fixes on issues found in #624)  #640 (comment).

global_time_series developed these issues over time as it grew from a small task into a complex pseudo-package. We need to discuss the design decisions for implementing the PMP task, before repeating known issues. zppy is already a complex piece of code requiring python > bash interaction. These built-in analysis packages add a third layer of complexity: python > bash > python.

@chengzhuzhang
Copy link
Collaborator

I've unfortunately been busy with other priorities, but I've been trying to follow this work at a high level. I would like to again raise my concern about repeating design decision mistakes that have caused issues in the global_time_series task. It may be a good idea to discuss design further before proceeding with more work.

I'm still not convinced that we should consider a stand-alone package for the global_time_series, mostly because the potential maintenance overhead. Also the global_time_series capability is very specific about its input: the global average files are generated with a relatively recent new NCO capability. It doesn't feel like we need a new package for just visualization global time-series and to aggregate the results.

My view of zppy is that it not only provide interface to connect multiple diagnostics tools, it also closes any gaps that these packages won't cover but E3SM requires. In the case of zppy-PMP workflow, there is missing pieces from PMP, including decoupled plotting capability and no viewer is generated. It feels that it is inevitable that we need to use zppy to close such gaps.

Another thing i would note is that there are several scripts use cdms or other CDAT utilities in this PR, which we should migrate away .

@xylar
Copy link
Contributor

xylar commented Nov 1, 2024

My view of zppy is that it not only provide interface to connect multiple diagnostics tools, it also closes any gaps that these packages won't cover but E3SM requires.

If that is going to continue to be the purview of zppy. I think there needs to be a yaml file or something similar that defines the dependencies and their constraints for each tool like global_time_series that has unique dependencies. I need somewhere I can go to know what packages E3SM-Unified needs to have to support not just zppy, the workflow management tool, but its pseudo-sub-packages.

@chengzhuzhang
Copy link
Collaborator

@xylar yes, dependencies such as numpy, and the upcoming outputviewer, should be explicitly listed in a yaml file.

@rljacob
Copy link
Member

rljacob commented Nov 1, 2024

I think Ryan's point is that global_time_series is currently poorly maintained because its buried inside zppy. The cost of making it a separate package is offset by the benefit of better testing.

@xylar
Copy link
Contributor

xylar commented Nov 1, 2024

I agree with both @rljacob and @forsyth2, I don't think it's great if zppy becomes a kitchen-sink package. The extra maintenance is already there and breaking our distinct packages should provide transparency for users and developers. I realized we're short on staff. But bundling packages together doesn't really solve that problem, it just obscures it.

In general I think zppy should return to being a workflow manager and not try to also be the tool that adapts all other tools for E3SM's needs. That's really messy to try to bundle into a single package.

@chengzhuzhang
Copy link
Collaborator

hmmm, is it possible to make the global_time_series capability more modular and testable in zppy before considering packaging it as a stand-alone package? I don't know if such a package is useful beyond zppy.

@zhangshixuan1987
Copy link
Author

zhangshixuan1987 commented Nov 2, 2024

@chengzhuzhang @xylar @forsyth2: Hi all, I may not fully understand all your discussions here. The EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master is because the pcmdi_metrics_master is a customized environment used for the development of zppy-pcmdi workflow for the following reasons:

  1. the PCMDI within the e3sm-unified environment is significantly outdated, and significant changes including the data structure etc. have been made in the latest version of the PCMDI package (https://github.com/PCMDI/pcmdi_metrics). Therefore, it is not valid to me if I develop the zppy-pcmdi workflow with respect to the outdated PCMDI library in the current e3sm-unified environment.
  2. at the beginning of this zppy-pcmdi workflow, we believe that the update of PCMI within the e3sm-unified environment is not urgent and needed. Therefore, I built a PCMDI library in my local machine for the work.

To me, The above issue will not be a problem anymore once the PCMDI is updated in the e3sm-unified environment.

One step further, I currently used the ZPPY preferred method to load the conda environment pcmdi_metrics_master by adding following lines in the ".cfg" file:

  • environment_commands = "source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh; conda activate pcmdi_metrics_master"

I think that @chengzhuzhang has a problem calling "conda activate pcmdi_metrics_master" as there is no such coda environment installed in her local machine. However, this is not because there is a module called "pcmdi_metrics_master" within any Python scripts of the zppy-pcmid workflow. My point here is that the above method to activate the customized PCMDI library will not cause any issues within zppy-pcmdi workflow.

@xylar
Copy link
Contributor

xylar commented Nov 2, 2024

@zhangshixuan1987, you are right. We are hijacking your PR to have what has meandered into a very important discussion but one that is not related.

@xylar
Copy link
Contributor

xylar commented Nov 2, 2024

Let's move the broader discussion to #641 and keep this PR focused on its original purpose.

There is still, of course, room to discuss whether this work should be done outside of zppy for similar reasons to what moves over to that discussion.

@chengzhuzhang
Copy link
Collaborator

chengzhuzhang commented Nov 2, 2024

@zhangshixuan1987 Sorry for this distraction. I'm grateful that Xylar created a separate discussion item.

For the pcmdi_diags_master the environment. It is possible to source someone's locally built conda env through commands like:
source /home/ac.forsyth2/miniconda3/etc/profile.d/conda.sh; conda activate e3sm_diags_20241015
This is how I source @forsyth2 's locally built e3sm_diags environment. I think we can do something similar for using your built pmp environment: first to source your conda path, and then activate the env.

@zhangshixuan1987
Copy link
Author

@chengzhuzhang @forsyth2: I am not intended to interrupt the discussions. I think all discussions here are very useful to me as well. I am also very interested in the further discussions at the #641.

@zhangshixuan1987
Copy link
Author

zhangshixuan1987 commented Nov 2, 2024

@chengzhuzhang : Hi Jill, I have checked and made modifications to the PCMDI workflow. The test on my side indicates that the workflow can work as expected.

Below are the example results for the E3SM AMIP simulation run with revised workflow:

An updated ".cfg" file is attached here for your test:

Also, I think that my built "pcmdi_metrics_master" environment can not be sourced as like @xylar's likely because I did not install the miniconda environment by myself, rather, I relied on the "e3sm_unified" environment on Chrysalis. Therefore, to make sure that you can test the package successfully, I converted my package to a ".yml" file:

  • pcmdi_metrics_master.yml
    I think you can use this file to build an equivalent environment for test purposes (I apologize for the extra work on your side).

Finally, I am trying to work on a viewer function following the one in E3SM diagnostics, I will let you know if I can quickly figure this out. Please let me know if you could be available for a meeting, and I would like to discuss with you for the strategy and the requirements that I should follow.

@zhangshixuan1987
Copy link
Author

I close this pull request as it is outdated, and we have new pull request #647 generated for this work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants