Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve input validation and testing #628

Merged
merged 1 commit into from
Oct 12, 2024
Merged

Conversation

forsyth2
Copy link
Collaborator

@forsyth2 forsyth2 commented Oct 9, 2024

Issue resolution

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

1. Does this do what we want it to do?

Objectives:
Adding "defensive programming" techniques to catch bad input early.

Functional code:

  • Added guess_path_parameters & guess_section_parameters parameters, defaulting to True, to keep backwards compatibility. Setting to False allows for more transparency of what paths are being used.
  • Added required parameter checks for certain diags sets, based on default.ini notes. Also added climo_diurnal_frequency check.
  • Added dependencies to .settings files
  • Added fail_on_dependency_skip parameter, defaulting to False, to keep backwards compatibility. Setting to True stops zppy if any job launch fails.
  • Refactored code to be more modular. This is a great improvement to both readability & testing. Previously, much of zppy had been written in monolithic functions that made testing outside of full integration testing difficult. Now, real unit testing (as in, independent of the outside world) is more feasible.
  • Added exit to tc_analysis if the cyclone_stitch dat file is empty.

Testing:

  • Run all e3sm_diags sets on v2 and v3 comprehensive runs
  • Changed reference years for e3sm_diags in v2 to start at 1870
  • Added daily ts subtask to support the tropical_subseasonal e3sm_diags set.
  • Remove simple_image_name so that all image check failures actually show up in the web's diff directory. Resolves Clarify simple_image_name #358.
  • Added unit tests for the newly modular code.
  • Updated expected results for integration tests.

Required:

  • Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
  • As zppy's lead developer, I am the stakeholder these objectives are most relevant for.
  • Testing: I have added or modified at least one "min-case" configuration file to test this change. Every objective above is represented in at least one cfg.
    • These are not necessary since the purpose of this PR is to fix the comprehensive Weekly integration testing & add more unit tests.
  • Testing: I have considered likely and/or severe edge cases and have included them in testing.
    • I have attempted exhaustive unit testing of functions that don't interact with the outside world (i.e., functions that wouldn't need to be wrapped in the IO monad were the code written in Haskell).

If applicable:

  • Testing: this pull request introduces an important feature or bug fix that we must test often. I have updated the weekly-test configuration files, not just a "min-case" one.
  • Testing: this pull request adds at least one new possible parameter to the cfg. I have tested using this parameter with and without any other parameter that may interact with it.
    • fail_on_dependency_skip (False on v2,v3 tests; True on bundles tests) , guess_path_parameters, guess_section_parameters (False on v3 tests; True on v2, bundles tests)

2. Are the implementation details accurate & efficient?

Required:

  • Logic: I have visually inspected the entire pull request myself.
  • Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

  • Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zppy/conda, not just an import statement.
    • No new dependencies

3. Is this well documented?

Required:

  • Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.
    • The bulk of zppy documentation is through reading parameter comments on default.ini.
    • This PR mainly affects developers, not users.

4. Is this code clean?

Required:

  • Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
  • Pre-commit checks: All the pre-commits checks have passed.

If applicable:

  • Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

@forsyth2 forsyth2 added semver: new feature New feature (will increment minor version) Testing Files in `tests` modified labels Oct 9, 2024
@forsyth2 forsyth2 self-assigned this Oct 9, 2024
@forsyth2 forsyth2 force-pushed the issue-625-improve-tests branch 2 times, most recently from 5cddd93 to ee8d74b Compare October 11, 2024 09:18
@forsyth2 forsyth2 force-pushed the issue-625-improve-tests branch from 0820d07 to f8acf41 Compare October 12, 2024 14:33
bundle = "bundle3" # Let bundle1 finish first because "e3sm_diags: atm_monthly_180x360_aave_mvm" requires "ts: atm_monthly_180x360_aave"
scratch = "#expand scratch#zppy_weekly_bundles_scratch/#expand unique_id#/#expand case_name#"
years = "1985:1989:2",
# TODO: Add "tc_analysis" back in after empty dat is resolved.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tc_analysis must be excluded for now because of E3SM-Project/e3sm_diags#866 (comment)

tests/integration/utils.py Outdated Show resolved Hide resolved
shutil.copy(
path_to_actual_png,
os.path.join(diff_dir, "{}_actual.png".format(simple_image_name)),
os.path.join(diff_dir, "{}_actual.png".format(image_name)),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolves #358

Comment on lines +93 to +99
# If cyclones_stitch file is empty, exit
if ! [ -s ${result_dir}cyclones_stitch_${file_name}.dat ]; then
cd {{ scriptDir }}
echo 'ERROR (1)' > {{ prefix }}.status
exit 1
fi

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@chengzhuzhang chengzhuzhang Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@forsyth2 I suggest not to catch this error, so that othere3sm_diags runs won't be blocked by empty stitch files.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your comment here: #632 (comment).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chengzhuzhang I think it would be a mistake to let errors slide by silently, especially considering this one was hard to track down.

A couple options on the top of my mind:

  1. Leave this as-is. If tc_analysis fails because of this, then users can remove tc_analysis from their list of sets in the e3sm_diags task and then re-run zppy.
  2. Have a parameter something like fail_on_empty_stitch to toggle this error on/off

Copy link
Collaborator

@chengzhuzhang chengzhuzhang Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take is that this is not an error, just no cyclone is detected. Without catching the error, it should work more smoothly, we can print in the log that indicate no cyclone stitch file is generated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, but it does cause an error in E3SM Diags down the road then. Notably, the IndexError: list index out of range error on #625 (comment) didn't add much clarity to the error.

I think if we remove this error check in the tc_analysis bash script (which I agree we should do if it's not actually an error per-se), we should add an empty-dat check in the e3sm_diags task.
I've copied the stack trace again here:

Traceback (most recent call last):
  File "/home/ac.forsyth2/miniconda3/envs/e3sm_diags_1003/lib/python3.10/site-packages/e3sm_diags/parameter/core_parameter.py", line 266, in _run_diag
    single_result = module.run_diag(self)
  File "/home/ac.forsyth2/miniconda3/envs/e3sm_diags_1003/lib/python3.10/site-packages/e3sm_diags/driver/tc_analysis_driver.py", line 91, in run_diag
    test_data["metrics"] = generate_tc_metrics_from_te_stitch_file(test_te_file)
  File "/home/ac.forsyth2/miniconda3/envs/e3sm_diags_1003/lib/python3.10/site-packages/e3sm_diags/driver/tc_analysis_driver.py", line 172, in generate_tc_metrics_from_te_stitch_file
    te_stitch_vars = _get_vars_from_te_stitch(lines, max_len, num_storms)
  File "/home/ac.forsyth2/miniconda3/envs/e3sm_diags_1003/lib/python3.10/site-packages/e3sm_diags/driver/tc_analysis_driver.py", line 249, in _get_vars_from_te_stitch
    year_start = int(lines[0].split("\t")[2])
IndexError: list index out of range
    test_te_file = os.path.join(
        test_data_path,
        "cyclones_stitch_{}_{}_{}.dat".format(test_name, test_start_yr, test_end_yr),
    )

So, it would be nice to add a non-empty check to that test_te_file. Based on the stack trace, it would probably be easier to add that in e3sm_diags than in zppy.

Copy link
Collaborator

@chengzhuzhang chengzhuzhang Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes sense to catch it in e3sm_diags. This error doesn't affect other non tc_analysis sets, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. And no, I don't think these dat files are used in any other sets. The TC analysis task isn't even a dependency for any other set:

    if "tc_analysis" in c["sets"]:
        dependencies.append(os.path.join(script_dir, f"tc_analysis{status_suffix}"))

@forsyth2 forsyth2 force-pushed the issue-625-improve-tests branch from f8acf41 to 4e8de5a Compare October 12, 2024 14:45
@forsyth2 forsyth2 marked this pull request as ready for review October 12, 2024 14:49
@forsyth2 forsyth2 merged commit abad674 into main Oct 12, 2024
4 checks passed
@forsyth2 forsyth2 deleted the issue-625-improve-tests branch October 12, 2024 14:49
zhangshixuan1987 pushed a commit to zhangshixuan1987/zppy that referenced this pull request Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: new feature New feature (will increment minor version) Testing Files in `tests` modified
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Failures on the Weekly test run Clarify simple_image_name
2 participants