Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split attrs - tests for status quo #4960

Merged
merged 27 commits into from
Feb 20, 2023

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented Sep 12, 2022

Addresses #4981

Just the addtitional tests which "snapshot" the existing behaviour

In principle should address "all" behaviour that we want to capture + preserve,
but in practice we may miss something here + add more later.
What these tests definitely should do is pass if run against the pre-existing main branch code.

@pp-mo pp-mo force-pushed the split_attrs branch 3 times, most recently from 64db552 to f67e549 Compare September 15, 2022 17:31
@pp-mo pp-mo changed the base branch from main to FEATURE_split_attrs September 15, 2022 17:31
@pp-mo
Copy link
Member Author

pp-mo commented Sep 15, 2022

Rebased + targetting new feature-branch "FEATURE_split_attrs"

@pp-mo
Copy link
Member Author

pp-mo commented Sep 15, 2022

UPDATE: I have since converted the rough plan outlined in this comment into tickets on a new project to cover all of this work


Note :
a rough plan for the future development path...

  1. initial tests for status-quo
    • now basically done, though more may still be needed / to be added later
  2. cube metadata changes
    • the cube state will contain separate 'local' + 'global' attribute dicts
  3. cube attributes handler
    • "cube.attributes" to become a wrapper object, providing access ~same as existing (combined local+global),
      plus specific local + global settings. (Have code for this prototyped already)
  4. netcdf loader changes
    • specifically, assign incoming attributes into relevant local+global areas when loading netcdf
    • NOTE: the 'combined access' means that, at this point, saving to netcdf will have almost-same results.
  5. netcdf saver changes
    • changes to respect the local/global designations on saving (e.g. from original load provenance), where possible
    • PLUS: a FUTURE flag implementation to switch new/old handling behaviour
  6. ?? cube printout ??
    • COULD modify this to have an extra section.
    • i.e. "Attributes:" (local) and separate "Global Attributes:"
    • not yet clear if this is required / a good idea
      • NB existing listing is presented in alphabetical order
      • so this is going to change A LOT of test results
  7. adjust "equalise_attributes"
    • at least ensure that operation makes sense with the new split-form attributes
    • needs to handle newly-possible collisions between matching local+global settings
    • possibly provide automatic resolution of some such problems
    • possibly add extra keywords to control different resolution modes
  8. UserGuide update

other questions ...

  • ? special additional logic in common-metadata / lenient logic, allowing combine+compare of local+global settings ??
  • ? POSSIBLY ? more fixes / rationalisation of existing code
    • adjust LimitedAttributeDict to conform to _CF_ATTRS
    • modify LimitedAttributeDict to complain about mis-assignment of recognised global-only or local-only attributes
      • NOTE: this probably can't / shouldn't occur in the CubeAttributesMap
        • "should" instead occur within the individual (specific) sets
    • worry about handling of "missing_value" attribute : suspect there may be an anomaly here
    • rationalise attribute handling : separate logic to separate code, apart from netcdf-save ??

@pp-mo pp-mo changed the title Split attrs Split attrs - tests for status quo Sep 23, 2022
@pp-mo pp-mo marked this pull request as ready for review September 23, 2022 11:05
@pp-mo pp-mo linked an issue Sep 23, 2022 that may be closed by this pull request
@pp-mo
Copy link
Member Author

pp-mo commented Sep 23, 2022

I've now created a project for this entire effort, and re-targetted this PR onto a dedicated feature branch.

So this PR is now represents only the very first step of that (#4981), and should be evaluated, fixed and merged as it is.

Please review for sense, and completeness of the testing scope

@pp-mo
Copy link
Member Author

pp-mo commented Sep 28, 2022

NOTE: rebased onto updated feature-branch + fixed, following the mergeback which contained #4803

Copy link
Member

@lbdreyer lbdreyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great improvement, adding a lot of missing testing! Funnily enough, a lot of the comments I wanted to make were issues with the current behaviour brought to light by these new tests! But as this PR is aimed at testing the current behaviour, I have made myself notes of these comments so that we address them later on.

I think most of what is here is looking really good. I suppose my biggest cause for concern is that you test roundtripping (load+save) so we miss whether the behaviors is occurring on load or save (see this comment, in particular). That being said I don't have enough oversight of the split_attr project to know whether this will matter?

lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
noxfile.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
lib/iris/tests/integration/test_netcdf.py Outdated Show resolved Hide resolved
@pp-mo
Copy link
Member Author

pp-mo commented Oct 26, 2022

@lbdreyer I hope recent commits should cover a lot of your outstanding comments.
But a couple of things are clearly still outstanding

  • whether I should be testing load- and save behaviours independently of a round-trip approach
    • I suspect you may be right, but changes from this code would be quite involved, so perhaps we should consider it as a separate follow-on.
      I'm still thinking on this one.
  • how to fix the parallel tests
    • I have no idea what the problem here is, or why it only seems to occur with Python 3.10, which feels to me like a PyTest problem.
      I will un-fix this + see what happens now. I could also try a rebase to bring in newer dependencies, but AFAICT there is nothing much in the latest iris/main lockfiles that seems like it is going to affect this.

)
self.check_expected_results(
global_attr_value="single-value", # local values eclipse the global ones
# N.B. the output var has NO such attribute
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: from the comment, I think this is missing a var_attr_vals={"myvar": None}

@pp-mo
Copy link
Member Author

pp-mo commented Oct 27, 2022

Updates ...

@lbdreyer ... a couple of things are clearly still outstanding

* whether I should be [testing load- and save behaviours independently of a round-trip approach](https://github.com/SciTools/iris/pull/4960#discussion_r987733199)

I am working on this.
I intend to add some specific load- and save-behaviour tests, probably in addition to the round-trips,
but still working out what.


* how to [fix the parallel tests](https://github.com/SciTools/iris/pull/4960#discussion_r987285790)

Well, you'll see that I came up with a workaround, but it's really rather horrible !
As we're working on a feature-branch here, I'd rather pause on a better solution for this, and just move forwards noting the problem ?

@pp-mo
Copy link
Member Author

pp-mo commented Feb 13, 2023

Rebased.

I have now added specific, independent tests for load and save behaviour (file->cubes, cubes->file).
These tests should persist in future by matching (almost all) uses of the "combined" cube.attributes value.
And I think ~all results will match, up to but not including the ordering of the dictionary keys.

In future, I hope I can use all the same testcases to check the new cube.attributes.global and cube.attributes.local content in each case.

@pp-mo
Copy link
Member Author

pp-mo commented Feb 13, 2023

@lbdreyer I think I have now addressed everything outstanding -- can you re-review ?
I will then go on to get #5040 rebased onto this

@pp-mo
Copy link
Member Author

pp-mo commented Feb 14, 2023

Note: when #5095 is merged, that splits tests/integration/test_netcdf.py into various tests/integration/netcdf/test_xxx.py.
So we will want to rebase or merge that in here.
(Mergeback into the feature-branch would be OK I think)

Copy link
Member

@lbdreyer lbdreyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small changes but otherwise I would say this is close to be merged!

I see a bunch of changes from main have also got mixed in with this PR. Is that intentional? It may be easier to update the feature branch in another PR

@pp-mo
Copy link
Member Author

pp-mo commented Feb 20, 2023

I have now merged-forward latest main into the F-B, and rebased this onto that.
So the history here is now much cleaner.
This will also affect other open PRs, I think at present only #4983

@pp-mo
Copy link
Member Author

pp-mo commented Feb 20, 2023

Thanks @lbdreyer I hope recent commits (post-rebase) address all your latest points.
The previous notes are also not resolved -- could you please check those over and re-raise anything from there that still seems unfinished ?
Otherwise, I hope is all done 🤞

Copy link
Member

@lbdreyer lbdreyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good now! Thanks @pp-mo !

@lbdreyer lbdreyer merged commit f485f8a into SciTools:FEATURE_split_attrs Feb 20, 2023
@pp-mo
Copy link
Member Author

pp-mo commented Feb 20, 2023

Great, thanks so much for your patience @lbdreyer !

@pp-mo pp-mo mentioned this pull request Jul 20, 2023
ESadek-MO pushed a commit that referenced this pull request Nov 21, 2023
* Split-attrs: Cube metadata refactortests (#4993)

* Convert Test___eq__ to pytest.

* Convert Test_combine to pytest.

* Convert Test_difference to pytest.

* Review changes.

* Split attrs - tests for status quo (#4960)

* Tests for attribute handling in netcdf load/save.

* Tidy test functions.

* Fix import order exception.

* Add cf-global attributes test.

* Towards more pytest-y implemenation.

* Replace 'create_testcase' with fixture which also handles temporary directory.

* Much tidy; use fixtures to parametrise over multiple attributes.

* Fix warnings; begin data-style attrs tests.

* Tests for data-style attributes.

* Simplify setup fixture + improve docstring.

* No parallel test runner, to avoid error for Python>3.8.

* Fixed for new-style netcdf module.

* Small review changes.

* Rename attributes set 'data-style' as 'local-style'.

* Simplify use of fixtures; clarify docstrings/comments and improve argument names.

* Clarify testing sections for different attribute 'styles'.

* Re-enable parallel testing.

* Sorted params to avoid parallel testing bug - pytest#432.

* Rename test functions to make alpha-order match order in class.

* Split netcdf load/save attribute testing into separate sourcefile.

* Add tests for loaded cube attributes; refactor to share code between Load and Roundtrip tests.

* Add tests for attribute saving.

* Fix method names in comments.

* Clarify source of Conventions attributes.

* Explain the test numbering in TestRoundtrip/TestLoad.

* Remove obsolete test helper method.

* Fix small typo; Fix numbering of testcases in TestSave.

* Implement split cube attributes. (#5040)

* Implement split cube attributes.

* Test fixes.

* Modify examples for simpler metadata printouts.

* Added tests, small behaviour fixes.

* Simplify copy.

* Fix doctests.

* Skip doctests with non-replicable outputs (from use of sets).

* Tidy test comments, and add extra test.

* Tiny typo.

* Remove redundant redefinition of Cube.attributes.

* Add CubeAttrsDict in module __all__ + improve docs coverage.

* Review changes - small test changes.

* More review changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CubeAttrsDict example docstrings.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Odd small fixes.

* Improved docstrings and comments; fix doctests.

* Don't sidestep netcdf4 thread-safety.

* Publicise LimitedAttributeDict, so CubeAttrsDict can refer to it.

* Fix various internal + external links.

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <[email protected]>

* Streamline docs.

* Review changes.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <[email protected]>

* Splitattrs ncload (#5384)

* Distinguish local+global attributes in netcdf loads.

* Small test fixes.

* Small doctest fix.

* Fix attribute load-save tests for new behaviour, and old-behaviour equivalence.

* Split attrs docs (#5418)

* Clarification in CubeAttrsDict examples.

* CubeAttrsDict fix docstring typo.

* Raise awareness of split attributes in user guide.

* What's New entry.

* Changes to metadata documentation.

* Splitattrs ncsave redo (#5410)

* Add docs and future switch, no function yet.

* Typing enables code completion for Cube.attributes.

* Make roundtrip checking more precise + improve some tests accordingly (cf. #5403).

* Rework all tests to use common setup + results-checking code.

* Saver supports split-attributes saving (no tests yet).

* Tiny docs fix.

* Explain test routines better.

* Fix init of FUTURE object.

* Remove spurious re-test of FUTURE.save_split_attrs.

* Don't create Cube attrs of 'None' (n.b. but no effect as currently used).

* Remove/repair refs to obsolete routines.

* Check all warnings from save operations.

* Remove TestSave test numbers.

* More save cases: no match with missing, and different cube attribute types.

* Run save/roundtrip tests both with+without split saves.

* Fix.

* Review changes.

* Fix changed warning messages.

* Move warnings checking from 'run' to 'check' phase.

* Simplify and improve warnings checking code.

* Fix wrong testcase.

* Minor review changes.

* Fix reverted code.

* Use sets to simplify demoted-attributes code.

* WIP

* Working with iris 3.6.1, no errors TestSave or TestRoundtrip.

* Interim save (incomplete?).

* Different results form for split tests; working for roundtrip.

* Check that all param lists are sorted.

* Check matrix result-files compatibility; add test_save_matrix.

* test_load_matrix added; two types of load result.

* Finalise special-case attributes.

* Small docs tweaks.

* Add some more testcases,

* Ensure valid sort-order for globals of possibly different types.

* Initialise matrix results with legacy values from v3.6.1 -- all matching.

* Add full current matrix results, i.e. snapshot current behaviours.

* Review changes : rename some matrix testcases, for clarity.

* Splitattrs ncsave redo commonmeta (#5538)

* Define common-metadata operartions on split attribute dictionaries.

* Tests for split-attributes handling in CubeMetadata operations.

* Small tidy and clarify.

* Common metadata ops support mixed split/unsplit attribute dicts.

* Clarify with better naming, comments, docstrings.

* Remove split-attrs handling to own sourcefile, and implement as a decorator.

* Remove redundant tests duplicated by matrix testcases.

* Newstyle split-attrs matrix testing, with fewer testcases.

* Small improvements to comments + docstrings.

* Fix logic for equals expectation; expand primary/secondary independence test.

* Clarify result testing in metadata operations decorator.

* Splitattrs equalise (#5586)

* Add tests in advance for split-attributes handling cases.

* Move dict conversion inside utility, for use elsewhere.

* Add support for split-attributes to equalise_attributes.

* Update lib/iris/util.py

Co-authored-by: Martin Yeo <[email protected]>

* Update lib/iris/tests/unit/util/test_equalise_attributes.py

Co-authored-by: Martin Yeo <[email protected]>

* Simplify and clarify equalise_attributes code.

---------

Co-authored-by: Martin Yeo <[email protected]>

* Fix merge-fail messaging for attribute mismatches. (#5590)

* Extra CubeAttrsDict methods to emulate dictionary behaviours. (#5592)

* Extra CubeAttrsDict methods to emulate dictionary behaviours.

* Don't use staticmethod on fixture.

* Add Iris warning categories to saver warnings.

* Type equality fixes for new flake8.

* Licence header fixes.

* Splitattrs ncsave deprecation (#5595)

* Small improvement to split-attrs whatsnew.

* Emit deprecation warning when saving without split-attrs enabled.

* Stop legacy-split-attribute warnings from upsetting delayed-saving tests.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Feature Branch Highlight this for a feature branch
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

split-attrs: tests for status-quo
2 participants