Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MeshCoord metadata should include the "mesh" property #4673

Closed
pp-mo opened this issue Mar 30, 2022 · 6 comments
Closed

MeshCoord metadata should include the "mesh" property #4673

pp-mo opened this issue Mar 30, 2022 · 6 comments

Comments

@pp-mo
Copy link
Member

pp-mo commented Mar 30, 2022

We initially chose to leave the 'mesh' property out of the MeshCoordMetadata,
largely because at that time we were wary of implementing equivalence on Mesh objects.
As noted here

IIRC this was largely due to concerns about the efficiency of calculating mesh equality (i.e. potentially comparing large arrays),
but we since essentially changed our minds about this, and implemented full equality testing on meshes anyway

So, in all probability, it would make more sense now to reverse this, and make 'mesh' a participant in MeshCooord.metadata

Present effects

With same 'location' and 'axis' but different 'mesh', we can find that meshcoord1 != meshcoord2, while at the same time meshcoord1.metadata == meshcoord2.metadata. This is not a logical flaw (can be true for coords with different data), but we would expect mesh to participate + it does not.

Arguably, metadata comparison should work as an 'identity equivalence' for data components,
this allows e.g. co = cube1.coord('longitude') ; co2 = cube2.coord(co) to work even when the 2 coords in question have different data, shapes or dtypes. This aspect is basically not working for meshcoords, at this point.

The code here would not be reachable, or would not be needed, if this worked as it does for the other coordinate types
-- instead, the resolve operation "ought to" be able to decide whether 2 meshcoords can be treated as equivalent or not.
It might also make sense to permit some aspects of lenient mesh comparison within this -- e.g. if ~equivalent meshes from different files had component with different var_names -- which is not possible at present.
At present, cubes can be merged between meshes that compare equal, but it requires an exact match in all details (e.g. including var_names)

Similar considerations could also apply to merge+concatenate.

Effects of change

It's not clear, if we simply change this, what knock-on problems could be introduced.
Possibly, merging 'equal' meshes, loaded from different files, would stop working (it currently works).

Up to now, no other metadata component has been a complex object -- the most complex is attribute dictionaries, which themselves require special consideration in comparison/combination.

@github-actions
Copy link
Contributor

In order to maintain a backlog of relevant issues, we automatically label them as stale after 500 days of inactivity.

If this issue is still important to you, then please comment on this issue and the stale label will be removed.

Otherwise this issue will be automatically closed in 28 days time.

@github-actions github-actions bot added the Stale A stale issue/pull-request label Aug 20, 2023
@pp-mo pp-mo self-assigned this Aug 30, 2023
@pp-mo pp-mo added Status: Decision Required Feature: UGRID and removed Stale A stale issue/pull-request labels Aug 30, 2023
@scitools-ci scitools-ci bot removed this from 🚴 Peloton Dec 15, 2023
@pp-mo pp-mo removed their assignment Mar 20, 2024
@ESadek-MO ESadek-MO added this to the v3.10 milestone Apr 3, 2024
@trexfeathers
Copy link
Contributor

Note this comment, which will need removing even if we decide no action:

_members = ("location", "axis")
# NOTE: in future, we may add 'mesh' as part of this metadata,
# as the Mesh seems part of the 'identity' of a MeshCoord.
# For now we omit it, particularly as we don't yet implement Mesh.__eq__.
#
# Thus, for now, the MeshCoord class will need to handle 'mesh' explicitly
# in identity / comparison, but in future that may be simplified.

@pp-mo
Copy link
Member Author

pp-mo commented Jul 17, 2024

After some thought

TL;DR: I now think we shouldn't do this after all.

I think metadata is generally to be viewed as analagous to "phenomenon"
As on cubes or coordinates, for instance ..

  • it distinguishes one thing from another, where directly combining/comparing them would be a categorical mistake, like a units error
    • (i.e. like adding apples to elephants)
  • it characterises the "meaning" of the numbers stored the object, e.g. the numbers are "a longitude" or "a height in metres".
    • so in that way, it's a bit like an extended units concept

So for a cube, the metadata is those characteristics which give it a distinct identity as a "type of value"
-- but not the values themselves (.data), their type (.dtype) or structure (.shape)
At present, this is mostly used in 3 places:

  • in comparison (==): But here, it is just a fast check, as non-metadata properties are generally also compared anyway
  • in arithmetic : but this doesn't really apply to MeshCoords
  • in selecting objects ... read on ...

For selection, component access methods like Cube.coords use metadata as an identity, so you can ...

co = cube1.coord(name)
meta = co.metadata
assert cube2.coord(co).metadata == cube2.coord(meta).metadata == meta

That is, you can get the equivalent coord in a different cube by using the coord (or equivalently, in fact, its metadata) as a reference.
This works even when the 2 cubes have different dimensions or coordinate values, because those aspects are outside the metadata.

So, at present, we can do the same with MeshCoords..
x_co = cube1.coord(cube2.coord(mesh_coords=True, axis="x"))

But if we added the mesh to the metadata, this would not be possible, as you could no longer use it to pick meshcoords with similar properties but on a different mesh.

So I think that, with that in mind, adding a mesh to the MeshCoordMetadata would be like adding a cube's coords to its metadata : It would mean that the values and grid-locations of its content become part of the identity, so meshcoords on different meshes can't be matched. Which I think is therefore at odds with how it works for cubes and coords, so we shouldn't do this..

@pp-mo
Copy link
Member Author

pp-mo commented Jul 18, 2024

@stephenworsley please "review", i.e. see what you think of the above ideas.
Do you agree that we should close this ?

@stephenworsley
Copy link
Contributor

@pp-mo This makes sense, I agree that the status of the mesh property should stay as is given that:

  • metadata is a "lightweight" description and meshes are large.
  • mesh already exists as a property of MeshCoords.
  • meshes are more similar to objects which are not considered suitable for including in other metadata (e.g. coords, data) than they are to objects included in metadata (e.g. names, attributes).
  • adding meshes to metadata would break some existing ehaviour.

@pp-mo
Copy link
Member Author

pp-mo commented Jul 19, 2024

Ok, thanks @stephenworsley .
I think we are done with this.
Anyone disagreeing can re-open !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants