Why iris.cube.Cube.collapsed requires weights with full shape? #3707

TomekTrzeciak · 2020-05-14T15:39:25Z

We are looking at ways to reduce the memory footprint of our Iris code in https://github.com/metoppv/improver and are somewhat stuck on calculating weighted averages with iris.cube.Cube.collapsed. It looks like Iris insists on the weights having the same shape as the cube being collapsed, even though the underlying numpy.ma.average in iris.analysis.MEAN can happily work with a 1D array of weights. Why this restriction?
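For reference, a minimal NumPy-only sketch of the behaviour described above: numpy.ma.average accepts 1D weights as long as their length matches the axis being averaged. The array sizes and variable names here are made up for illustration and are not taken from improver or Iris.

```python
import numpy as np

# Hypothetical stand-in for cube data: 100 time steps on a 3 x 4 grid.
data = np.ma.masked_array(np.arange(100 * 3 * 4, dtype=float).reshape(100, 3, 4))

# 1D weights along the first axis only: 100 values rather than 100 * 3 * 4.
weights_1d = np.linspace(1.0, 2.0, 100)

# numpy.ma.average accepts 1D weights when their length matches `axis`.
mean_1d = np.ma.average(data, axis=0, weights=weights_1d)

# The same calculation with weights expanded to the full data shape.
weights_full = np.broadcast_to(weights_1d[:, None, None], data.shape)
mean_full = np.ma.average(data, axis=0, weights=weights_full)
```

Both calls should give the same (3, 4) result, which is why the full-shape requirement looks like an Iris-level restriction rather than a NumPy one.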

rcomer · 2020-05-19T14:04:54Z

I can't answer the specific question, but if you use iris.util.broadcast_to_shape, it claims to make the new array out of multiple views of the input array. Does this help with the memory footprint issue?

TomekTrzeciak · 2020-05-19T16:28:20Z

We already use iris.util.broadcast_to_shape. The problem was an intermediate step that destroyed the array view and expanded it to full size (this sort of thing is difficult to control).
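To illustrate the kind of thing that can happen, here is a small sketch using plain np.broadcast_to as a stand-in for iris.util.broadcast_to_shape (an assumption made to keep the example free of Iris). The explicit np.ascontiguousarray call is only a placeholder for whatever intermediate step ends up materialising the view.

```python
import numpy as np

weights_1d = np.linspace(1.0, 2.0, 100)

# Broadcasting yields a read-only view with stride 0 along the new axes,
# so the repeated values are not stored again.
view = np.broadcast_to(weights_1d[:, None, None], (100, 3, 4))
shares_memory = np.shares_memory(view, weights_1d)

# A step that needs a real contiguous buffer allocates the full-size array.
# (Placeholder for the intermediate step; arithmetic on the view would
# likewise produce a new full-size array.)
materialised = np.ascontiguousarray(view)
still_shared = np.shares_memory(materialised, weights_1d)
```

Here `view.strides` is `(8, 0, 0)` while `materialised.strides` is `(96, 32, 8)`, i.e. the second array really holds all 1200 values.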

Still, using 1D weights seems like a common case and it would be nice to allow it. Would the following change be acceptable?

diff --git a/lib/iris/cube.py b/lib/iris/cube.py
index fec68c57..f1fe313b 100644
--- a/lib/iris/cube.py
+++ b/lib/iris/cube.py
@@ -3957,8 +3957,9 @@ bound=(1994-12-01 00:00:00, 1998-12-01 00:00:00)
             unrolled_data = np.transpose(self.data, dims).reshape(new_shape)
 
             # Perform the same operation on the weights if applicable
-            if kwargs.get("weights") is not None:
-                weights = kwargs["weights"].view()
+            weights = kwargs.get("weights")
+            if weights is not None and weights.ndim > 1:
+                weights = weights.view()
                 kwargs["weights"] = np.transpose(weights, dims).reshape(
                     new_shape
                 )
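The intent of the patch can be sketched outside Iris roughly as follows. This is a simplified illustration, not the actual Cube.collapsed code: `collapse_mean` and its arguments are hypothetical names, and only the transpose/reshape step plus the `ndim > 1` check mirror the diff.

```python
import numpy as np


def collapse_mean(data, collapse_axes, weights=None):
    """Hypothetical helper: unroll the collapsed axes, then average."""
    collapse_axes = list(collapse_axes)
    untouched = [ax for ax in range(data.ndim) if ax not in collapse_axes]
    # Move the collapsed axes to the end and flatten them into one axis.
    dims = untouched + collapse_axes
    new_shape = [data.shape[ax] for ax in untouched] + [-1]
    unrolled = np.transpose(data, dims).reshape(new_shape)

    if weights is not None and weights.ndim > 1:
        # Full-shape weights are unrolled exactly like the data.
        weights = np.transpose(weights, dims).reshape(new_shape)
    # 1D weights are passed through untouched; numpy.ma.average matches
    # them against the final (flattened) axis.
    return np.ma.average(unrolled, axis=-1, weights=weights)


data = np.arange(3 * 100 * 4, dtype=float).reshape(3, 100, 4)
w1d = np.linspace(1.0, 2.0, 100)
w_full = np.broadcast_to(w1d[None, :, None], data.shape)

result_1d = collapse_mean(data, [1], weights=w1d)
result_full = collapse_mean(data, [1], weights=w_full)
```

With a single collapsed axis the two results should agree; when several axes are flattened together, the meaning of a 1D weights array depends on the flattening order.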

TomekTrzeciak · 2020-12-17T21:19:30Z

Bump. This is a simple change and it would be useful to have it. Any thoughts on this?

pp-mo · 2021-01-04T12:00:39Z

If we're going to get this into Iris 3.0 we need to move quickly!
To make this into a viable PR we should add a testcase and a whatsnew description.
I will try and get something done about this today.

pp-mo · 2021-01-04T12:15:22Z

The scope of this proposed change is still rather limited, and doesn't do all that we might.

It allows us to pass 1D weights matching a single "collapse dimension".
From the code, that dimension can come from any single cube dimension (not just the last),
or a flattening of multiple cube dims -- in which case supplying a 1D array would be a bit odd (and we can't ensure that the dim ordering is correct).

It still can't broadcast any multi-dimensional weights, even if the dims match by normal array rules
-- e.g. cube is (100, 3, 4), collapse over last 2 dims, with weights of shape (3,4) or (12,).

So, ideally, I feel we should be able to pass weights matching the shape of the collapsed-dims (not just the whole cube), or the flattened equivalent.
E.g. if the cube has dims x/T/y with sizes (3, 100, 4), we should be able to do ...

`cube.collapsed(['x', 'y'], ... weights=np.ones((3, 4)))`, or
`cube.collapsed(['y', 'x'], ... weights=np.ones((4, 3)))`, or
`cube.collapsed(['x', 'y'], ... weights=np.ones(12))`
(obviously, not all ones, but weights of those shapes)

I assume this is more or less why you are using broadcast_to_shape.
Does this make some sense to you, and is it worth considering such extra cases?
(but maybe in another PR)
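For concreteness, here is how the three weight shapes above could be expanded to the full (3, 100, 4) shape by hand today, using plain NumPy in place of iris.util.broadcast_to_shape. This is only a sketch: the variable names and the `weighted_mean_xy` helper are made up, and the flattened case assumes C-order (x varying slowest), which is exactly the ordering ambiguity mentioned above.

```python
import numpy as np

data = np.arange(3 * 100 * 4, dtype=float).reshape(3, 100, 4)  # dims x/T/y
weights_xy = np.arange(1.0, 13.0).reshape(3, 4)                # dims x/y

# Weights given as (x, y): insert a length-1 T axis and broadcast.
full_from_xy = np.broadcast_to(weights_xy[:, np.newaxis, :], data.shape)

# Weights given as (y, x): transpose back to (x, y) first.
weights_yx = weights_xy.T
full_from_yx = np.broadcast_to(weights_yx.T[:, np.newaxis, :], data.shape)

# Flattened weights of length 12: assumes C-order flattening of (x, y).
weights_flat = weights_xy.reshape(12)
full_from_flat = np.broadcast_to(
    weights_flat.reshape(3, 4)[:, np.newaxis, :], data.shape
)


def weighted_mean_xy(values, full_weights):
    """Hypothetical helper: weighted mean over the x and y axes (0 and 2)."""
    return (values * full_weights).sum(axis=(0, 2)) / full_weights.sum(axis=(0, 2))


mean_xy = weighted_mean_xy(data, full_from_xy)
mean_yx = weighted_mean_xy(data, full_from_yx)
mean_flat = weighted_mean_xy(data, full_from_flat)
```

All three routes should produce the same length-100 result, provided the dimension order of the weights is known.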

TomekTrzeciak · 2021-01-04T14:42:02Z

> So, ideally, I feel we should be able to pass weights matching the shape of the collapsed-dims (not just the whole cube), or the flattened equivalent.
> E.g. if the cube has dims x/T/y with sizes (3, 100, 4), we should be able to do ...
>
> `cube.collapsed(['x', 'y'], ... weights=np.ones((3, 4)))`, or
> `cube.collapsed(['y', 'x'], ... weights=np.ones((4, 3)))`, or
> `cube.collapsed(['x', 'y'], ... weights=np.ones(12))`
> (obviously, not all ones, but weights of those shapes)
>
> I assume this is more or less why you are using broadcast_to_shape.
> Does this make some sense to you, and is it worth considering such extra cases?

It does make sense (with a question mark over the flattened-array case), but these cases are also less common and not directly supported at the NumPy layer (they would require broadcast_to_shape internally).

> (but maybe in another PR)

Might as well in the interest of getting this in quickly.

pp-mo · 2021-01-04T18:37:44Z

How does #3943 look ?

TomekTrzeciak · 2021-01-04T19:08:50Z

> How does #3943 look ?

Looks all good to me 👍

rcomer · 2021-04-14T20:35:01Z

This is linked to #3943, which was merged. So I believe it should be closed.

TomekTrzeciak changed the title ~~Why iris.cube.Cube.collapse requires weights with full shape?~~ Why iris.cube.Cube.collapsed requires weights with full shape? May 14, 2020

pp-mo mentioned this issue Jan 4, 2021

Add support for 1-d weights in collapse. #3943

Merged

rcomer closed this as completed Apr 14, 2021
