Weights shape error #45

Closed
FrankFrank9 opened this issue Apr 16, 2024 · 17 comments

@FrankFrank9

Hello,

When I run SPOD in parallel with a custom weighting matrix (the element areas), I get the following error, but everything works fine in serial mode. Do you have any idea what causes this?

    raise ValueError(
ValueError: parameter ``weights`` must be cast into 1d array with dimension equal to flattened spatial dimension of data.

@dalcinl
Collaborator

dalcinl commented Apr 17, 2024

What's the shape of your weights? Can you try passing weights.reshape(-1) instead?
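
For reference, a minimal sketch of what `reshape(-1)` does, using a small made-up weights array (the shape is illustrative, not the reporter's):

```python
import numpy as np

# Hypothetical weights: one value per spatial point and variable.
weights = np.arange(12, dtype=float).reshape(4, 3)

# reshape(-1) flattens to 1-D in row-major order; -1 lets NumPy infer the length.
flat = weights.reshape(-1)

print(weights.shape, "->", flat.shape)
```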

@FrankFrank9
Author

Thank you, now I'm able to run with 2-3 ranks, but I get the same error when I scale this up.
For reference, my weights shape is:

(1394730,)

and my time series data shape is

(200, 278946, 5)

The error arises from this piece of code:

def distribute_dimension(data, max_axis, comm):
    """
    Distribute desired spatial dimension, splitting partitions
    by value // comm.size, with remainder = value % comm.size
    """
    ## distribute largest spatial dimension based on data
    if comm is not None:
        size = comm.size
        rank = comm.rank
        shape = data.shape
        index = [np.s_[:]] * len(shape)
        N = shape[max_axis]
        n, s = _blockdist(N, size, rank)
        index[max_axis] = np.s_[s:s+n]
        index = tuple(index)
        data = data[index]
        comm.Barrier()
    else:
        data = data
    return data

Best
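
The slicing in the function above can be tried without MPI. The sketch below is only an illustration: `block_dist` is a hypothetical stand-in for `_blockdist` (whose body is not shown in this thread) and assumes the common convention that the first `N % size` ranks receive one extra element.

```python
import numpy as np

def block_dist(N, size, rank):
    """Hypothetical stand-in for _blockdist: return (count, start) for a rank."""
    q, r = divmod(N, size)
    n = q + (1 if rank < r else 0)   # first r ranks get one extra item
    s = rank * q + min(rank, r)      # offset of this rank's block
    return n, s

def local_slice(data, axis, size, rank):
    """Slice `data` along `axis` as distribute_dimension does, without a communicator."""
    index = [np.s_[:]] * data.ndim
    n, s = block_dist(data.shape[axis], size, rank)
    index[axis] = np.s_[s:s + n]
    return data[tuple(index)]

# Small made-up array laid out as (time, points, variables).
data = np.zeros((2, 11, 5))
shapes = [local_slice(data, 1, 4, rank).shape for rank in range(4)]
print(shapes)
```

With 11 points over 4 ranks the local blocks are not all the same length, which is why any array distributed alongside the data has to be split along the same axis.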

@dalcinl
Collaborator

dalcinl commented Apr 18, 2024

> and my time series data shape is
>
> (200, 278946, 5)

So you have 200 time samples, each comprising 278946 spatial points with 5 variables per point?

I think the weights correspond just to spatial points and not to variables, so you should provide 278946 weights, not 278946 * 5 = 1394730. @mrogowski, can you confirm?
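
To make the two candidate sizes concrete, here is a small sketch with made-up sizes and a hypothetical `area` array standing in for the element areas. It shows only how the two layouts differ in size, not which one PySPOD expects.

```python
import numpy as np

npts, nvars = 6, 5                    # small stand-ins for 278946 and 5
area = np.linspace(1.0, 2.0, npts)    # hypothetical per-point element areas

# One weight per spatial point: npts values.
w_point = area

# One weight per point and per variable: each area repeated for every variable.
w_point_var = np.repeat(area[:, None], nvars, axis=1)

print(w_point.size, w_point_var.size)

# The sizes quoted in the thread follow the same pattern.
assert 278946 * 5 == 1394730
```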

@mrogowski
Collaborator

We should support a weight per spatial point per variable. Looking quickly at the code, I think we may have a bug. We tested the single-variable branch heavily in parallel, but not so much for data with multiple variables. @FrankFrank9, what is the format of your data? Could you come up with a simple reproducer?

@FrankFrank9
Author

Unfortunately I can't make an easy reproducible thing. I guess any data with those shapes should reproduce it. It looks like an error in redistributing the data. Let me know.

@mrogowski
Collaborator

Can you try to run with this change in PySPOD?

@FrankFrank9
Author

I get the same error:

ValueError: cannot reshape array of size 139473 into shape (139475,1)

During handling of the above exception, another exception occurred:
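
One possible reading of the two sizes in this message, offered as an unconfirmed guess: if the point axis of the data is block-distributed while the already flattened weights are split on their own, the local sizes can disagree. The sketch below assumes 10 ranks (the rank count is not given in the thread) and the usual "first ranks get the remainder" convention.

```python
npts, nvars = 278946, 5
size = 10  # assumed rank count; not stated in the thread

# Splitting the point axis: the first r ranks hold q + 1 points each.
q, r = divmod(npts, size)
data_local = [(q + (1 if rank < r else 0)) * nvars for rank in range(size)]

# Splitting the flattened weights instead: here the division is exact.
q_w, r_w = divmod(npts * nvars, size)
weights_local = [q_w + (1 if rank < r_w else 0) for rank in range(size)]

print(data_local[0], weights_local[0])  # 139475 139473
```

Under these assumptions rank 0 would hold 139475 data values but only 139473 weights, which matches the numbers in the reshape error.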

@dalcinl
Collaborator

dalcinl commented Apr 21, 2024

> Unfortunately I can't make an easy reproducible thing.

Not even using random data with shapes that match your data?

@mrogowski
Collaborator

I generated random data:

data matrix X (200, 278946, 5)
weights (278946, 1, 5)

and tried with 7, 8, 9, 10, 11, 12 processes. All seem to have worked. Any reproducer would be very helpful to assist you.

@FrankFrank9
Author

> I generated random data:
>
> data matrix X (200, 278946, 5)
> weights (278946, 1, 5)
>
> and tried with 7, 8, 9, 10, 11, 12 processes. All seem to have worked. Any reproducer would be very helpful to assist you.

Now it works. The weights need the second axis as well; mine were just (npts, nvars).
Thanks for looking into this!

@dalcinl
Collaborator

dalcinl commented Apr 21, 2024

Oh, but then that means we can do better, that is, add the missing axis, right Marcin?
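
A minimal sketch of what adding the missing axis could look like on the caller's side. `add_missing_axis` is a hypothetical helper written for illustration, not part of PySPOD, and it assumes weights arrive either as (npts, nvars) or already as (npts, 1, nvars).

```python
import numpy as np

def add_missing_axis(weights):
    """Hypothetical helper: turn (npts, nvars) weights into (npts, 1, nvars)."""
    weights = np.asarray(weights)
    if weights.ndim == 2:
        # Insert a singleton axis between the point and variable axes.
        weights = np.expand_dims(weights, axis=1)
    return weights

w = np.ones((6, 5))   # small stand-in for (278946, 5)
print(add_missing_axis(w).shape)
```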

@mrogowski
Collaborator

> Now it works. The weights need the second axis as well; mine were just (npts, nvars).
> Thanks for looking into this!

Good to hear! Like I said before, most of the runs we did so far were for 1 variable 2D data, so you may spot some issues with 1D and/or multivariable data. Let us know and we'll try to fix it.

@mrogowski
Collaborator

> Oh, but then that means we can do better, that is, add the missing axis, right Marcin?

I'll try to reproduce the issue that @FrankFrank9 ran into and fix it. I used (278946, 1, 5) because that's what I got from utils_weights.geo_trapz_2D. It just happens that it was the problem.

@FrankFrank9
Author

> > Now it works. The weights need the second axis as well; mine were just (npts, nvars).
> > Thanks for looking into this!
>
> Good to hear! Like I said before, most of the runs we did so far were for 1 variable 2D data, so you may spot some issues with 1D and/or multivariable data. Let us know and we'll try to fix it.

Thanks a lot! If I find any other issues, I'll post them here.

Best

@mrogowski
Collaborator

> I'll try to reproduce the issue that @FrankFrank9 ran into and fix it.

I couldn't; it worked for me with (278946, 5) weights as well.

@FrankFrank9
Author

At this point I don't know. The version that gave me the error came from

pip install pyspod

Is it the same version?

@mrogowski
Collaborator

pip install pyspod would install the latest published version, which does not contain this fix. You'd need to pip install git+https://github.com/MathEXLab/PySPOD@refs/pull/48/head, or manually clone the repo from the PR and pip install it.
