
Implementing Putmask #667

Merged: 25 commits into nv-legate:branch-22.12 on Nov 16, 2022
Conversation

@ipdemes (Contributor) commented Oct 24, 2022

No description provided.

@ipdemes added the category:new-feature label (PR introduces a new feature and will be classified as such in release notes) on Oct 24, 2022
@ipdemes self-assigned this on Oct 24, 2022
@ipdemes (Contributor Author) commented Oct 26, 2022

This PR implements the logic for the putmask function.
It also optimizes the advanced-indexing case a[bool_indices] = scalar by calling putmask instead of doing a sparse indirect copy.
The last commit is a replacement for PR #639.
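To make the optimization concrete, here is a small plain-NumPy sketch (array names are illustrative) of the equivalence being exploited: assigning a scalar through a boolean index asks for the same result as putmask, so the assignment can be lowered to a putmask task rather than a sparse indirect copy.

```python
import numpy as np

# putmask writes the scalar into `a` in place wherever the mask is True.
a = np.arange(6).reshape(2, 3)
np.putmask(a, a % 2 == 0, 0)

# The advanced-indexing form requests the same result, so it can reuse
# the putmask path instead of a sparse indirect copy.
b = np.arange(6).reshape(2, 3)
b[b % 2 == 0] = 0

assert np.array_equal(a, b)
```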

cunumeric/module.py (resolved)
};

template <>
struct BoolMaskPolicy<VariantKind::GPU, false> {
Contributor:

does this need to be specific to boolean masks? as written, this policy can't be used without first materializing a boolean mask, even in cases where, for example, the kernel only needs to run where the input is not NaN. it'd be more general if the policy took a predicate that evaluates to a boolean value at each point.

Contributor Author:

Makes sense. I will make it more general and call it parallel_loop, probably.
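A minimal Python sketch of the idea (the real code is a C++/CUDA policy; parallel_loop is only a stand-in for the name mentioned above, and pred/body/kernel are hypothetical): the loop takes a predicate evaluated at each point instead of a pre-materialized boolean mask.

```python
import numpy as np

def parallel_loop(pred, points, body):
    # Run `body` at every point where `pred` holds; no boolean mask
    # array is ever materialized.
    for p in points:
        if pred(p):
            body(p)

# Example: apply a kernel only where the input is not NaN.
data = np.array([1.0, np.nan, 3.0])
out = np.zeros_like(data)

def kernel(i):
    out[i] = data[i] * 2

parallel_loop(lambda i: not np.isnan(data[i]), range(data.size), kernel)
assert np.array_equal(out, np.array([2.0, 0.0, 6.0]))
```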

is_scalar_value = False
if values.shape != self.shape and values.size == 1:
    if values.shape == ():
        values = values._convert_future_to_regionfield(True)
Contributor:

is there a reason that you do this? can't you just promote this scalar store to match the shape of self?

Contributor Author:

replaced this with broadcasting
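In plain NumPy terms, the suggestion amounts to something like the following sketch (cunumeric promotes the scalar store rather than calling np.broadcast_to, but the semantics are the same): the scalar is viewed at the target shape without materializing a full-size buffer.

```python
import numpy as np

value = np.asarray(5)              # 0-d scalar array, shape ()
target_shape = (3, 3)

promoted = np.broadcast_to(value, target_shape)
assert promoted.shape == target_shape
assert promoted.strides == (0, 0)  # stride-0 view over the single element
```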

task.add_input(mask.base)
task.add_input(values.base)
task.add_output(self.base)
task.add_scalar_arg(is_scalar_value, bool) # value is scalar
Contributor:

again, is there a reason that you don't do numpy broadcasting on the scalar value? that would make the code a lot simpler.

Contributor Author:

makes sense. replaced the logic with broadcasting

    task.add_broadcast(values.base)
else:
    task.add_alignment(self.base, values.base)
    task.add_broadcast(self.base, axes=range(1, self.ndim))
Contributor:

I don't understand this broadcast constraint... why is this necessary?

Contributor Author:

removed. I don't need it anymore

using namespace Legion;
using namespace legate;

template <VariantKind KIND, LegateTypeCode CODE, int DIM, int VDIM, bool SCALAR_VALUE = false>
Contributor:

like I said in the other comment, if you do numpy broadcasting on the values, you can remove the SCALAR_VALUE == true case.


if a.dtype != values.dtype:
    values = values._warn_and_convert(a.dtype)
if values.shape != a.shape and values.size != 1:
Contributor:

the first condition should be values.size != size. Here's an example you can handle without wrapping or tiling, simply by using numpy broadcasting:

import numpy as np
a = np.full((3, 3), 3)
np.putmask(a, np.full((3, 3), True), np.full(3, 10))
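For reference, under NumPy's documented putmask semantics (a smaller values array is repeated) and equally under plain broadcasting of the size-3 row against the (3, 3) target, the example above fills every element:

```python
import numpy as np

a = np.full((3, 3), 3)
np.putmask(a, np.full((3, 3), True), np.full(3, 10))
assert (a == 10).all()  # the three-element values row covers every position
```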

Contributor:

unless stated otherwise, it's safe to assume that all related array arguments follow the numpy broadcasting semantics.

Contributor Author:

I added a check for whether the arrays can be broadcast. We will call wrap only in the case when they can't.

num.putmask(num_arr, num_mask, num_val)
assert np.array_equal(np_arr, num_arr)

# val is different shape
Contributor:

you need to subdivide this test into two: 1) when the shapes are different but the sizes are the same, 2) when both the shapes and sizes are different.
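A sketch of how the split might look, using plain NumPy only (the actual test would mirror each call against cunumeric and compare with np.array_equal as above; the shapes here are illustrative):

```python
import numpy as np

# Case 1: shapes differ but sizes match -- values simply fill `a` in order.
a1 = np.zeros((2, 3), dtype=int)
np.putmask(a1, np.ones((2, 3), dtype=bool), np.arange(6).reshape(3, 2))
assert np.array_equal(a1, np.arange(6).reshape(2, 3))

# Case 2: both shapes and sizes differ -- values wrap (are repeated).
a2 = np.zeros((2, 3), dtype=int)
np.putmask(a2, np.ones((2, 3), dtype=bool), np.array([7, 8]))
assert np.array_equal(a2, np.array([[7, 8, 7], [8, 7, 8]]))
```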

Comment on lines +593 to +601
has_set_value = set_value is not None and set_value.size == 1
if has_set_value:
    mask = DeferredArray(
        self.runtime,
        base=key_store,
        dtype=self.dtype,
    )
    rhs.putmask(mask, set_value)
    return False, rhs, rhs, self
Contributor:

I'm not a big fan of this code. the name of the function is _create_indexing_array, but this code makes it do more than just creating indexing arrays. and in fact, the function was already complicated enough to warrant a refactoring, which I was thinking of doing at some point. I guess I'm fine with accepting this edit for this PR, but keep in mind that _create_indexing_array can use some refactoring for better maintainability.

@@ -3506,6 +3506,58 @@ def put(
    a.put(indices=indices, values=values, mode=mode)


def _can_broadcast_to_shape(self: ndarray, shape: NdShape) -> bool:
Contributor:

can't we just use np.broadcast_shapes instead of this custom code?

Contributor Author:

sure, will use it instead with `try/except`
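A minimal sketch of what that helper might reduce to with np.broadcast_shapes (NumPy >= 1.20) and the try/except idiom; the signature here takes a plain shape tuple rather than the ndarray method form in the diff, so the actual code will differ:

```python
import numpy as np

def _can_broadcast_to_shape(arr_shape: tuple, shape: tuple) -> bool:
    # True if an array of shape `arr_shape` broadcasts to exactly `shape`.
    try:
        return np.broadcast_shapes(arr_shape, shape) == shape
    except ValueError:
        return False

assert _can_broadcast_to_shape((3,), (3, 3))       # a row broadcasts across rows
assert not _can_broadcast_to_shape((2,), (3, 3))   # trailing dims are incompatible
```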

@ipdemes merged commit db2a4f8 into nv-legate:branch-22.12 on Nov 16, 2022
@ipdemes deleted the putmask branch on January 12, 2023