Pure-python masked UDFs #9174

Merged

@brandon-b-miller commented Sep 2, 2021

Replaces the C++ implementation of the masked UDF pipeline with a pure-Python version that compiles and launches the entire kernel using Numba. This solves several problems:

  • CUDA 11.0 is now supported, since the implementation no longer depends on cuda::std::tuple working with NVRTC 11.0.
  • Special functions that compile to multiple function definitions, such as pow, sin, and cos, are now supported, since all the PTX is compiled and linked inside Numba (Fixes [FEA] Support UDF Runtime compilation for incoming PTX with non-inlineable callees #8470)
  • Supports the following corner case, which would have required a separate C++ kernel in the previous implementation:
def f(x):
    return 42
  • Makes developing and adding features to the implementation much easier

@brandon-b-miller added the labels feature request, 2 - In Progress, and Python on Sep 2, 2021
@brandon-b-miller self-assigned this on Sep 2, 2021
@github-actions bot added the labels CMake and libcudf on Sep 2, 2021
@brandon-b-miller

cc @gmarkall

@github-actions bot removed the CMake label on Sep 21, 2021
@github-actions bot removed the libcudf label on Sep 21, 2021
@brandon-b-miller

After consulting with @shwina, we decided to leave the C++ alone in this PR and open a separate one that removes the C++ side. That way, the removal can be reverted on its own without also reverting the new pipeline.

@brandon-b-miller

This should be ready to go.

@@ -11,7 +11,7 @@

from cudf.core.udf.typing import MaskedType, NAType

from . import classes
from . import api

Can we make these imports absolute, to be consistent with the rest of the codebase?

Comment on lines 20 to 21
from . import api
from ._ops import arith_ops, comparison_ops

Absolute imports here too.

@brandon-b-miller changed the base branch from branch-21.10 to branch-21.12 on September 23, 2021, 19:20
@gmarkall left a comment

I think this is now looking pretty good - there are some small suggestions on the diff but they're fairly minor. I wanted to double-check the suggestions made sense so I had to implement them - if you like them, you could just pull in the last three commits from https://github.com/gmarkall/cudf/tree/masked-udfs-20210927 which implement the suggestions on the diff here.

        *(col.mask is None for col in df._data.values()),
    )
    if precompiled.get(cache_key) is not None:
        kernel, scalar_return_type = precompiled[cache_key]

Could you just return precompiled[cache_key] here to save the rest of the function living in an else block?
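
The suggestion above is the usual early-return shape for a cache lookup. A minimal sketch, assuming a module-level dict named `precompiled` as in the diff; `compile_kernel` and `get_kernel` are hypothetical stand-ins for the real compilation step, not cuDF functions:

```python
precompiled = {}  # cache_key -> (kernel, scalar_return_type)


def compile_kernel(cache_key):
    """Hypothetical stand-in for the expensive compile step."""
    return (f"kernel-for-{cache_key}", "int64")


def get_kernel(cache_key):
    # Cache hit: return immediately, so the compile path below
    # does not need to be nested inside an else block.
    if cache_key in precompiled:
        return precompiled[cache_key]

    result = compile_kernel(cache_key)
    precompiled[cache_key] = result
    return result
```

With the hit handled up front, the rest of the function reads as the single "miss" path at the top indentation level.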

# The python function definition representing the kernel
_kernel = local_exec_context["_kernel"]
kernel = cuda.jit(sig)(_kernel)
precompiled[cache_key] = (kernel, scalar_return_type)

Could you cache (kernel, numpy_support.as_dtype(scalar_return_type)) so that you don't need to call as_dtype on the scalar_return_type each time it's returned?
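
The point of this suggestion is to pay the conversion cost once, when the cache entry is created, rather than on every lookup. A sketch of that idea with a call counter; `to_dtype` is a hypothetical stand-in for `numpy_support.as_dtype`, and the string values are placeholders rather than real Numba types or kernels:

```python
stats = {"conversions": 0}  # counts how often the conversion runs
precompiled = {}  # cache_key -> (kernel, converted dtype)


def to_dtype(scalar_return_type):
    """Hypothetical stand-in for numpy_support.as_dtype."""
    stats["conversions"] += 1
    return f"dtype({scalar_return_type})"


def compile_and_cache(cache_key, scalar_return_type):
    if cache_key in precompiled:
        return precompiled[cache_key]

    kernel = f"kernel-for-{cache_key}"  # placeholder for the compiled kernel
    # Store the already-converted dtype so cache hits skip the conversion.
    precompiled[cache_key] = (kernel, to_dtype(scalar_return_type))
    return precompiled[cache_key]
```

Calling `compile_and_cache` twice with the same key runs the conversion only on the first call.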

@gmarkall left a comment

LGTM!

@brandon-b-miller

rerun tests

cvarbytes = b""

key = (type_signature, codebytes, cvarbytes)
key = make_cache_key(udf, type_signature)

Let's also move the 2 comments above into the make_cache_key function instead, as a "docstring".
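
A sketch of what such a helper might look like, with the explanatory comments folded into a docstring. The key layout follows the tuple shown in the diff; how the closure variables are serialized (`pickle.dumps` here) is an assumption, and the body is illustrative rather than the code merged in this PR:

```python
import pickle


def make_cache_key(udf, type_signature):
    """Build a cache key for a user-defined function.

    The key combines the argument types, the function's bytecode, and
    the pickled values of any closure variables, so that a function is
    not recompiled for a combination that has been seen before.
    """
    codebytes = udf.__code__.co_code
    if udf.__closure__ is not None:
        # Assumption: the closure contents are picklable.
        cvars = tuple(cell.cell_contents for cell in udf.__closure__)
        cvarbytes = pickle.dumps(cvars)
    else:
        cvarbytes = b""
    return (type_signature, codebytes, cvarbytes)
```

With this layout, two closures over different values get different keys even though they share the same bytecode.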

@shwina commented Sep 29, 2021

@gpucibot merge

@rapids-bot bot merged commit f9ce870 into rapidsai:branch-21.12 on Sep 29, 2021
@shwina commented Sep 29, 2021

This is awesome work, @brandon-b-miller!

rapids-bot bot pushed a commit that referenced this pull request Oct 1, 2021
Depends on #9174

Adds `Series.apply`, which applies a scalar UDF elementwise to the series data and returns a new series. It is null-sensitive, works in terms of our Numba `MaskedType` extension type, and is similar to `pd.Series.apply`.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - H. Thomson Comer (https://github.com/thomcom)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #9217
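
As a rough mental model of the null-sensitive, elementwise behavior described for `Series.apply`, the pure-Python sketch below pairs each value with a validity flag and propagates nulls through addition. `Masked`, `as_masked`, and `apply_elementwise` are hypothetical names used for illustration only; this is not how cuDF or Numba implement `MaskedType`, and it runs on the CPU:

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Masked:
    """Toy (value, valid) pair; a stand-in, not cuDF's MaskedType."""

    value: Any
    valid: bool

    def __add__(self, other: Any) -> "Masked":
        other = as_masked(other)
        # A result is valid only when both operands are valid.
        if self.valid and other.valid:
            return Masked(self.value + other.value, True)
        return Masked(None, False)


def as_masked(x: Any) -> Masked:
    """Pass Masked values through; wrap plain scalars as valid."""
    return x if isinstance(x, Masked) else Masked(x, True)


def apply_elementwise(
    udf: Callable[[Masked], Any],
    values: Sequence[Any],
    valid: Sequence[bool],
) -> Tuple[List[Any], List[bool]]:
    """Apply udf to each element and return (values, validity) lists."""
    results = [as_masked(udf(Masked(v, ok))) for v, ok in zip(values, valid)]
    return [r.value for r in results], [r.valid for r in results]
```

In this model, `lambda x: x + 1` applied to `[1, 2, 3]` with the second element marked null yields a null in the second position, while a constant function such as the `return 42` example from the PR description yields a valid result for every row.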
rapids-bot bot pushed a commit that referenced this pull request Dec 1, 2021
This PR removes the C++ side of the original masked UDF code introduced in #8213. These kernels had some limitations and are now superseded by the Numba-generated versions we moved to in #9174. As far as I can tell, cuDF Python was the only consumer of this API during the short time it existed. However, I am marking this as breaking just in case.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Mark Harris (https://github.com/harrism)
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #9792
Labels: 3 - Ready for Review, feature request, non-breaking, Python
Development

Successfully merging this pull request may close these issues.

[FEA] Support UDF Runtime compilation for incoming PTX with non-inlineable callees
6 participants