[REVIEW] Add Series.loc and DataFrame.loc #1622
…prove-series-indexing
Note that for the scalar index case it's possible to just implement the lookup as a boolean mask on the index:
In [2]: a = cudf.Series(['one', 'two', 'three'], index=[1, 2, 3])
In [3]: print(a[a.index == 1])
1 one
dtype: object
However, this will need to wait for the Cythonized gather (#1604) to work for all index types. Until then, I have implemented a placeholder.
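To make the equivalence described above concrete, here is a minimal sketch that compares the boolean-mask lookup with a label lookup. It uses pandas as a stand-in for cuDF (an assumption made only so the snippet runs without a GPU); the variable names `masked` and `by_label` are illustrative.

```python
import pandas as pd

# pandas stands in for cuDF here so the sketch runs without a GPU.
a = pd.Series(["one", "two", "three"], index=[1, 2, 3])

# Boolean-mask lookup: compare the index to the label, then filter.
masked = a[a.index == 1]

# Label-based lookup of the same scalar label.
by_label = a.loc[1]

# The mask keeps a one-element Series; .loc with a unique label
# returns the stored value itself.
print(masked)
print(by_label)
```

Note that the mask form always returns a Series, so a scalar `.loc` built on top of it still has to unwrap the single element.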
Mostly needs to handle the null cases; otherwise, additional changes can be done in a follow-up PR.
if len(arg) == 0:
    arg = Series(np.array([], dtype='int32'))
else:
    arg = Series(arg)
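One plausible reason for pinning the dtype in the empty branch (an assumption, since the diff does not say) is that an empty array built without a dtype defaults to float64, which is not a valid index dtype. The NumPy-only sketch below illustrates that behaviour; `try_take` is a hypothetical helper, not part of cuDF.

```python
import numpy as np

data = np.arange(5)

# An empty array built without a dtype defaults to float64 ...
empty_default = np.array([])
# ... whereas the branch in the diff pins it to int32.
empty_int = np.array([], dtype="int32")


def try_take(values, idx):
    """Index `values` with `idx`; return None if NumPy rejects the index dtype."""
    try:
        return values[idx]
    except IndexError:
        return None


print(empty_default.dtype, try_take(data, empty_default))
print(empty_int.dtype, try_take(data, empty_int))
```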
We could likely operate better against a column than a Series here since I assume this input argument is converted to a Series just to move it to device. That way there isn't any unnecessary index handling when creating these Series objects.
See comment below
    or is_datetime_or_timedelta_dtype(val)
    or isinstance(val, pd.Timestamp)
    or isinstance(val, pd.Categorical)
)
May want to move this to utils so it can be reused elsewhere.
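As a rough sketch of what such a shared helper might look like: the name `is_scalar_like` is hypothetical, the leading conditions of the original check are not shown in the diff, so `np.isscalar` is an assumption, and explicit `isinstance` checks replace `is_datetime_or_timedelta_dtype` to keep the snippet independent of the pandas version.

```python
import numpy as np
import pandas as pd


def is_scalar_like(val):
    """Hypothetical utility: True if `val` should be treated as a single label."""
    return (
        # Assumption: the elided first condition is a plain scalar check.
        np.isscalar(val)
        # Stand-in for the datetime/timedelta check in the diff.
        or isinstance(val, (np.datetime64, np.timedelta64))
        or isinstance(val, (pd.Timestamp, pd.Timedelta))
        # Mirrors the last condition shown in the diff.
        or isinstance(val, pd.Categorical)
    )


print(is_scalar_like(3), is_scalar_like(pd.Timestamp("2019-01-01")))
print(is_scalar_like([1, 2]), is_scalar_like(slice(1, 3)))
```

Keeping the check in one place would let Series and DataFrame indexing share the same definition of "scalar" instead of repeating the chain of conditions.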
…prove-series-indexing # Conflicts: # CHANGELOG.md
Will need some follow ups for cleanup / optimization, but vastly improves the current state. Great work @shwina!
@shwina I can approve but you need to do a merge to resolve the conflicts because I don't want to mess it up.
No C++ changes...
…prove-series-indexing # Conflicts: # CHANGELOG.md # python/cudf/dataframe/column.py
🔥 😤 MAKE INDEXING IN CUDF GREAT AGAIN 😤 🔥
This PR brings changes and improvements to make indexing in cuDF more performant, and feel more natural and Pandas-like.
cuDF (latest):
0.8 (this PR):
Fixes:
#1494
#1513
#1731
#1459
#1444
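For reference, the pandas behaviour that "Pandas-like" `.loc` indexing refers to can be sketched as follows. This runs against pandas itself, not cuDF, and the data and variable names are illustrative; how closely cuDF 0.8 matches each case is not stated in this thread.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])
df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
    index=["r1", "r2", "r3"],
)

scalar = s.loc["b"]              # single label: returns the stored value
sliced = s.loc["b":"c"]          # label slice: both endpoints are included
picked = s.loc[["a", "d"]]       # list of labels: Series in the requested order
cell = df.loc["r2", "y"]         # row label and column label: a single cell
rows = df.loc["r1":"r2", ["x"]]  # label slice of rows, list of columns

print(scalar, list(sliced), list(picked.index), cell, rows.shape)
```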