Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ColumnMethods and its subclasses to remove column argument and require parent argument #8306

Merged
merged 18 commits into from
Jul 15, 2021

Conversation

shwina
Copy link
Contributor

@shwina shwina commented May 20, 2021

Prior to this PR, ColumnMethods takes a column argument as well as an optional parent argument.

  • When passed only a column argument, its methods return Columns. We have made use of this internally to do certain operations on Columns.

  • When passed a parent argument, its methods return objects of the type of the parent. This enables us to use the same class to implement accessor methods for both Series and Index.

This PR makes it so that we only accept (and now require) a parent argument, simplifying the class and its usage. All instances of ColumnMethods being used internally have been replaced.

Part of the Array refactor.

@shwina shwina requested a review from a team as a code owner May 20, 2021 22:03
@github-actions github-actions bot added the Python Affects Python cuDF API. label May 20, 2021
@shwina shwina added non-breaking Non-breaking change tech debt improvement Improvement / enhancement to an existing function labels May 20, 2021
@brandon-b-miller
Copy link
Contributor

Few small questions otherwise lgtm.

@brandon-b-miller
Copy link
Contributor

Also might want to target 21.08

@shwina shwina requested review from a team as code owners July 13, 2021 13:33
@shwina shwina requested review from devavret and hyperbolic2346 and removed request for a team July 13, 2021 13:33
@shwina shwina changed the base branch from branch-21.06 to branch-21.08 July 13, 2021 14:48
@shwina shwina removed request for a team, devavret and hyperbolic2346 July 13, 2021 14:49
@shwina
Copy link
Contributor Author

shwina commented Jul 13, 2021

Sorry for the delay in updating this PR. I've merged conflicts and addressed your comments @brandon-b-miller

@codecov
Copy link

codecov bot commented Jul 13, 2021

Codecov Report

Merging #8306 (0311aa0) into branch-21.08 (b0d86d2) will increase coverage by 0.00%.
The diff coverage is 0.00%.

❗ Current head 0311aa0 differs from pull request most recent head f38dc49. Consider uploading reports for the commit f38dc49 to get more accurate results
Impacted file tree graph

@@              Coverage Diff               @@
##           branch-21.08    #8306    +/-   ##
==============================================
  Coverage         10.66%   10.66%            
==============================================
  Files               109      110     +1     
  Lines             18302    18680   +378     
==============================================
+ Hits               1951     1993    +42     
- Misses            16351    16687   +336     
Impacted Files Coverage Δ
python/cudf/cudf/_lib/strings/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_typing.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/lists.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/methods.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/struct.py 0.00% <0.00%> (ø)
... and 60 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b0d86d2...f38dc49. Read the comment docs.

Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. This has some overlapping with my Categorical refactor branch. Should make it a bit easier to progress.

The PR description only captures part of what's going on though - maybe adding two bullet points mentioning the changes to CategoricalAccessor and StringMethods?

@@ -2299,7 +2299,62 @@ def full(size: int, fill_value: ScalarLike, dtype: Dtype = None) -> ColumnBase:
return ColumnBase.from_scalar(cudf.Scalar(fill_value, dtype), size)


def _concat_columns(objs: "MutableSequence[ColumnBase]") -> ColumnBase:
def _copy_type_metadata_from_arrow(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge? I thought this was removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, good catch! Fixed, and also changed title

@shwina shwina changed the title Cleanup ColumnMethods Refactor ColumnMethods and its subclasses to remove column argument and require parent argument Jul 14, 2021
@shwina
Copy link
Contributor Author

shwina commented Jul 15, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 65a38af into rapidsai:branch-21.08 Jul 15, 2021
rapids-bot bot pushed a commit that referenced this pull request Jul 20, 2021
Part of #8660. Note that the issue is asking for this feature in _dask-cudf_, which this PR does not implement.

Depends on: #8306

Authors:
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - https://github.com/brandon-b-miller
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #8729
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants