fixing symmetrize_ddf #1686

Merged 2 commits on Jun 30, 2021

Conversation

jnke2016
Contributor

Fix the function that symmetrizes the Dask dataframe (symmetrize_ddf).

@jnke2016 requested a review from a team as a code owner on June 24, 2021 18:46
@codecov-commenter

codecov-commenter commented Jun 24, 2021

Codecov Report

Merging #1686 (18543f6) into branch-21.08 (0cbbdd8) will decrease coverage by 0.01%.
The diff coverage is 20.00%.

❗ Current head 18543f6 differs from pull request most recent head 32b5704. Consider uploading reports for the commit 32b5704 to get more accurate results

@@               Coverage Diff                @@
##           branch-21.08    #1686      +/-   ##
================================================
- Coverage         59.77%   59.75%   -0.02%     
================================================
  Files                80       80              
  Lines              3540     3546       +6     
================================================
+ Hits               2116     2119       +3     
- Misses             1424     1427       +3     
Impacted Files                            Coverage Δ
python/cugraph/structure/symmetrize.py    65.45% <20.00%> (-3.18%) ⬇️
python/cugraph/_version.py                44.80% <0.00%> (+0.39%) ⬆️

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0cbbdd8...32b5704.

@BradReesWork added the labels 3 - Ready for Review, improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) on Jun 25, 2021
@BradReesWork added this to the 21.08 milestone on Jun 25, 2021
@ayushdg
Member

ayushdg commented Jun 29, 2021

cc: @rjzamora It would be great if you could confirm that the use of shuffle + map_partitions looks about right to replace the groupby call here.

Comment on lines +143 to +146
result = ddf.shuffle(on=[
    src_name, dst_name], ignore_index=True, npartitions=num_workers)
result = result.map_partitions(lambda x: x.groupby(
    by=[src_name, dst_name], as_index=False).min().reset_index(drop=True))
Member

@ayushdg - This looks reasonable to me :)
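
For readers following this thread, here is a minimal sketch of why a shuffle followed by a per-partition groupby can stand in for a single global groupby. It deliberately uses only the Python standard library rather than the dask or cuDF APIs; the helper names (`shuffle`, `groupby_min`, `edge_key`, `edge_weight`) and the sample edge list are hypothetical and are not taken from the PR. The idea it illustrates: once rows are partitioned by the `(src, dst)` key, every duplicate of an edge lands in the same partition, so aggregating each partition independently should give the same rows as aggregating everything at once.

```python
def edge_key(row):
    """Grouping key: the (src, dst) pair of an edge."""
    return (row[0], row[1])


def edge_weight(row):
    """Value being aggregated: the edge weight."""
    return row[2]


def shuffle(rows, key, npartitions):
    """Hash-partition rows so that equal keys always share a partition."""
    parts = [[] for _ in range(npartitions)]
    for row in rows:
        parts[hash(key(row)) % npartitions].append(row)
    return parts


def groupby_min(rows, key, value):
    """Keep the minimum value seen for each key within one set of rows."""
    best = {}
    for row in rows:
        k, v = key(row), value(row)
        if k not in best or v < best[k]:
            best[k] = v
    return [(*k, v) for k, v in best.items()]


# Hypothetical weighted edge list (src, dst, weight) containing duplicates.
edges = [
    (0, 1, 5.0), (1, 0, 3.0), (0, 1, 2.0),
    (2, 3, 1.0), (3, 2, 4.0), (2, 3, 7.0),
]

# Reference result: one global groupby over all rows.
expected = sorted(groupby_min(edges, edge_key, edge_weight))

# Shuffle on the key first, then aggregate each partition on its own.
partitions = shuffle(edges, edge_key, npartitions=3)
result = sorted(
    row
    for part in partitions
    for row in groupby_min(part, edge_key, edge_weight)
)

assert result == expected
```

The per-partition step only matches the global result because the shuffle key and the groupby key are the same; partitioning on a different column would allow duplicates of one edge to survive in separate partitions.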

@BradReesWork
Member

@gpucibot merge

@rapids-bot (bot) merged commit 187f2bc into rapidsai:branch-21.08 on Jun 30, 2021
@jnke2016 deleted the fix_symmetrize_ddf branch on September 24, 2022 23:07
Labels: improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)

7 participants