[REVIEW] Fix duplicate names issue in MultiIndex.deserialize #9258

Merged: 2 commits, Sep 20, 2021

galipremsagar
Contributor

Fixes: #9254

This PR fixes deserialize in cudf.MultiIndex so that data is no longer corrupted when the index has duplicate names.
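
As a rough illustration of the failure mode (not the actual cuDF code): if deserialization rebuilds the index by keying each column on its name, two levels that share a name collapse into a single entry and the earlier column is lost. The sketch below uses plain Python dicts only. `rebuild_by_name` and `rebuild_by_position` are hypothetical helpers, and rebuilding with positional keys before reattaching the names is just one plausible way to avoid the collision.

```python
def rebuild_by_name(names, columns):
    """Naive rebuild: key each column by its level name.

    Hypothetical helper. With duplicate names, later columns
    overwrite earlier ones, which silently drops data.
    """
    return dict(zip(names, columns))


def rebuild_by_position(names, columns):
    """Rebuild with unique positional keys, then carry names separately.

    Hypothetical helper. Positions never collide, so every column
    survives; the (possibly duplicate) names are reattached afterwards.
    """
    data = dict(enumerate(columns))
    return data, list(names)


names = ["a", "a"]                 # duplicate level names
columns = [[1, 2, 3], [4, 5, 6]]   # one column per level

naive = rebuild_by_name(names, columns)
data, restored_names = rebuild_by_position(names, columns)

print(naive)                 # only one column survives
print(data, restored_names)  # both columns and both names survive
```

Here the name-keyed rebuild keeps a single column under "a", while the position-keyed rebuild keeps both columns and both names.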

@galipremsagar added the labels bug, 3 - Ready for Review, Python, 4 - Needs cuDF (Python) Reviewer, and non-breaking on Sep 20, 2021
@galipremsagar self-assigned this on Sep 20, 2021
@galipremsagar requested a review from a team as a code owner on September 20, 2021 20:34
@galipremsagar changed the title from "[REVIEW] Fix duplicate names issues in MultiIndex.deserialize" to "[REVIEW] Fix duplicate names issue in MultiIndex.deserialize" on Sep 20, 2021
Contributor

@vyasr left a comment

Fix looks good, one minor suggested improvement to the tests.

python/cudf/cudf/tests/test_multiindex.py (outdated, resolved)
@galipremsagar added the label 5 - Ready to Merge and removed the labels 3 - Ready for Review and 4 - Needs cuDF (Python) Reviewer on Sep 20, 2021
codecov bot commented Sep 20, 2021

Codecov Report

Merging #9258 (d138cf5) into branch-21.10 (3ee3ecf) will decrease coverage by 0.00%.
The diff coverage is 0.00%.

❗ Current head d138cf5 differs from pull request most recent head ce5f6a7. Consider uploading reports for the commit ce5f6a7 to get more accurate results

@@               Coverage Diff                @@
##           branch-21.10    #9258      +/-   ##
================================================
- Coverage         10.85%   10.85%   -0.01%     
================================================
  Files               115      116       +1     
  Lines             19158    19168      +10     
================================================
  Hits               2080     2080              
- Misses            17078    17088      +10     
Impacted Files                                Coverage Δ
python/cudf/cudf/__init__.py                  0.00% <ø> (ø)
python/cudf/cudf/_lib/__init__.py             0.00% <ø> (ø)
python/cudf/cudf/core/multiindex.py           0.00% <0.00%> (ø)
python/cudf/cudf/io/__init__.py               0.00% <0.00%> (ø)
python/cudf/cudf/io/text.py                   0.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py             0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/fuzzer.py      0.00% <0.00%> (ø)
python/cudf/cudf/utils/hash_vocab_utils.py    0.00% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4defd25...ce5f6a7.

@galipremsagar
Contributor Author

@gpucibot merge

@rapids-bot (bot) merged commit 1fdd62f into rapidsai:branch-21.10 on Sep 20, 2021
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge)
bug (Something isn't working)
non-breaking (Non-breaking change)
Python (Affects Python cuDF API.)
Development

Successfully merging this pull request may close these issues.

[BUG] Deserialization of cudf.MultiIndex corrupts data when duplicate names are present
2 participants