Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix double counting of selected columns in CSV reader #8508

Merged
merged 8 commits into from
Jul 6, 2021

Conversation

ochan1
Copy link
Contributor

@ochan1 ochan1 commented Jun 12, 2021

Fix a bug where the program can Segmentation Fault as a result of selecting the same column twice using index + column_name OR the column_name is typed twice

"num_active_cols_" will double count when a particular column is selected twice by the user in an index + column_name situation and when the column_name is listed twice
@ochan1 ochan1 requested a review from a team as a code owner June 12, 2021 22:58
@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

1 similar comment
@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jun 12, 2021
@rgsl888prabhu
Copy link
Contributor

@ochan1, awesome find, it was great that you pushed a PR for this. The code looks good, it would be better if there is a test case to verify. Can you please add a test case for this here

@rgsl888prabhu rgsl888prabhu added 3 - Ready for Review Ready for review by team 4 - Needs cuIO Reviewer cuIO cuIO issue non-breaking Non-breaking change bug Something isn't working labels Jun 13, 2021
@harrism harrism changed the title Double Count of Selected Columns Fix double counting of selected columns Jun 15, 2021
@harrism harrism changed the title Fix double counting of selected columns Fix double counting of selected columns in CSV reader Jun 15, 2021
@karthikeyann karthikeyann self-assigned this Jul 2, 2021
Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. unit tests.
Fixed similar issue in duplicate select indices (causes seg fault)

@karthikeyann
Copy link
Contributor

rerun tests

@karthikeyann
Copy link
Contributor

rerun tests

@jjacobelli
Copy link
Contributor

ok to test

@jjacobelli
Copy link
Contributor

rerun tests

@karthikeyann karthikeyann self-requested a review July 5, 2021 08:30
@karthikeyann
Copy link
Contributor

ok to test

@karthikeyann
Copy link
Contributor

@ochan1 could you comment "ok to test" and "rerun tests" in separate comments?

@rgsl888prabhu
Copy link
Contributor

ok to test

@rgsl888prabhu
Copy link
Contributor

rerun tests

@ajschmidt8
Copy link
Member

ok to test

@codecov
Copy link

codecov bot commented Jul 6, 2021

Codecov Report

Merging #8508 (8f39c6a) into branch-21.08 (fba09e6) will increase coverage by 0.01%.
The diff coverage is n/a.

❗ Current head 8f39c6a differs from pull request most recent head 85c8160. Consider uploading reports for the commit 85c8160 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.08    #8508      +/-   ##
================================================
+ Coverage         10.60%   10.61%   +0.01%     
================================================
  Files               109      109              
  Lines             18280    18645     +365     
================================================
+ Hits               1938     1980      +42     
- Misses            16342    16665     +323     
Impacted Files Coverage Δ
python/cudf/cudf/io/hdf.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py 0.00% <0.00%> (ø)
python/cudf/cudf/_version.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/abc.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/dlpack.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/feather.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 0.00% <0.00%> (ø)
... and 44 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fba09e6...85c8160. Read the comment docs.

@karthikeyann
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit c1bdbfa into rapidsai:branch-21.08 Jul 6, 2021
@ochan1 ochan1 deleted the column_select_double_count branch July 6, 2021 20:22
@vyasr vyasr added 4 - Needs Review Waiting for reviewer to review or respond and removed 4 - Needs cuIO Reviewer labels Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond bug Something isn't working cuIO cuIO issue libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants