Memory map the input file only when GDS compatibility mode is not used #7717

Merged 9 commits on Mar 29, 2021

@vuule vuule commented Mar 25, 2021

mmap is expensive on some systems and we can expect better performance with file reads when GDS is used, especially with compatibility mode.
This PR adds a source type that does not use mmap for host reads. This type is used when GDS and its compatibility mode are enabled.
file_source is now a base class for file-based input and only implements the device_read functions.
memory_mapped_source class implements the host reads through the memory mapped file.
direct_read_source is a newly implemented class that uses read for host reads instead of mmap.
Selection is done in datasource::create based on cufile_config.
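
The split described above can be illustrated with a minimal sketch. This is not the libcudf code: every name ending in `_sketch`, the `make_source` factory, and the boolean flag standing in for the `cufile_config` check are hypothetical, the device_read path is omitted entirely, and `pread` is used here as a convenient variant of `read`. It only shows the general shape of one file-backed base class with two interchangeable host-read strategies.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the file-backed base class: owns the descriptor
// and the file size, and leaves the host read strategy to derived classes.
class file_source_sketch {
 public:
  explicit file_source_sketch(const std::string& path)
    : fd_(::open(path.c_str(), O_RDONLY)) {
    if (fd_ < 0) throw std::runtime_error("cannot open " + path);
    struct stat st {};
    if (::fstat(fd_, &st) != 0) {
      ::close(fd_);
      throw std::runtime_error("cannot stat " + path);
    }
    size_ = static_cast<std::size_t>(st.st_size);
  }
  virtual ~file_source_sketch() { ::close(fd_); }
  file_source_sketch(const file_source_sketch&) = delete;
  file_source_sketch& operator=(const file_source_sketch&) = delete;

  std::size_t size() const { return size_; }
  // Copies up to `len` bytes starting at `offset` into `dst`; returns bytes copied.
  virtual std::size_t host_read(std::size_t offset, std::size_t len, char* dst) = 0;

 protected:
  int fd_;
  std::size_t size_ = 0;
};

// Host reads served by copying out of a read-only memory mapping.
class mmap_source_sketch : public file_source_sketch {
 public:
  explicit mmap_source_sketch(const std::string& path) : file_source_sketch(path) {
    if (size_ == 0) return;  // mmap rejects zero-length mappings
    void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd_, 0);
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
    map_ = static_cast<const char*>(p);
  }
  ~mmap_source_sketch() override {
    if (map_ != nullptr) ::munmap(const_cast<char*>(map_), size_);
  }
  std::size_t host_read(std::size_t offset, std::size_t len, char* dst) override {
    assert(dst != nullptr);
    if (offset >= size_) return 0;
    const std::size_t n = std::min(len, size_ - offset);
    std::memcpy(dst, map_ + offset, n);
    return n;
  }

 private:
  const char* map_ = nullptr;
};

// Host reads served by plain file reads; the file is never mapped.
class direct_read_source_sketch : public file_source_sketch {
 public:
  using file_source_sketch::file_source_sketch;
  std::size_t host_read(std::size_t offset, std::size_t len, char* dst) override {
    assert(dst != nullptr);
    if (offset >= size_) return 0;
    const std::size_t want = std::min(len, size_ - offset);
    std::size_t done = 0;
    while (done < want) {  // pread may return fewer bytes than requested
      const ssize_t r =
        ::pread(fd_, dst + done, want - done, static_cast<off_t>(offset + done));
      if (r < 0) throw std::runtime_error("pread failed");
      if (r == 0) break;  // unexpected end of file
      done += static_cast<std::size_t>(r);
    }
    return done;
  }
};

// Hypothetical factory: the flag stands in for the configuration check.
std::unique_ptr<file_source_sketch> make_source(const std::string& path,
                                                bool gds_compat_mode) {
  if (gds_compat_mode) return std::make_unique<direct_read_source_sketch>(path);
  return std::make_unique<mmap_source_sketch>(path);
}
```

Callers only see the base interface, so the choice between mapping and direct reads stays inside the factory and can follow configuration without touching reader code.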

@vuule added the libcudf, cuIO, improvement, and non-breaking labels Mar 25, 2021
@vuule self-assigned this Mar 25, 2021
codecov bot commented Mar 25, 2021

Codecov Report

Merging #7717 (b386930) into branch-0.19 (7871e7a) will increase coverage by 0.65%.
The diff coverage is n/a.

❗ Current head b386930 differs from pull request most recent head ca63a42. Consider uploading reports for the commit ca63a42 to get more accurate results

@@               Coverage Diff               @@
##           branch-0.19    #7717      +/-   ##
===============================================
+ Coverage        81.86%   82.52%   +0.65%     
===============================================
  Files              101      101              
  Lines            16884    17458     +574     
===============================================
+ Hits             13822    14407     +585     
+ Misses            3062     3051      -11     
| Impacted Files | Coverage Δ |
| --- | --- |
| python/cudf/cudf/utils/gpu_utils.py | 53.65% <0.00%> (-4.88%) ⬇️ |
| python/cudf/cudf/core/column/lists.py | 87.68% <0.00%> (-3.72%) ⬇️ |
| python/cudf/cudf/core/column/decimal.py | 92.95% <0.00%> (-1.92%) ⬇️ |
| python/cudf/cudf/core/abc.py | 87.23% <0.00%> (-1.14%) ⬇️ |
| python/cudf/cudf/core/column/numerical.py | 94.83% <0.00%> (-0.20%) ⬇️ |
| python/cudf/cudf/core/column/column.py | 87.61% <0.00%> (-0.15%) ⬇️ |
| python/cudf/cudf/utils/utils.py | 85.36% <0.00%> (-0.07%) ⬇️ |
| python/cudf/cudf/io/feather.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/utils/ioutils.py | 78.71% <0.00%> (ø) |
| python/cudf/cudf/comm/serialize.py | 0.00% <0.00%> (ø) |

... and 45 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cddafd9...ca63a42.

@vuule changed the title from "Input file is not mapped into memory when GDS compatibility mode is enabled" to "Memory map the input file only when GDS compatibility mode is not used" Mar 25, 2021
@vuule marked this pull request as ready for review March 26, 2021 01:39
@vuule requested a review from a team as a code owner March 26, 2021 01:39
@vuule requested a review from devavret March 29, 2021 19:17
vuule commented Mar 29, 2021

@gpucibot merge

@rapids-bot merged commit 4dd75c4 into rapidsai:branch-0.19 Mar 29, 2021
@vuule deleted the fea-avoid-mmap-gds-compat branch March 30, 2021 00:51