Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multibyte_split #8702
multibyte_split #8702
Changes from 41 commits
e1b71e6
836773a
3e06c18
ac14dbd
ea8cee2
a4a8dd0
094d2d2
1117ab8
51b1444
d1f7eb3
a628d73
e1cc84d
c7177bc
42dc014
5171711
6b62ceb
a2c9756
21b8b25
f59a93e
5fa112a
cf42fd0
e6e9741
b5c2e05
738af48
0121b22
a233ca2
4946058
d69aeca
079d1ea
970aac2
fee7ebb
e5a5204
216d620
65af4de
a4fe128
9bc6c89
08b3069
7088791
b61c14f
017f05d
f432e68
59a70a9
f1d3b4a
51ac35c
3d04556
1fb36ee
ecf440a
9e34efb
fc014e5
896ed31
7792521
2f75b50
162e9cf
ade1150
8e08012
5ad2148
511ab9f
69280e8
2d37dc9
9c6bf2a
ee817b1
4cdbee5
c3783db
432399c
ad21c4f
18e0863
45e5b65
ee122a8
c9d2889
ca6bbac
d68d951
9684646
d3de062
d392140
42b8c88
b976525
d50f815
40d81e8
3171339
63c4bb0
eda265b
05cdecf
a4d4d79
cef897d
097cadd
89ce0aa
d2735dd
5a1e4d6
c15b5d2
615534d
bd67026
a61fd09
b0d4135
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a parameter dimension that specifies the ratio of delimiters per character (e.g.,
1:8
,1:16
,1:32
) and test performance as a function of it?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. benchmark is not up to snuff. if there are a significant amount of delimiters, perf could be lower, but only because it would influence the number of offsets that need to be written. otherwise there's not a lot of divergence in the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to talk to @davidwendt about how to generate data for this benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally taking ownership of the
istream
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it just be like this then?
There's no reason to mandate how the lifetime of the
istream
object is managed, right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can silently break in at least two scenarios. First, if
find_or_create_data
is called from two threads with their per-thread default stream. In both cases thecudaStream_t
's value will be 0, but they refer to different streams.Second, if a stream is created, used in
find_or_create_data
, destroyed, and a new stream is created and reuses the same stream ID, then once again you'll have a handle to the wrong stream.Any time you're trying to use a stream ID as a key you run into these problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this is an important thing to get right, especially if we decide to re-use it in other places in cuio. Right now the scope of this cache is limited to a single reading of the file, so the only invocations of this function will always see non-default streams from a stream pool, and once the work is done both the stream pool and cache will be destroyed.
How important is it to get this code reusable in this PR vs get it right the next time we try to use it? Would it be better to document the caveats and make a more general solution later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see these being a problem in the current use case, and won't have a better solution until there are more use cases. Ok to resolve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can get rid of the thread-id based lookup and use a concurrent collection to store unused tickets. if no tickets are available, one is created (up to a certain number). when a ticket is no longer in use, it can be returned to the collection. this way it doesn’t matter what stream is used to read the chunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is impractical to address differently unless we change
rmm
's sync-and-steal behavior or usecudaStreamAddCallback
to indicate when a buffer is ready for re-use. :(There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be resolved by mapping the user specified stream to an internal side stream and synchronizing the user specified stream with the internal stream. That way you know the lifetime of the internal streams.