Experimental Explicit Stream Annotation #17982
Closed. chaserileyroberts wants to merge 1 commit into openxla:main from chaserileyroberts:chase/stream_annotation
chaserileyroberts force-pushed the chase/stream_annotation branch 3 times, most recently from fa650cd to 018ed51 (October 15, 2024 01:08)
chaserileyroberts force-pushed the chase/stream_annotation branch from 018ed51 to d64fa07 (October 15, 2024 01:13)
chaserileyroberts changed the title from "[DRAFT] Stream Annotation Prototype" to "Experimental Explicit Stream Annotation" (Oct 15, 2024)
Can you split the PR into:
Yes I will do that.
copybara-service bot pushed a commit that referenced this pull request on Oct 29, 2024:
Imported from GitHub PR #18448
First part of splitting up #17982
Copybara import of the project:
-- 8ecc06c by chaser <[email protected]>: Don't inline stream annotated kCalls
Merging this change closes #18448
FUTURE_COPYBARA_INTEGRATE_REVIEW=#18448 from chaserileyroberts:chase/stream_call_noinline 8ecc06c
PiperOrigin-RevId: 691070675
copybara-service bot pushed a commit that referenced this pull request on Nov 8, 2024:
Imported from GitHub PR #18448
First part of splitting up #17982
Copybara import of the project:
-- 8ecc06c by chaser <[email protected]>: Don't inline stream annotated kCalls
Merging this change closes #18448
COPYBARA_INTEGRATE_REVIEW=#18448 from chaserileyroberts:chase/stream_call_noinline 8ecc06c
PiperOrigin-RevId: 694522066
This is the first PR intended to support explicit stream annotations for GPU runtimes.
Why do we want/need this?
There are a few optimizations that are possible with existing hardware but difficult to generate from XLA. By allowing explicit annotation of which stream a subcomputation should run on, we can let users define their own stream assignment strategies.
- Certain parallelization configurations would allow overlapping all-gather and all-reduce operations on independent networking hardware (for example, one operation running on NVLink and the other exclusively on IB). Right now there is no way to do this explicitly in JAX.
- Certain kernels do not utilize all of the SMs for the full duration of their computation, leaving some idle. Other independent compute kernels could use these SMs for better end-to-end performance. We already do this for some of our collective matmul implementations, but there is currently no way to do this explicitly in JAX.
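To make the motivation concrete, here is a toy model in plain Python of why stream assignment matters. It is only a sketch under stated assumptions: it is not XLA's scheduler and not a JAX API, and the `Op` class, the `makespan` helper, the op names, and the durations are all hypothetical. It assumes each stream runs its operations serially in issue order and that an operation starts once its dependencies have finished and its stream is idle.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    duration: float  # hypothetical cost in arbitrary time units
    stream: int      # explicit stream annotation for this op
    deps: list = field(default_factory=list)  # names of ops that must finish first


def makespan(ops):
    """Total time when every stream executes its ops serially in issue order."""
    stream_free = {}  # stream id -> time at which the stream becomes idle
    finish = {}       # op name -> finish time
    for op in ops:    # ops are assumed to be listed in a valid dependency order
        ready = max((finish[d] for d in op.deps), default=0.0)
        start = max(ready, stream_free.get(op.stream, 0.0))
        finish[op.name] = start + op.duration
        stream_free[op.stream] = finish[op.name]
    return max(finish.values())


def program(all_reduce_stream):
    # Two independent collectives followed by a consumer of both results.
    return [
        Op("all-gather", 4.0, stream=0),
        Op("all-reduce", 3.0, stream=all_reduce_stream),
        Op("consume", 1.0, stream=0, deps=["all-gather", "all-reduce"]),
    ]


single = makespan(program(all_reduce_stream=0))      # everything on one stream
overlapped = makespan(program(all_reduce_stream=1))  # all-reduce on its own stream
print(single, overlapped)
```

In this model the single-stream schedule serializes the two collectives, while placing the all-reduce on a second stream lets it run alongside the all-gather. Whether real hardware achieves that overlap depends on the collectives actually using independent resources, as the first bullet describes.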
What are the code changes?
xla_gpu_experimental_stream_annotation
CallInliner
.operation_queue_id
and add wrapped async operations
Current known limitations
Add will not correctly lower to a fusion.