Improve optimization of overlay methods #52940
maleadt added the gpu (Affects running Julia on a GPU) and compiler:effects (effect analysis) labels on Jan 17, 2024.
I will work on this issue together with #51080 again this week.

I prefer testing #54322 with the GPUCompiler use case first.
aviatesk added a commit that referenced this issue on Jun 18, 2024:

This PR serves to replace #51080 and close #52940. It extends the `:nonoverlayed` effect bit to a `UInt8` and introduces the `CONSISTENT_OVERLAY` value, which allows calls to overlay methods to be concrete-evaluated using their original non-overlayed counterparts. The new value is set through the newly added `Base.Experimental.@consistent_overlay mt def` macro. `@consistent_overlay` is similar to `@overlay`, but it sets the `:nonoverlayed` bit to `CONSISTENT_OVERLAY` for the target method definition, allowing the method to be concrete-evaluated. To make this feature safe to use, I have also added fairly precise documentation to `@consistent_overlay`.
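As a rough illustration of the macro described in the commit message, the following sketch declares an overlay that is promised to be consistent with the original method. `ConsistentDemoMT` and `demo_device_sin` are hypothetical names introduced only for this example, and the `@static` guard is an assumption made so that the snippet also loads on Julia versions that predate `@consistent_overlay`.

```julia
# Hypothetical overlay method table; a GPU package would define its own.
Base.Experimental.@MethodTable ConsistentDemoMT

# Hypothetical stand-in for a device-only intrinsic; it is never meant to run on the host.
demo_device_sin(x::Float64) = error("device-only placeholder")

@static if isdefined(Base.Experimental, Symbol("@consistent_overlay"))
    # The author promises that this overlay returns the same results as Base.sin,
    # so the compiler may fold constant calls using the original method.
    Base.Experimental.@consistent_overlay ConsistentDemoMT Base.sin(x::Float64) = demo_device_sin(x)
else
    # Older Julia versions: a plain overlay, which disables concrete evaluation.
    Base.Experimental.@overlay ConsistentDemoMT Base.sin(x::Float64) = demo_device_sin(x)
end

# Host code does not consult the overlay table and keeps calling Base.sin.
println(sin(0.0) == 0.0)
```

The overlay only takes effect for a compiler that is configured to look up methods in `ConsistentDemoMT`, which this sketch does not set up.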
KristofferC pushed a commit that referenced this issue on Jun 18, 2024 (same commit message as above).
Opening an issue to track JuliaGPU/GPUCompiler.jl#384, and discuss how we can get something like #51080 merged.
Context: The GPU stack is a heavy user of overlay methods, to make functionality GPU-compatible or otherwise provide a GPU-specific implementation. One area where we need such overlays is the outlined `throw_XXX` methods that throw objects requiring allocations. For example, `InexactError` contains untyped fields and as such is currently GPU incompatible, so we overlay `Core.throw_inexacterror` with a simplified version that only throws a message.

Most of the time, overlay methods are unsafe to execute on the host, e.g., because they use GPU-specific functionality. AFAIU, that's why concrete evaluation of them is prohibited. However, because we overlay very common core functionality, this prevents a lot of code from being optimized and frequently results in GPU-incompatible code being generated.
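To make the overlay mechanism described above concrete, here is a minimal sketch of declaring such an overlay without the GPU stack. `DemoMT` is a hypothetical table name and the method body is a placeholder; real GPU packages wrap this in their own macros and use device-specific error reporting.

```julia
# Hypothetical overlay method table; GPU packages define their own.
Base.Experimental.@MethodTable DemoMT

# Overlay the outlined throw helper with a version that only throws a message.
# Placeholder body: a real device overlay would use GPU-specific error reporting.
Base.Experimental.@overlay DemoMT function Core.throw_inexacterror(f::Symbol, ::Type{T}, val) where {T}
    throw("Inexact conversion")
end

# Host dispatch does not consult DemoMT, so ordinary code still sees InexactError.
host_error = try
    Int(1.5)
catch err
    err
end
println(host_error isa InexactError)
```

The overlay is only visible to a compiler that performs method lookup through `DemoMT`; because any call that reaches it is then excluded from concrete evaluation, overlaying such a common helper affects a large amount of code.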
An example that resulted from replacing `@pure` with effects modeling (#44776):

Or, for a MWE without the GPU stack:
Another example is #48097, which we "fixed" by avoiding the calls to `Core.throw_inexacterror` in #48116. That kind of solution obviously doesn't scale.

To properly solve this, we probably have to define precise semantics of method overlays, and how they affect optimization.
For example, we could offer the following possibilities:

- `:taint` (the default, and current behavior): concrete evaluation of a call is disabled if it calls this overlay method
- `:equivalent`: the overlay method is functionally equivalent to the original method, so the compiler can use information from (i.e., concretely evaluate) the original method to optimize the call
- `:executable`: the overlay method is safe to execute on the host, so concrete evaluation can use it directly

`:equivalent` semantics are required for most GPU overlays (e.g., when replacing openlibm functions with NVIDIA's GPU-only math library), but are slightly dangerous, as I can imagine it could be tricky to guarantee that the overlay is actually functionally identical. That's why, when possible, I would think the `:executable` semantic to be a better option.

Note that I'm writing the above from the perspective of the GPUCompiler.jl use case, without much experience with the optimizer/irinterp/effects, so I'm probably missing some important details.
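The three proposed semantics can be summarized as a small decision table. The sketch below is only a model of the proposal, not compiler code: the enum, its member names, and `concrete_eval_target` are all hypothetical and exist nowhere in Julia.

```julia
# Hypothetical model of the three proposed overlay semantics.
@enum OverlaySemantics OVERLAY_TAINT OVERLAY_EQUIVALENT OVERLAY_EXECUTABLE

# Which method, if any, could the compiler run on the host to fold a call
# that dispatches to an overlay with the given semantics?
function concrete_eval_target(sem::OverlaySemantics)
    if sem == OVERLAY_TAINT
        return nothing      # concrete evaluation is disabled
    elseif sem == OVERLAY_EQUIVALENT
        return :original    # fold using the non-overlayed method
    else
        return :overlay     # OVERLAY_EXECUTABLE: fold using the overlay itself
    end
end

for sem in instances(OverlaySemantics)
    println(sem, " => ", concrete_eval_target(sem))
end
```

Under this model, `:equivalent` shifts the correctness burden to the overlay author (the two methods must agree), while `:executable` only requires that the overlay can run on the host.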
#51080 by @aviatesk implements something similar to this, basically making it possible to mark overlay methods as non-overlay, but as @Keno mentions there we probably need to be slightly more precise.
Tentatively putting this on the milestone, as we're running into this more often now that the optimizer is relying on effects more.