Unsupported operation when using CUDA #713

Open
salbert83 opened this issue Oct 13, 2024 · 0 comments
salbert83 commented Oct 13, 2024

I previously raised this issue at FluxML/Zygote.jl#1532, but was advised that it would be more appropriate here.

My environment is listed below. I have seen the same issue on Linux machines.
Julia Version 1.11.0
Commit 501a4f25c2 (2024-10-07 11:40 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 8 × Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, icelake-client)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Status C:\Users\salbe\OneDrive\Documents\Research\JuliaBugs\Project.toml
[052768ef] CUDA v5.5.2
[7073ff75] IJulia v1.25.0
[e88e6eb3] Zygote v0.6.71

The example:

using CUDA, Zygote, LinearAlgebra  # [edit -- added using + code block]

f₁(x) = sum(abs2, exp.(log.(x) .* (1:length(x))))
f₂(x) = sum(abs2, x.^(1:length(x)))
x = randn(ComplexF64, 5);
z = CuArray{ComplexF64}(x);

# **Check the gradient calculations are consistent between the 2 functions**
test₁ = Zygote.gradient(f₁, x)[1]
test₂ = Zygote.gradient(f₂, x)[1]
norm(test₁ - test₂) / norm(test₁)
# **Output:**  2.2530284453414604e-16 <-- This is reasonable

# **Check the calculation using CUDA**
test₃ = Zygote.gradient(f₁, z)[1];
norm(test₁ - Array(test₃))/ norm(test₁)
# **Output:** 2.0454901873585542e-16

# **However, using f₂ generates an exception**
test₄ = Zygote.gradient(f₂, z)

Output:

InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#34#36")(::CUDA.CuKernelContext, ::CuDeviceVector{GPUArrays.BrokenBroadcast{Union{}}, 1}, ::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, Zygote.var"#1409#1410"{typeof(^)}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{ComplexF64, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{UnitRange{Int64}, Tuple{Bool}, Tuple{Int64}}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to ≺(a, b) @ ForwardDiff [C:\Users\salbe\.julia\packages\ForwardDiff\PcZ48\src\dual.jl:54](file:///C:/Users/salbe/.julia/packages/ForwardDiff/PcZ48/src/dual.jl#line=53))
Stacktrace:
 [1] promote_rule
   @ [C:\Users\salbe\.julia\packages\ForwardDiff\PcZ48\src\dual.jl:407](file:///C:/Users/salbe/.julia/packages/ForwardDiff/PcZ48/src/dual.jl#line=406)
 [2] promote_type
   @ [.\promotion.jl:318](http://localhost:8888/promotion.jl#line=317)
 [3] ^
   @ [.\complex.jl:886](http://localhost:8888/complex.jl#line=885)
 [4] #1409
   @ [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\lib\broadcast.jl:276](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/lib/broadcast.jl#line=275)
 [5] _broadcast_getindex_evalf
   @ [.\broadcast.jl:673](http://localhost:8888/broadcast.jl#line=672)
 [6] _broadcast_getindex
   @ [.\broadcast.jl:646](http://localhost:8888/broadcast.jl#line=645)
 [7] getindex
   @ [.\broadcast.jl:605](http://localhost:8888/broadcast.jl#line=604)
 [8] #34
   @ [C:\Users\salbe\.julia\packages\GPUArrays\qt4ax\src\host\broadcast.jl:59](file:///C:/Users/salbe/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl#line=58)
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl

Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\validation.jl:147](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/validation.jl#line=146)
  [2] macro expansion
    @ [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:382](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=381) [inlined]
  [3] macro expansion
    @ [C:\Users\salbe\.julia\packages\TimerOutputs\NRdsv\src\TimerOutput.jl:253](file:///C:/Users/salbe/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl#line=252) [inlined]
  [4] macro expansion
    @ [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:381](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=380) [inlined]
  [5] emit_llvm(job::GPUCompiler.CompilerJob; toplevel::Bool, libraries::Bool, optimize::Bool, cleanup::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\utils.jl:108](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/utils.jl#line=107)
  [6] emit_llvm
    @ [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\utils.jl:106](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/utils.jl#line=105) [inlined]
  [7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; toplevel::Bool, libraries::Bool, optimize::Bool, cleanup::Bool, validate::Bool, strip::Bool, only_entry::Bool, parent_job::Nothing)
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:100](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=99)
  [8] codegen
    @ [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:82](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=81) [inlined]
  [9] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:79](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=78)
 [10] compile
    @ [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:74](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=73) [inlined]
 [11] #1145
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\compilation.jl:250](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl#line=249) [inlined]
 [12] JuliaContext(f::CUDA.var"#1145#1148"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:34](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=33)
 [13] JuliaContext(f::Function)
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:25](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/driver.jl#line=24)
 [14] compile(job::GPUCompiler.CompilerJob)
    @ CUDA [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\compilation.jl:249](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl#line=248)
 [15] actual_compilation(cache::Dict{Any, CuFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\execution.jl:237](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/execution.jl#line=236)
 [16] cached_compilation(cache::Dict{Any, CuFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler [C:\Users\salbe\.julia\packages\GPUCompiler\2CW9L\src\execution.jl:151](file:///C:/Users/salbe/.julia/packages/GPUCompiler/2CW9L/src/execution.jl#line=150)
 [17] macro expansion
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:380](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl#line=379) [inlined]
 [18] macro expansion
    @ [.\lock.jl:273](http://localhost:8888/lock.jl#line=272) [inlined]
 [19] cufunction(f::GPUArrays.var"#34#36", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceVector{GPUArrays.BrokenBroadcast{Union{}}, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, Zygote.var"#1409#1410"{typeof(^)}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{ComplexF64, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{UnitRange{Int64}, Tuple{Bool}, Tuple{Int64}}}}, Int64}}; kwargs::@Kwargs{})
    @ CUDA [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:375](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl#line=374)
 [20] cufunction
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:372](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl#line=371) [inlined]
 [21] macro expansion
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:112](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl#line=111) [inlined]
 [22] #launch_heuristic#1200
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\gpuarrays.jl:17](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/gpuarrays.jl#line=16) [inlined]
 [23] launch_heuristic
    @ [C:\Users\salbe\.julia\packages\CUDA\2kjXI\src\gpuarrays.jl:15](file:///C:/Users/salbe/.julia/packages/CUDA/2kjXI/src/gpuarrays.jl#line=14) [inlined]
 [24] _copyto!
    @ [C:\Users\salbe\.julia\packages\GPUArrays\qt4ax\src\host\broadcast.jl:78](file:///C:/Users/salbe/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl#line=77) [inlined]
 [25] copyto!
    @ [C:\Users\salbe\.julia\packages\GPUArrays\qt4ax\src\host\broadcast.jl:44](file:///C:/Users/salbe/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl#line=43) [inlined]
 [26] copy
    @ [C:\Users\salbe\.julia\packages\GPUArrays\qt4ax\src\host\broadcast.jl:29](file:///C:/Users/salbe/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl#line=28) [inlined]
 [27] materialize(bc::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Nothing, Zygote.var"#1409#1410"{typeof(^)}, Tuple{CuArray{ComplexF64, 1, CUDA.DeviceMemory}, UnitRange{Int64}}})
    @ Base.Broadcast [.\broadcast.jl:867](http://localhost:8888/broadcast.jl#line=866)
 [28] broadcast_forward(::Function, ::CuArray{ComplexF64, 1, CUDA.DeviceMemory}, ::UnitRange{Int64})
    @ Zygote [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\lib\broadcast.jl:282](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/lib/broadcast.jl#line=281)
 [29] adjoint
    @ [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\lib\broadcast.jl:361](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/lib/broadcast.jl#line=360) [inlined]
 [30] _pullback(::Zygote.Context{false}, ::typeof(Base.Broadcast.broadcasted), ::CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, ::Function, ::CuArray{ComplexF64, 1, CUDA.DeviceMemory}, ::UnitRange{Int64})
    @ Zygote [C:\Users\salbe\.julia\packages\ZygoteRules\M4xmc\src\adjoint.jl:67](file:///C:/Users/salbe/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl#line=66)
 [31] _apply(::Function, ::Vararg{Any})
    @ Core [.\boot.jl:946](http://localhost:8888/boot.jl#line=945)
 [32] adjoint
    @ [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\lib\lib.jl:203](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/lib/lib.jl#line=202) [inlined]
 [33] _pullback
    @ [C:\Users\salbe\.julia\packages\ZygoteRules\M4xmc\src\adjoint.jl:67](file:///C:/Users/salbe/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl#line=66) [inlined]
 [34] broadcasted
    @ [.\broadcast.jl:1326](http://localhost:8888/broadcast.jl#line=1325) [inlined]
 [35] f₂
    @ [.\In](http://localhost:8888/In)[3]:2 [inlined]
 [36] _pullback(ctx::Zygote.Context{false}, f::typeof(f₂), args::CuArray{ComplexF64, 1, CUDA.DeviceMemory})
    @ Zygote [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\compiler\interface2.jl:0](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/compiler/interface2.jl#line=-1)
 [37] pullback(f::Function, cx::Zygote.Context{false}, args::CuArray{ComplexF64, 1, CUDA.DeviceMemory})
    @ Zygote [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\compiler\interface.jl:90](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/compiler/interface.jl#line=89)
 [38] pullback
    @ [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\compiler\interface.jl:88](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/compiler/interface.jl#line=87) [inlined]
 [39] gradient(f::Function, args::CuArray{ComplexF64, 1, CUDA.DeviceMemory})
    @ Zygote [C:\Users\salbe\.julia\packages\Zygote\Tt5Gx\src\compiler\interface.jl:147](file:///C:/Users/salbe/.julia/packages/Zygote/Tt5Gx/src/compiler/interface.jl#line=146)
 [40] top-level scope
    @ In[7]:1