-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gather throw error on cpu and gpu array inputs #52
base: master
Are you sure you want to change the base?
Conversation
3b4e280
to
72356fd
Compare
@DhairyaLGandhi would you please review this? |
dstsize = (size(src)[1:Nsrc-M]..., size(idx)...) | ||
dst = similar(src, Tsrc, dstsize) | ||
return NNlib.gather!(dst, src, idx) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need this method? The one in Nlib is not enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it from NNlib. The dispatching rule needs to distinguish (::AnyCuArray, ::AnyCuArray)
from (::AnyCuArray, ::AbstractArray)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see why we need this since we have (::AbstractArray, ::AbstractArray)
in NNlib
I'm not sure we should do this. It is a technically breaking change. Also CUDA.jl allows for cpu index, so maybe gather should do the same |
Good point. This might be over-engineered. If users care about the speed it's the standard practice to set using CUDA, NNlib
CUDA.allowscalar(false)
a = CUDA.rand(200,3000,64)
idx = cu(rand(1:64,5))
NNlib.gather(a, idx)
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar. |
I am also not sure we need it but maybe users expect some guidance to prevent from abuse. |
Closes FluxML/NNlib.jl#415