deprecate DepthwiseConv once we have groups in standard conv #1667
Comments
TensorFlow has both, and it seems like that would be a nice distinction to maintain. |
we could keep the layer/function but just as a wrapper around |
Why not add the required groups support to the dims object now that we have the infrastructure to do so? Wrapping it over |
I think the suggestion is that both |
Am I correct to think that |
You are correct, I think they could be tweaked to have the same code. And I was going to suggest it as a convenience constructor at first, but I think keeping it as its own layer might be nice for two reasons: a) future optimizations (minor) and b) seeing |
How does cuDNN handle the depthwise conv? |
It doesn't; one uses groups = channels_in in the standard convolution (the only one cuDNN exposes). |
Which is covered by FluxML/NNlibCUDA.jl#9, no additional machinery required :) My main issue with having a separate |
We could want a separate layer for printing, discoverability, and to avoid breakage. I would also add |
So if someone creates a |
I agree that it is not straightforward, but I am inclined to think that if someone creates a I think what's more likely is that if someone who doesn't know the equivalence calls |
Having a depthwise conv seems strictly better. People who don't know or don't want to know the groups setting can use |
My thought was to actually leverage the type system here. Since Julia is smart enough to print out aliases when they're defined, creating a This assumes the only reason to have a separate type over a constructor is for better display of course. If so, I'm of the opinion that big compound names generally hide an abstraction waiting to be let out. For example, if we add |
The value of having it isn't in the printing, it's in making code readable. Adding checks to return (or print) mismatched types would also needlessly complicate Flux, which I am strictly against. We already have overloads for We haven't heard complaints over its existence, and I doubt anyone is going to be mad either, especially if it's made GPU friendly. |
Are you referring to user code or library code? Because for user code, nothing changes. You call If this is about internal complexity, which I assume is what you mean by
Then I think it's worth looking at what changes would have to be made. This:

    struct Conv{N,M,F,A,V}
    ...
    end

Becomes

    struct Conv{InC,OutC,Groups,N,M,F,A,V} # type params left unabbreviated for clarity
    ...
    end
    const DepthwiseConv = Conv{InC,OutC,InC}

Since we don't presently have an easy way of querying the number of channels or groups from a Here are all of the potential hangups I can think of:
|
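For readers following the alias proposal in the comment above, here is a minimal, self-contained sketch of how such an alias could be expressed. It does not use Flux: `SketchConv` and `SketchDepthwiseConv` are hypothetical names invented for illustration, and the struct carries only a weight field. Note that a parametric alias has to declare its free parameters on the left-hand side.

```julia
# Hypothetical stand-in for a conv layer whose channel counts and group count
# live in the type parameters. This is NOT Flux's Conv.
struct SketchConv{InC,OutC,Groups,W}
    weight::W
end

# Parametric alias: the Groups parameter is tied to InC, so the alias matches
# exactly those layers where groups == channels_in.
const SketchDepthwiseConv{InC,OutC} = SketchConv{InC,OutC,InC}

# Convenience constructor (assumed signature, for illustration only).
# Int(...) normalises the values so that equal numbers give identical types.
function SketchConv(weight::AbstractArray, inc::Integer, outc::Integer;
                    groups::Integer = 1)
    return SketchConv{Int(inc),Int(outc),Int(groups),typeof(weight)}(weight)
end

depthwise = SketchConv(zeros(3, 1, 8), 4, 8; groups = 4)
regular   = SketchConv(zeros(3, 4, 8), 4, 8)

depthwise isa SketchDepthwiseConv   # true, because groups == channels_in
regular isa SketchDepthwiseConv     # false, because groups == 1 and channels_in == 4
```

With this shape, dispatching on `SketchDepthwiseConv` needs no extra runtime check, at the cost of a new type being specialised for every distinct channel and group combination.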
Also ref FluxML/NNlibCUDA.jl#22 (comment) |
DepthwiseConv is a grouped convolution with groups = channels_in
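The equivalence stated here can be illustrated with a naive one-dimensional sketch in plain Julia. Everything below is an assumption made for illustration: the function names `grouped_conv1d` and `depthwise_conv1d` are hypothetical, the array layout is (width, channels), and the loop computes a cross-correlation with stride 1 and no padding. It is not how Flux or NNlib implement convolution.

```julia
# Naive 1-D grouped cross-correlation (stride 1, no padding).
# x: (width, channels_in)
# w: (kernel_width, channels_in ÷ groups, channels_out)
function grouped_conv1d(x::AbstractMatrix, w::AbstractArray{<:Any,3};
                        groups::Integer = 1)
    width, cin = size(x)
    k, cin_per_group, cout = size(w)
    cin % groups == 0 || throw(ArgumentError("channels_in must be divisible by groups"))
    cout % groups == 0 || throw(ArgumentError("channels_out must be divisible by groups"))
    cin_per_group == cin ÷ groups || throw(ArgumentError("weight shape does not match groups"))
    cout_per_group = cout ÷ groups
    y = zeros(promote_type(eltype(x), eltype(w)), width - k + 1, cout)
    for o in 1:cout
        g = (o - 1) ÷ cout_per_group        # zero-based group of output channel o
        for c in 1:cin_per_group
            ci = g * cin_per_group + c      # input channel read by this output channel
            for i in 1:size(y, 1), j in 1:k
                y[i, o] += x[i + j - 1, ci] * w[j, c, o]
            end
        end
    end
    return y
end

# Depthwise convolution as the special case groups == channels_in:
# each output channel reads exactly one input channel.
depthwise_conv1d(x, w) = grouped_conv1d(x, w; groups = size(x, 2))
```

With `groups = 1` every output channel sums over all input channels; with `groups = channels_in` the channels never mix, which is the depthwise behaviour.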