In-place computation can break gradient computation #2015
Note: you may want to coordinate with #1979, which fixes some bugs in MVNLayer.
According to @mfigurnov, cuDNN max pooling is also a layer that requires its top data during backward.
It's probably worth adding a mechanism to each layer that says whether it (a) does in-place computation and (b) can support the next layer doing in-place computation. The net could then check that all of the layers are compatible at startup.
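A minimal sketch of what such a mechanism might look like. This is not actual Caffe API: the method names (`ComputesInPlace`, `AllowsInPlaceConsumer`) and the `CheckInPlaceCompatibility` startup check are hypothetical illustrations of the proposal above.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

struct Layer {
  virtual ~Layer() {}
  virtual std::string name() const = 0;
  // (a) whether this layer writes its top in place over its bottom
  virtual bool ComputesInPlace() const { return false; }
  // (b) whether a consumer may overwrite this layer's top in place
  //     (false for layers whose Backward reads their own top data)
  virtual bool AllowsInPlaceConsumer() const { return true; }
};

struct MVNLayer : Layer {
  std::string name() const override { return "mvn"; }
  // Per this issue, MVN's Backward reads its top data, so the top
  // must survive unmodified until the backward pass.
  bool AllowsInPlaceConsumer() const override { return false; }
};

struct ReLULayer : Layer {
  std::string name() const override { return "relu"; }
  bool ComputesInPlace() const override { return true; }
};

// Hypothetical startup check: reject any producer that needs its top
// preserved feeding a consumer that runs in place on that same blob.
void CheckInPlaceCompatibility(const std::vector<Layer*>& chain) {
  for (size_t i = 0; i + 1 < chain.size(); ++i) {
    if (chain[i + 1]->ComputesInPlace() &&
        !chain[i]->AllowsInPlaceConsumer()) {
      throw std::runtime_error(
          "layer '" + chain[i + 1]->name() + "' runs in place but '" +
          chain[i]->name() + "' needs its top data for Backward");
    }
  }
}

int main() {
  MVNLayer mvn;
  ReLULayer relu;
  std::vector<Layer*> chain = {&mvn, &relu};
  try {
    CheckInPlaceCompatibility(chain);  // throws: incompatible pair
  } catch (const std::exception& e) {
    std::fprintf(stderr, "startup check failed: %s\n", e.what());
  }
  return 0;
}
```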
Further thoughts from Sean Bell in #2853:
For instance, MVNLayer reads data from its top blob during the backward pass, under the assumption that this data is exactly the same as the output it created. If it's been modified by a later layer that does in-place computation, the gradient will be computed incorrectly.
In general, caffe should not rely on the user to know under what circumstances a layer can safely be computed in-place.
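A standalone illustration of that failure mode (plain C++ with no Caffe dependency; the mean-subtraction and ReLU arithmetic here are deliberately simplified stand-ins for the real layers):

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
  // "MVN-like" forward: subtract the mean, writing into a shared buffer.
  std::vector<float> buf = {1.0f, 2.0f, 3.0f, 6.0f};
  float mean = 0.0f;
  for (float v : buf) mean += v;
  mean /= buf.size();
  for (float& v : buf) v -= mean;  // buf now holds the MVN-style top data

  // A later layer runs in place (ReLU-style), overwriting the same buffer.
  for (float& v : buf) v = std::max(v, 0.0f);

  // If the first layer's backward now reads `buf` assuming it still holds
  // its own output, it sees stale data and computes a wrong gradient.
  // MVN output would be -2 -1 0 3; the buffer actually holds 0 0 0 3.
  for (float v : buf) std::printf("%g ", v);
  std::printf("\n");
  return 0;
}
```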