I just lost a couple of hours debugging because I forgot that softmax_over_spatial, which is what nn.softmax maps to, does something completely different from the old "softmax" layer: by default it does not compute the softmax over "F" (the feature axis) but over a different axis (defaulting to the time axis). This is really dangerous when you expect that you can use nn.softmax as an activation function.
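To make the difference concrete, here is a minimal NumPy sketch. It is not RETURNN or returnn_common code: the `softmax` helper, the `[batch, time, feature]` layout, and the shapes are assumptions chosen only for illustration. It shows that normalizing over the time axis gives a different result than normalizing over the feature axis.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis (illustrative helper)."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


rng = np.random.default_rng(0)
# Assumed layout for this sketch: [batch, time, feature]
x = rng.normal(size=(2, 5, 3))

# What one expects from an activation function: normalize over the feature axis.
over_feature = softmax(x, axis=2)
# What a time-axis default does instead: normalize over the time axis.
over_time = softmax(x, axis=1)

# Each variant sums to 1 only along the axis it was normalized over.
feature_sums = over_feature.sum(axis=2)  # shape (batch, time)
time_sums = over_time.sum(axis=1)        # shape (batch, feature)

# The two results have the same shape but different values.
same_result = np.allclose(over_feature, over_time)
```

Both outputs have the same shape as the input, so nothing fails when the wrong axis is used. The error only shows up in the values, which is why it is so easy to miss.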
I am not sure how best to solve this. I would say the behavior of "softmax_over_spatial" itself is okay (so no RETURNN changes are needed), but nn.softmax should definitely not default to that behavior.
Maybe this issue will already be resolved if nn.softmax requires an explicit dimension tag in the future, but if not, it needs to be fixed.