Fix MaskedComputationLayer for RETURNN principles #976
What actually would be the expected behavior for this inside a rec layer:
Well, the expected behavior is mostly clear for the case when it is inside a rec loop and not optimized. Currently, however, when this is optimized, the result would be wrong (or would not match this expected behavior) because […]. It would also still be wrong with all the new changes here, because the automatic unmasking as implemented here happens only in […].
The question is: what follow-up operations would we actually use in practice? In the case of a transducer, it is common to have a joint network on top. We should carefully check current setups. I think current setups might really expect the current behavior, and not the "expected" behavior.
So, we really expect that the two axes T and U are not made compatible here, and that both are kept instead.
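To make the two behaviors concrete, here is a minimal pure-Python sketch. It is not RETURNN code: the function names (`masked_step_by_step`, `masked_optimized`, `unmask`), the initial output of 0, and the additive "joint" are assumptions chosen only for illustration. It contrasts the per-frame loop (output keeps the full length T, holding the previous output on unmasked frames) with the optimized variant (output has the shorter length U), and shows a joint grid that keeps both axes.

```python
def masked_step_by_step(xs, mask, f, initial=0):
    """Per-frame loop: apply f only where mask is True, else hold the previous output."""
    out, prev = [], initial
    for x, m in zip(xs, mask):
        if m:
            prev = f(x)
        out.append(prev)
    return out  # same length T as the input


def masked_optimized(xs, mask, f):
    """Whole-sequence variant: select the masked frames first, then apply f."""
    return [f(x) for x, m in zip(xs, mask) if m]  # shorter length U


def unmask(ys, mask, initial=0):
    """Expand a length-U sequence back to length T by repeating the last masked output."""
    out, prev, it = [], initial, iter(ys)
    for m in mask:
        if m:
            prev = next(it)
        out.append(prev)
    return out


xs = [1, 2, 3, 4, 5]                       # length T = 5
mask = [False, True, False, True, True]    # U = 3 masked frames


def double(x):
    return 2 * x


looped = masked_step_by_step(xs, mask, double)   # length T
packed = masked_optimized(xs, mask, double)      # length U

# Hypothetical additive joint: keeps both axes, giving a T x U grid.
joint = [[t + u for u in packed] for t in xs]
```

In this sketch, `looped` and `packed` only agree after an explicit `unmask(packed, mask)`. A joint network on top would instead consume `packed` directly and keep T and U as separate axes.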
The implication from my previous post is basically:
It also means that this PR changes a bit. We don't need such automatic broadcasting logic in […]
So, you always have |
Fix #769. Specifically, this fixes the problem that the behavior when optimized out of a rec loop is inconsistent and opaque to the user.
The masked_from option is still an exception, though.