
Prev: in Rec #6

Closed
Atticus1806 opened this issue Jun 15, 2021 · 2 comments

Comments

@Atticus1806 (Contributor)

What would be the proper way to access the previous timestep for the Rec class? Is there logic for that already? Just using a function get_special_layer would not make sense, I think: since this is a crucial feature of a RecUnit, making it part of the class would be better.

@Atticus1806 Atticus1806 changed the title Prev: in RecUnit Prev: in Rec Jun 15, 2021
@albertz (Member)

albertz commented Jun 15, 2021

This is still missing, and it is a good question. I was thinking about it, but I'm not really sure yet what the best way is. It also has to be consistent with the naming logic (not only consistent — we have to ensure that it will always be correct).

Originally I was thinking about requiring all recurrent state as arguments to step, very analogous to the body function in tf.while_loop, where you explicitly specify the loop_vars: that is what body gets, and it is supposed to return the values for the next iteration.
But I don't like this too much. It is somewhat un-RETURNN-ic. Also, you could not have custom sub-networks with some internal hidden state, or it would be quite annoying to get that. And the step arguments also include the arguments from outside the loop (like the forward arguments), so these would be mixed together.
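The explicit-loop_vars design described above can be sketched in plain Python. This is only an illustration of the pattern (not RETURNN code); step, run_loop, and the trivial additive "cell" are made up for the example:

```python
# Sketch of the tf.while_loop-style design: all recurrent state is passed
# explicitly into step() and returned for the next iteration.
# (Hypothetical names; a stand-in additive cell replaces a real LSTM.)

def step(x_t, prev_h):
    # prev_h is the explicit loop state from the previous iteration;
    # the body must return the updated state alongside its output.
    h = prev_h + x_t
    return h, h  # (output, next state)

def run_loop(xs, h0=0):
    h = h0
    outputs = []
    for x_t in xs:
        y, h = step(x_t, h)
        outputs.append(y)
    return outputs, h

outputs, final_h = run_loop([1, 2, 3])
```

The downside the comment above points out: every piece of state must be threaded through step explicitly, so sub-networks cannot hide internal state, and loop state gets mixed with the ordinary forward arguments.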

I'm thinking about some explicit mechanism like:

```python
prev_x = prev("lstm")  # this would use this name, but you could use whatever you like
...
x = self.lstm(..., prev=prev_x)  # the prev arg would make sure that it uses the right name etc
...
```

But I'm not sure if there is some better way.

Also, I'm thinking about maybe requiring information about the shape, something like prev("lstm", shape=[dim]) or so. I'm not sure whether we will need that at some point; maybe not yet. Maybe this should first be thought out on the RETURNN side. The point would be to avoid the heuristic template construction.
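Purely as an illustration of how such a prev() helper with optional shape info could work internally (LoopContext, prev, and assign are invented names, not RETURNN API):

```python
# Hypothetical sketch: a loop context records values under a name so that
# the next iteration's prev(name) returns them, and optionally records a
# declared shape to avoid heuristic template construction.

class LoopContext:
    def __init__(self):
        self._prev = {}    # name -> value from the previous iteration
        self._shapes = {}  # name -> declared shape (if given)

    def prev(self, name, shape=None, initial=0):
        if shape is not None:
            self._shapes[name] = shape
        return self._prev.get(name, initial)

    def assign(self, name, value):
        # record the value so the next iteration's prev(name) sees it
        self._prev[name] = value
        return value

ctx = LoopContext()
ys = []
for x in [1, 2, 3]:
    prev_x = ctx.prev("lstm", shape=["dim"])  # "dim" is a placeholder shape
    ys.append(ctx.assign("lstm", prev_x + x))
```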

@albertz (Member)

albertz commented Oct 13, 2021

This is resolved now via Loop.state, as discussed in #16.
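As a rough plain-Python analogue of the idea (this is not the actual returnn-common Loop API, just the pattern): recurrent state lives as named attributes on a loop object, where reading an attribute yields the value from the previous iteration and assigning it sets the value for the next one.

```python
# Illustrative sketch only; Loop/State here are stand-ins, not returnn-common.

class State:
    pass

class Loop:
    def __init__(self):
        self.state = State()  # named recurrent state lives here

loop = Loop()
loop.state.h = 0  # initial value before the loop starts
ys = []
for x in [1, 2, 3]:
    # reading loop.state.h gives the previous value; assigning updates it
    loop.state.h = loop.state.h + x
    ys.append(loop.state.h)
```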

albertz closed this as completed Oct 13, 2021