Improve verify_out_shape, ignore unrelated dims #121

Open
albertz opened this issue Mar 4, 2022 · 1 comment

albertz commented Mar 4, 2022

One further aspect of the dim tag handling (#17); I'm not sure whether this was already discussed or mentioned:

I think most modules or functions should actually not have a particular expectation of the incoming shape, except for the specific dims they operate on, like in_dim, in_spatial_dims, axis, axes. So these axes are expected to be there. And they might be transformed to out_dim, out_spatial_dims etc in the output.

All other dims should just stay as is. All other dims would behave like the batch dim.

RETURNN layers basically behave like this now (via rwth-i6/returnn#597).

For example, if we introduced a search beam as a separate dimension (not merged into the batch), which would make things much cleaner and nicer, all code should just work as before.

This implies that verify_out_shape in its current form is actually not useful, because such an additional dim would break the check. We should make it difficult for user code to break this property (that unrelated dims just pass through). Or we should improve verify_out_shape somehow.
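To illustrate the problem, here is a minimal sketch on plain Python sets. The names `strict_verify` and `linear` and the string dim names are hypothetical stand-ins for this example only; they are not the actual verify_out_shape or any RETURNN API.

```python
def strict_verify(output_dims, expected_dims):
    """Hypothetical stand-in: require the output dims to equal a fixed set."""
    return set(output_dims) == set(expected_dims)


def linear(input_dims, in_dim, out_dim):
    """Toy module on dim sets: replace in_dim by out_dim, keep all other dims."""
    if in_dim not in input_dims:
        raise ValueError(f"expected {in_dim!r} in input dims {sorted(input_dims)}")
    return (set(input_dims) - {in_dim}) | {out_dim}


# Without a beam dim, the hardcoded expectation holds.
out_plain = linear({"batch", "time", "in"}, "in", "out")
ok_plain = strict_verify(out_plain, {"batch", "time", "out"})

# With a separate beam dim, the module itself still works,
# but the hardcoded expectation no longer matches.
out_beam = linear({"batch", "beam", "time", "in"}, "in", "out")
ok_beam = strict_verify(out_beam, {"batch", "time", "out"})
```

Here `ok_plain` is true and `ok_beam` is false, although `linear` handled the extra dim correctly. The check, not the module, is what restricts the code to a single batch dim.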

I'm not sure if there might be other things as well.

albertz commented May 16, 2022

To add a bit here:

There is no real suggestion here yet on how to actually do it, except that the current verify_out_shape is not so useful. I'm even afraid that people might use verify_out_shape in the wrong way and then end up with modules restricted to specific assumptions, for example that there is exactly one batch dim, which is the global batch dim. This is specifically what we do not want.

We should change verify_out_shape so that its straightforward usage does not result in such code.

Maybe even having a global batch_dim is counterproductive. We should only really need that for the extern_data definition and nowhere else. But I'm not sure if we should change that now. (@Zettelkasten ?)

Maybe verify_out_shape could look like:

output.verify_out_shape(
  inputs=...,
  input_dims=...,  # all explicit dims, e.g. spatial, feature, but not batch
  output_dims=...  # all explicit dims
)

Internally, it would basically collect all dims from all the inputs, remove the input_dims, add the output_dims, and check whether the result matches the dims of the output.
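The logic described above could be sketched roughly as follows. This is only an illustration under assumptions: dims are modeled as hashable values in plain sets, `expected_out_dims` is a hypothetical helper, and the free function `verify_out_shape` here takes the output dims as its first argument instead of being a method on the output. The extra check that all input_dims are actually present in the inputs is an addition of this sketch.

```python
from typing import Hashable, Iterable, Set

Dim = Hashable  # stand-in for a dim tag; not the RETURNN Dim class


def expected_out_dims(
    inputs: Iterable[Set[Dim]],
    input_dims: Iterable[Dim],
    output_dims: Iterable[Dim],
) -> Set[Dim]:
    """Collect all dims of all inputs, remove input_dims, add output_dims."""
    all_dims: Set[Dim] = set()
    for in_shape in inputs:
        all_dims |= set(in_shape)
    missing = set(input_dims) - all_dims
    if missing:
        # Assumption of this sketch: the explicit input dims must exist.
        raise ValueError(f"input_dims not found in any input: {missing}")
    return (all_dims - set(input_dims)) | set(output_dims)


def verify_out_shape(
    output: Set[Dim],
    *,
    inputs: Iterable[Set[Dim]],
    input_dims: Iterable[Dim],
    output_dims: Iterable[Dim],
) -> None:
    """Raise ValueError if the output dims do not match the expectation."""
    expected = expected_out_dims(inputs, input_dims, output_dims)
    if set(output) != expected:
        raise ValueError(f"output dims {set(output)} != expected {expected}")


# Usage: the same call passes with and without an extra beam dim.
x = {"batch", "time", "in_feat"}
y = {"batch", "time", "out_feat"}
verify_out_shape(y, inputs=[x], input_dims=["in_feat"], output_dims=["out_feat"])

x_beam = x | {"beam"}
y_beam = y | {"beam"}
verify_out_shape(
    y_beam, inputs=[x_beam], input_dims=["in_feat"], output_dims=["out_feat"]
)
```

Because the expectation is derived from the inputs, the caller never has to name the batch dim or any other unrelated dim.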
