
Only functional wrapped layers except VariableLayer? #82

Closed
albertz opened this issue Nov 16, 2021 · 1 comment
albertz commented Nov 16, 2021

We want to be able to perform some generic things on parameters, such as weight norm, weight dropout or L2 loss (see #59), in a unified and straightforward way.

When we have modules where the parameters are hidden inside the RETURNN layer (e.g. Linear), any such logic could be quite counter-intuitive, complicated and potentially even buggy. I expect that when we can directly see all parameters in the returnn-common code, this becomes much easier (see e.g. the code behind torch.nn.utils.weight_norm, which is quite simple, but would be tricky if the parameters are hidden in RETURNN layers).
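As a side note, the reparameterization behind torch.nn.utils.weight_norm is only a few lines once the weight is an explicit parameter. A minimal numpy sketch (simplified to a single scalar scale instead of per-output-channel scales, and not returnn-common code):

```python
# Minimal illustration of the weight-norm reparameterization:
# the weight w is replaced by two trainable parameters g and v with w = g * v / ||v||.
# This is a simplified sketch, not the returnn-common or PyTorch API.
import numpy

def weight_norm(v: numpy.ndarray, g: float) -> numpy.ndarray:
    """Recompute the effective weight from the direction v and the scale g."""
    return g * v / numpy.linalg.norm(v)

v = numpy.random.randn(3, 4)   # direction parameter (trainable)
g = 2.0                        # scale parameter (trainable)
w = weight_norm(v, g)          # effective weight used by the layer
assert numpy.isclose(numpy.linalg.norm(w), g)
```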

There are actually not many such modules:

  • Linear
  • Conv
  • TransposedConv
  • BatchNorm
  • RelativePositionalEncoding

We also need to have a functional variant of the RecLayer (rwth-i6/returnn#817).

That's all. And they are all very simple to reimplement using pure functional modules such as dot.
Specifically (see the sketch after this list):

  • Linear: Use dot
  • Conv: Use the functional variant of ConvLayer
  • TransposedConv: Use the functional variant of TransposedConvLayer
  • BatchNorm: reimplement, maybe even more efficiently by wrapping the fused TF ops more directly
  • RelativePositionalEncoding: reimplement anyway, see the discussion in Transformer Modules #55

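As a sketch of what this could look like for Linear, assuming an explicit Parameter module and a functional dot (all names here, nn.Module, nn.Parameter, nn.Dim, nn.dot, are placeholders, not the final returnn-common API):

```python
# Hypothetical sketch: Linear built from an explicit Parameter and a functional dot.
# All names and signatures are assumptions for illustration only.
class Linear(nn.Module):
  """Linear transformation y = x @ W + b, with the parameters as explicit attributes."""

  def __init__(self, in_dim: nn.Dim, out_dim: nn.Dim):
    super().__init__()
    self.in_dim = in_dim
    self.out_dim = out_dim
    # Parameters are visible to parameters()/named_parameters(), and to generic
    # logic like weight norm, weight dropout or an L2 loss.
    self.weight = nn.Parameter((in_dim, out_dim))
    self.bias = nn.Parameter((out_dim,))

  def __call__(self, x):
    # Pure functional op on explicit parameters instead of a wrapped LinearLayer.
    # The exact signature of nn.dot is an assumption here.
    return nn.dot(x, self.weight, reduce=self.in_dim) + self.bias
```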
So then the only module which really holds a tf.Variable is the Variable module (or maybe rename it to Parameter, to be more consistent with PyTorch). We can also easily implement functions like parameters() and named_parameters() for modules, and then follow very similar, simple logic for things like weight norm etc. as in PyTorch.
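A minimal sketch of how named_parameters() could recurse over submodules, mirroring the PyTorch logic (the assumed attribute layout, i.e. parameters and submodules stored as plain attributes, is an assumption):

```python
# Sketch only: collect all Parameter instances of a module and its submodules.
# Assumes Parameter and Module classes as proposed above; not the final API.
def named_parameters(module, prefix=""):
  for name, value in vars(module).items():
    sub_prefix = f"{prefix}.{name}" if prefix else name
    if isinstance(value, Parameter):      # an explicit parameter attribute
      yield sub_prefix, value
    elif isinstance(value, Module):       # recurse into submodules
      yield from named_parameters(value, prefix=sub_prefix)
```

Generic logic like weight norm or weight dropout can then simply iterate over named_parameters() and reparameterize each parameter, just like torch.nn.utils.weight_norm does.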


albertz commented Dec 16, 2021

After further thought, I think we should do this. This can simplify many things.

albertz added a commit that referenced this issue Jan 6, 2022:
Those were the remaining modules with params. (#82)