
RF random init should use inplace ops when available #1299

Closed
albertz opened this issue Apr 3, 2023 · 0 comments

albertz commented Apr 3, 2023

For reference on the RETURNN frontend (RF): #1120

Currently we call rf.random, which always allocates a new tensor, and then we call Backend.set_parameter_initial_value, which copies the values over into the parameter.

Instead, many eager-based frameworks provide in-place ops for many operations, including generating random values. That is how random init is usually done in PyTorch, e.g. via torch.nn.init.uniform_.
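
For illustration, a minimal PyTorch sketch of such in-place init (shapes and bounds here are arbitrary, just for the example):

import torch

# Allocate the parameter buffer once.
weight = torch.empty(512, 128)

# Fill it in place: torch.nn.init.uniform_ mutates `weight` directly
# and returns the same tensor, so no second buffer is allocated.
torch.nn.init.uniform_(weight, a=-0.1, b=0.1)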

We should also do this for our param init when such ops are available.

This means we must extend our ParamInit API, e.g. by adding an out argument, like:

from typing import Optional, Sequence, Union
import returnn.frontend as rf
from returnn.tensor import Tensor, Dim


class ParamInit:
    """API for param init"""

    def __call__(
        self, dims: Sequence[Dim], dtype: str, sparse_dim: Optional[Dim] = None, out: Optional[Tensor] = None
    ) -> Union[Tensor, rf.RawTensorTypes]:
        # If `out` is given, write the values into it in place and return it.
        raise NotImplementedError
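
As a rough sketch (not the actual implementation), a uniform init could then use out roughly like this; the class name, the torch-specific in-place call on out.raw_tensor, and the fallback via rf.random_uniform are only assumptions for illustration:

import torch


class UniformParamInit(ParamInit):
    """Example uniform init which fills `out` in place when given (sketch only)."""

    def __init__(self, minval: float = -0.1, maxval: float = 0.1):
        self.minval = minval
        self.maxval = maxval

    def __call__(
        self, dims: Sequence[Dim], dtype: str, sparse_dim: Optional[Dim] = None, out: Optional[Tensor] = None
    ) -> Union[Tensor, rf.RawTensorTypes]:
        if out is not None:
            # In-place path (PyTorch backend): fill the existing parameter buffer directly.
            torch.nn.init.uniform_(out.raw_tensor, a=self.minval, b=self.maxval)
            return out
        # Fallback: allocate a new tensor, as it is done currently.
        return rf.random_uniform(dims, dtype=dtype, minval=self.minval, maxval=self.maxval)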

We also have to extend rf.random in the same way, by adding an out argument.
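
A sketch of the intended shape of that API (the existing args are abbreviated here; this is not the actual signature):

def random(
    *,
    dims: Sequence[Dim],
    dtype: str,
    distribution: str,
    out: Optional[Tensor] = None,
    **other_existing_args,
) -> Tensor:
    """
    If `out` is given and the backend has in-place random ops,
    fill `out` in place and return it; otherwise allocate a new Tensor as before.
    """
    raise NotImplementedError  # backend-specific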
