Currently we call `rf.random`, which always allocates a new tensor, and then we call `Backend.set_parameter_initial_value`, which copies the values over to the param.
Instead, many eager-based frameworks have in-place ops for many things, including generating random values. That is how random init is usually done in PyTorch, e.g. by calling `torch.nn.init.uniform_`.
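For comparison, the usual PyTorch pattern (standard `torch` API; shapes and bounds here are just for illustration):

```python
import torch
import torch.nn as nn

w = nn.Parameter(torch.empty(512, 256))
# Fills the existing parameter tensor in place; no new tensor is allocated and copied.
nn.init.uniform_(w, a=-0.1, b=0.1)
```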
We should also do this for our param init when such ops are available.
This means we must change our `ParamInit` API, e.g. by adding an `out` argument, like:
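A rough sketch of what that could look like. The class and argument names below are simplified / assumed for illustration and do not reflect the exact current `ParamInit` and `rf.random` signatures; the new part is the optional `out`:

```python
from typing import Optional, Sequence
import returnn.frontend as rf
from returnn.tensor import Tensor, Dim


class ParamInit:
    """Param init base class (simplified). New: optional ``out`` to fill an existing tensor in place."""

    def __call__(self, dims: Sequence[Dim], dtype: str, *, out: Optional[Tensor] = None) -> Tensor:
        raise NotImplementedError


class ParamInitRandomUniform(ParamInit):
    """Uniform random init (illustrative name), forwarding ``out`` to rf.random."""

    def __init__(self, minval: float = -0.1, maxval: float = 0.1):
        self.minval = minval
        self.maxval = maxval

    def __call__(self, dims: Sequence[Dim], dtype: str, *, out: Optional[Tensor] = None) -> Tensor:
        # ``out`` is the proposed new argument. If the param tensor is passed here,
        # an eager backend can fill it with an in-place op instead of allocating
        # a new tensor and copying it over in Backend.set_parameter_initial_value.
        return rf.random(
            distribution="uniform",
            dims=dims,
            dtype=dtype,
            minval=self.minval,
            maxval=self.maxval,
            out=out,
        )
```

With that, the param-init code path could pass the parameter's own tensor as `out` and skip the extra copy on backends that support in-place ops.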
For reference on the RETURNN frontend (RF): #1120
We also have to extend `rf.random` in a similar way, by adding `out`.
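To illustrate what a backend could then do with the extended `rf.random` (a toy helper, not the actual `Backend` method names or signatures):

```python
from typing import Optional, Sequence
import torch


def _random_uniform_torch(shape: Sequence[int], minval: float, maxval: float,
                          out: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Toy backend-side helper for rf.random(..., distribution="uniform", out=...)."""
    if out is not None:
        # Fill the given tensor (e.g. the param) in place; no extra allocation, no copy.
        return out.uniform_(minval, maxval)
    # No ``out``: fall back to the old behavior of allocating a new tensor.
    return torch.empty(*shape).uniform_(minval, maxval)
```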