Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The transpose function is problematic #228

Closed
rgommers opened this issue Jul 14, 2021 · 7 comments
Closed

The transpose function is problematic #228

rgommers opened this issue Jul 14, 2021 · 7 comments
Labels
API change Changes to existing functions or objects in the API. topic: Linear Algebra Linear algebra.

Comments

@rgommers
Copy link
Member

rgommers commented Jul 14, 2021

There are multiple problems, both related to adoption due to conflicting definitions, and because NumPy's transpose does something no one really wants (namely, swap all axes).

The current array API transpose may not be correct:

PyTorch names the numpy.transpose equivalent permute, and PyTorch's transpose is np.swapaxes. Relevant issues PyTorch issues:

Relevant NumPy issues:

There's multiple ways of dealing with this problem:

  1. Accept the status quo
  2. Remove transpose completely, and use permute or some other function name with the behavior we want by default (do the right thing for batches of 2-D matrices)
  3. Add both, by adding a matrix_transpose function and a corresponding .mT (or .MT) attribute

I searched the JAX and TensorFlow issue trackers as well and, somewhat surprisingly, didn't find an issue about this topic.

@rgommers rgommers added the topic: Linear Algebra Linear algebra. label Jul 14, 2021
@shoyer
Copy link
Contributor

shoyer commented Jul 14, 2021

I think separating the functionality of np.transpose() into two separate functions/methods would make sense. As you note, there are two distinct use cases:

  1. Swap the last two dimensions
  2. Permute axes

The only real relation is that (1) can be implemented via (2).

From my perspective, the functionality (1) would be the best fit for the name "transpose", and (2) could use some other name, like "permute axes".

@rgommers rgommers added the API change Changes to existing functions or objects in the API. label Jul 15, 2021
@kgryte
Copy link
Contributor

kgryte commented Aug 23, 2021

During the consortium meeting on 2021-07-14, the following was considered a path forward:

  1. Limit the .T attribute on array instances to 2D. If an array instance has more than two dimensions is not two-dimensional, an error should be raised. This allows retaining the current behavior for .T for the primary use case of taking the transpose of a matrix, while removing the less desired behavior of reversing all axes. PR: gh-245.
  2. Introduce a .mT attribute on array instances which performs a batched matrix transpose for the last two dimensions. PR: gh-246.
  3. Remove the transpose function from the specification and replace with a permute API. PR: gh-247.
  4. Introduce a matrix_transpose function for performing a batched matrix transpose. PR: gh-248.

@shoyer
Copy link
Contributor

shoyer commented Aug 31, 2021

  • Remove the transpose function from the specification and replace with a permute API. PR: gh-247.

any thoughts on the name permute vs permute_axes?

I like permute_axes a little bit better, because I can imagine "permute" alternatively referring to array elements.

@rgommers
Copy link
Member Author

Good point @shoyer, there's also https://numpy.org/devdocs/reference/random/generated/numpy.random.Generator.permuted.html which is quite similar. I don't have a strong preference, but permute_axes is indeed slightly clearer.

@kgryte
Copy link
Contributor

kgryte commented Sep 2, 2021

Some other reference points:

I am fine with either permute or permute_axes. Although, maybe we should be consistent with names. For example,

  • expand_dims, which is currently included in the specification.
  • while not currently included in the specification, swapaxes.

In which case, to throw another hat in the ring, maybe permute_dims, if we want to be more explicit about what is being permuted?

@rgommers
Copy link
Member Author

permute_dims seems quite reasonable too. If we consistently name keywords axis/axes, using _dims for function names and ndim for attribute is at least some kind of consistency.

@kgryte
Copy link
Contributor

kgryte commented Sep 20, 2021

As gh-245, gh-246, gh-247, and gh-248 have all been merged, closing this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Changes to existing functions or objects in the API. topic: Linear Algebra Linear algebra.
Projects
None yet
Development

No branches or pull requests

3 participants