[Migration] Torchaudio Complex Tensor Support and Migration #1337

Closed
18 of 21 tasks
mthrok opened this issue Mar 2, 2021 · 4 comments

Comments

@mthrok
Collaborator

mthrok commented Mar 2, 2021

Torchaudio Complex Tensor Support and Migration

Overview

torchaudio has been expressing complex numbers with an extra dimension that holds the real part and the imaginary part. (We will refer to this format as "pseudo complex type".)

waveform = torch.randn(...)
spectrogram = torchaudio.functional.spectrogram(waveform, ..., power=None)

# the last dimension represents real and imaginary parts of complex tensors.
print(spectrogram.dtype, spectrogram.shape)
# torch.float32 torch.Size([..., 2])

PyTorch 1.6 introduced complex Tensor types, such as torch.complex64 (torch.cfloat) and torch.complex128 (torch.cdouble). (We will refer to these as "native complex type".)

The native complex type comes with handy methods for complex operations such as abs, angle and magphase. (Please refer to the official documentation for details.)
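As a rough illustration of the two representations, the following sketch converts a small hand-made pseudo complex tensor to the native type and back. The tensor values are made up for the example; only `torch.view_as_complex`, `torch.view_as_real`, `abs` and `angle` are used.

```python
import torch

# Pseudo complex type: a real tensor whose last dimension is (real, imag).
# These two values stand for 3+4j and 0-2j.
pseudo = torch.tensor([[3.0, 4.0], [0.0, -2.0]])

# Native complex type: view the same data as torch.complex64.
native = torch.view_as_complex(pseudo)

magnitude = native.abs()   # element-wise modulus
phase = native.angle()     # element-wise argument, atan2(imag, real)

# Converting back yields the original (..., 2) layout.
roundtrip = torch.view_as_real(native)
print(native.dtype, tuple(native.shape), tuple(roundtrip.shape))
```

Both conversions are views, so no data is copied when switching between the two layouts.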

Over the next few releases, we plan to migrate torchaudio's functions and transforms to the native complex type. This issue describes the planned approach, work items, changes and timeline. If you have a question, a concern or a suggestion, feel free to leave a comment.

Migration Stages

We will perform the migration in multiple stages. At this moment, the completion of the later migration stages is not yet tied to specific releases.

✅ Stage 0 (~ 0.8)

Up to release 0.8, torchaudio exclusively used pseudo complex type. In PyTorch 1.7, PyTorch started the adoption of native complex type and the migration of the torch.fft namespace. Because of this, torchaudio already uses native complex type in some implementations (F.vad, T.Vad, kaldi.spectrogram and kaldi.fbank), but all the user-facing APIs use pseudo complex type.

✅ Stage 1 (Add support for native complex type and deprecate pseudo complex type)

Completed: PyTorch 1.9 / torchaudio 0.9

Library code change

In this stage, torchaudio will support both pseudo complex type and native complex type. This means that

  • Functions that accept complex input should be able to handle both pseudo and native complex types.
  • Functions that return complex values can return both pseudo and native complex types.
    • For real-to-complex functions, a new argument return_complex will be added so that users can switch the behavior.

Test code update

In addition to the above library code changes, we are going to add a set of tests to make sure that native complex types work in common use cases. This includes:

  • Gradient check
  • JIT
  • Performance
  • Distributed training
  • nn.Module compatibility
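As one example of what such a test might look like, here is a minimal gradient check on a native complex input. The helper `power_spectrum` is hypothetical and only stands in for a torchaudio function; `torch.autograd.gradcheck` is the standard PyTorch utility and expects double precision inputs.

```python
import torch
from torch.autograd import gradcheck

torch.manual_seed(0)


def power_spectrum(z: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: real-valued power of a native complex tensor."""
    return z.abs().pow(2)


# gradcheck compares analytical gradients with numerical ones.
z = torch.randn(8, dtype=torch.cdouble, requires_grad=True)
gradient_ok = gradcheck(power_spectrum, (z,))
print(gradient_ok)
```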

✅ Stage 2 (Switch to native complex type by default)

Completed: main branch. To be released as part of PyTorch 1.10 / torchaudio 0.10

The default value for return_complex is changed to True.

👉 Stage 3 (Remove the support for pseudo complex type)

In this stage, we will remove the support for pseudo complex type.

  • Functions that work with complex types should handle native complex types exclusively.
  • Passing pseudo complex type results in an error.
  • The return_complex argument added in Stage 1 is deprecated and eventually removed.
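The Stage 3 behavior for pseudo complex input could look roughly like the guard below. This is only a sketch: the function name, the exception type and the message are assumptions, not torchaudio's actual implementation.

```python
import torch


def require_native_complex(spec: torch.Tensor) -> torch.Tensor:
    """Hypothetical guard: accept only native complex tensors."""
    if not spec.is_complex():
        raise ValueError(
            "Expected a native complex tensor (e.g. torch.cfloat), "
            f"got dtype={spec.dtype}. Convert with torch.view_as_complex()."
        )
    return spec


native = torch.randn(3, 4, dtype=torch.cfloat)
pseudo = torch.view_as_real(native)  # shape (3, 4, 2), dtype float32

require_native_complex(native)  # passes through unchanged
try:
    require_native_complex(pseudo)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```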

Affected Functions

The following figure illustrates the functions that handle complex values and their dependencies.

[Figure: dependency graph of the functions that handle complex values. Screenshot taken 2021-03-02.]

Utility functions

F.angle, F.complex_norm, T.ComplexNorm, F.magphase

These functions are deprecated in Stage 1 and will be removed in Stage 3.
For F.angle, native complex tensors provide the angle() function.
For F.complex_norm / T.ComplexNorm, the equivalent computation can be performed with abs().pow(n).
F.magphase is a convenience function that calls F.complex_norm and F.angle; it is therefore deprecated as well.

Real to real functions

F.griffinlim, T.GriffinLim

Changes to these functions are internal, so we can change the implementation without affecting downstream users.

Complex to complex functions

F.phase_vocoder, T.TimeStretch

When adding support for native complex type, we can keep the interface change simple, as follows:

  • If the input is pseudo complex type, return pseudo complex type
  • If the input is native complex type, return native complex type
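The rule above can be sketched as a small dispatch on the input type. The function below is hypothetical: a 90-degree rotation stands in for the real complex-to-complex operation, and the code is not taken from torchaudio.

```python
import torch


def rotate_quarter_turn(spec: torch.Tensor) -> torch.Tensor:
    """Hypothetical complex-to-complex op that returns the input's representation."""
    is_pseudo = not spec.is_complex()
    if is_pseudo:
        if spec.shape[-1] != 2:
            raise ValueError("Pseudo complex input must have shape (..., 2).")
        spec = torch.view_as_complex(spec)

    # Stand-in for the actual computation: multiply every element by 1j.
    result = spec * 1j

    # Pseudo in, pseudo out; native in, native out.
    return torch.view_as_real(result) if is_pseudo else result


pseudo_in = torch.tensor([[1.0, 2.0]])           # represents 1+2j
native_in = torch.view_as_complex(pseudo_in)

pseudo_out = rotate_quarter_turn(pseudo_in)
native_out = rotate_quarter_turn(native_in)
print(pseudo_out.dtype, native_out.dtype)
```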

Real to complex functions

F.spectrogram, T.Spectrogram

These functions return either a real-valued Tensor (power, energy) or a complex-valued Tensor (frequency representation), depending on the power argument. When power is None, they return a complex-valued Tensor, and users can choose to receive the result in pseudo complex type or native complex type. The return_complex argument will be added for this choice: if return_complex is True, native complex type is returned; otherwise pseudo complex type is returned. See #1009 for the discussion.
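To make the interaction between power and return_complex concrete, here is a minimal sketch built directly on `torch.stft`. It is not torchaudio's implementation; the function name, the default values and the Hann window are assumptions chosen for the example.

```python
import torch


def spectrogram_sketch(waveform, n_fft=400, hop_length=200, power=2.0, return_complex=True):
    """Hypothetical real-to-complex function with a return_complex switch."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(
        waveform, n_fft, hop_length=hop_length, window=window, return_complex=True
    )
    if power is not None:
        # Power/energy output is real-valued, so return_complex has no effect.
        return spec.abs().pow(power)
    # Raw spectrogram: native complex, or pseudo complex with a trailing dim of 2.
    return spec if return_complex else torch.view_as_real(spec)


waveform = torch.randn(1600)
native_spec = spectrogram_sketch(waveform, power=None, return_complex=True)
pseudo_spec = spectrogram_sketch(waveform, power=None, return_complex=False)
power_spec = spectrogram_sketch(waveform, power=2.0)
print(native_spec.dtype, pseudo_spec.dtype, power_spec.dtype)
```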

Timeline

Migration phases and PyTorch / torchaudio versions

  • Phase 1: 1.9 / 0.9
  • Phase 2: 1.10 / 0.10
  • Phase 3: TBD

F.angle, F.complex_norm, F.magphase, T.ComplexNorm (type: C->R, utility)

  • Phase 1: Deprecated
  • Phase 2: Deprecated
  • Phase 3: Removed

F.griffinlim, T.GriffinLim (type: R->R)

  • Phase 1: Adopts native complex internally
  • Later phases: No change

F.phase_vocoder, T.TimeStretch (type: C->C)

  • Phases 1 and 2:
    • Support for native complex type is added.
    • Support for pseudo complex type is deprecated.
    • The function returns the same type as the input (native for native, pseudo for pseudo).
  • Phase 3: Support for pseudo complex type is removed. Only handles native complex type.

F.spectrogram, T.Spectrogram (type: R->C when power=None)

  • Phase 1: Argument return_complex is added (default value is False). When the return value is complex-valued (power=None), the type of the returned Tensor can be switched with return_complex.
  • Phase 2: The default value of return_complex is changed to True.
  • Phase 3: The return_complex argument is deprecated.

Migration steps

F.angle, F.complex_norm, F.magphase and T.ComplexNorm

~0.8:

spectrogram = ...  # Tensor with pseudo complex type (shape == (..., 2))
angle = F.angle(spectrogram)
magnitude = F.complex_norm(spectrogram, norm=1)
power = F.complex_norm(spectrogram, norm=2)
norm = F.complex_norm(spectrogram, norm=norm)
magnitude, phase = F.magphase(spectrogram, n)

0.9~:

spectrogram = ...  # Tensor with pseudo complex type (shape == (..., 2))
spectrogram = torch.view_as_complex(spectrogram)
angle = spectrogram.angle()
magnitude = spectrogram.abs()
power = spectrogram.abs().pow(2)
norm = spectrogram.abs().pow(norm)
magnitude, phase = spectrogram.abs().pow(n), spectrogram.angle()
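When migrating, it can be reassuring to check the native expressions against a hand-written pseudo complex reference. The reference formulas below are written out for this example (they are an assumption about what the utilities compute, not code copied from torchaudio), and the input is random.

```python
import torch

torch.manual_seed(0)
pseudo = torch.randn(5, 7, 2)            # pseudo complex, shape (..., 2)
native = torch.view_as_complex(pseudo)   # native complex, shape (5, 7)

real, imag = pseudo[..., 0], pseudo[..., 1]
n = 2.0

# Hand-written pseudo complex reference: |z|^n and atan2(imag, real).
ref_norm = (real.pow(2) + imag.pow(2)).pow(0.5 * n)
ref_angle = torch.atan2(imag, real)

# Native complex equivalents used in the 0.9~ snippets.
norm_matches = torch.allclose(native.abs().pow(n), ref_norm, atol=1e-5)
angle_matches = torch.allclose(native.angle(), ref_angle, atol=1e-5)
print(norm_matches, angle_matches)
```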

F.phase_vocoder, T.TimeStretch

~0.8:

spec = ... # pseudo complex (..., 2)

## If using functional form
spec = F.phase_vocoder(spec, ...)
## else using transform
transform = T.TimeStretch(...)
spec = transform(spec)

0.9~:

spec = ... # pseudo complex (..., 2)
# Convert to native complex type
spec = torch.view_as_complex(spec)

# Perform the operation
## If using functional form
spec = F.phase_vocoder(spec, ...)
## else using transform
transform = T.TimeStretch(...)
spec = transform(spec)

# Convert back to pseudo complex type
# (If your downstream code still expects pseudo complex type)
spec = torch.view_as_real(spec)
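One practical caveat when converting: per the PyTorch documentation, `torch.view_as_complex` expects the last dimension to have size 2 and stride 1. The sketch below shows two ways to handle data that is not laid out that way; the planar input tensor is made up for the example.

```python
import torch

torch.manual_seed(0)
planar = torch.randn(2, 5)     # row 0: real parts, row 1: imaginary parts
interleaved = planar.t()       # shape (5, 2), but not contiguous

# Option 1: make the (..., 2) layout contiguous, then view it as complex.
native_a = torch.view_as_complex(interleaved.contiguous())

# Option 2: build the complex tensor directly from the two parts.
native_b = torch.complex(planar[0], planar[1])

print(interleaved.is_contiguous(), native_a.dtype, tuple(native_b.shape))
```

Option 1 copies the data once; Option 2 avoids building the (..., 2) layout at all when the real and imaginary parts are already separate tensors.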

F.spectrogram, T.Spectrogram

~0.8:

spec = F.spectrogram(waveform, ..., power=None)

transform = T.Spectrogram(..., power=None)
spec = transform(waveform)  # pseudo complex (..., 2)

0.9~:

spec = F.spectrogram(waveform, ..., power=None, return_complex=True)

transform = T.Spectrogram(..., power=None, return_complex=True)
spec = transform(waveform)  # native complex
# If your downstream code still expects pseudo complex type
spec = torch.view_as_real(spec)

PRs - TODO (@mthrok)

Migration

Phase 1

Code Change
Add deprecation Warnings

Phase 2

Change the default value of return_complex to True.
Update the deprecation warnings to indicate the version of removal.

Phase 3

Remove the support for pseudo complex type.

Surrounding works

Conjugate input tests

Autograd tests

Ensuring TorchScript support

Check if all the functionals/transforms are covered by TorchScript consistency test and add if missing

Benchmark

cc @anjali411 @vincentqb

@mthrok mthrok modified the milestones: Complex Tensor Migration, v0.9 Apr 5, 2021
@mthrok mthrok pinned this issue Apr 9, 2021
@mthrok mthrok changed the title Torchaudio Complex Tensor Support and Migration [Migration] Torchaudio Complex Tensor Support and Migration Apr 9, 2021
mthrok added a commit to mthrok/audio that referenced this issue Jun 2, 2021
…rogram

Part of pytorch#1337 .

- This code changes the return type of spectrogram to be native complex dtype,
when (and only when) returning raw (complex-valued) spectrogram.
- Change `return_complex=False` to `return_complex=True` in spectrogram ops.
- `return_complex` is only effective when `power` is `None`. It is ignored for
cases where `power` is not `None`. Because the returned Tensor is power spectrogram,
which is real-valued Tensors.
mthrok added a commit to mthrok/audio that referenced this issue Jun 3, 2021
mthrok added a commit to mthrok/audio that referenced this issue Jun 3, 2021
mthrok added a commit that referenced this issue Jun 4, 2021
#1549)

* [BC-Breaking] Default to native complex type when returning raw spectrogram

Part of #1337 .

- This code changes the return type of spectrogram to be native complex dtype,
when (and only when) returning raw (complex-valued) spectrogram.
- Change `return_complex=False` to `return_complex=True` in spectrogram ops.
- `return_complex` is only effective when `power` is `None`. It is ignored for
cases where `power` is not `None`. Because the returned Tensor is power spectrogram,
which is real-valued Tensors.
mthrok added a commit that referenced this issue Nov 3, 2021
Following the plan #1337, this commit drops the support for pseudo complex type from 
`F.spectrogram` and `T.Spectrogram`.

It also deprecates the use of `return_complex` argument.
mthrok added a commit that referenced this issue Nov 3, 2021
…retch (#1957)

Following the plan #1337, this commit drops the support for pseudo complex type from `F.phase_vocoder` and `T.TimeStretch`.
@JuanFMontesinos

JuanFMontesinos commented Mar 3, 2022

Please, before going forward with the deprecation, note that complex32 format is poorly supported on cuda.
Cannot do complex32 product, stft and istft allows power2 kernels only and a large list of drawbacks.

Had to rewrite whole code back to pseudo complex to be able to work with half precision...

@mthrok
Collaborator Author

mthrok commented Mar 5, 2022

Cannot do complex32 product, stft and istft allows power2 kernels only and a large list of drawbacks.

Hi @JuanFMontesinos

Thanks for letting us know.

TorchAudio is not tested on fp16 or complex32, and they are not officially supported types. So we did not realize that pseudo complex type can be used as a workaround for complex32.

Unfortunately, we have already wrapped up release v0.11 (scheduled to be out in about one week); PyTorch removed the complex32 type and torchaudio removed the support for pseudo complex type. So I assume it will be unusable for you. I can try reverting the removal of pseudo complex support if that's the best course of action. (However, it is known that some operations with pseudo complex type have accuracy issues, so it's not the best workaround, which is one reason we wanted to migrate to native complex type.)

The treatment of complex32 indeed needs improvement, and there is an issue for this in PyTorch: pytorch/pytorch#71680. The ideal outcome is that PyTorch core adds complex32 support quickly, but I am not sure that can happen soon. However, your voice matters a lot here, so would you be willing to describe the features most relevant to you in pytorch/pytorch#71680? That way, the PyTorch core team can prioritize them if they decide to work on this.

@jeffeuxMartin

There were some typos here, so the migration made the program crash.

AttributeError: 'Tensor' object has no attribute 'power'

The correct version should be

power = spectrogram.abs().pow(2)
norm = spectrogram.abs().pow(norm)
magnitude, phase = spectrogram.abs().pow(n), spectrogram.angle()

@mthrok
Collaborator Author

mthrok commented Mar 15, 2022

@jeffeuxMartin thanks for the report. Fixed it.

mthrok pushed a commit to mthrok/audio that referenced this issue Dec 13, 2022
Summary:
Mention new profiler API.

Test Plan:
make html-noplot
@mthrok mthrok closed this as completed Dec 22, 2022
@mthrok mthrok unpinned this issue Dec 22, 2022