Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resampling #1487

Closed
6 tasks done
carolineechen opened this issue May 5, 2021 · 7 comments
Closed
6 tasks done

Resampling #1487

carolineechen opened this issue May 5, 2021 · 7 comments
Assignees

Comments

@carolineechen
Copy link
Contributor

carolineechen commented May 5, 2021

We are revisiting the resampling feature, the way it is implemented and intended usage. Here are the improvements we plan to make:

Expanding resampling capabilities and quality

Add rolloff as an adjustable parameter

  • Currently a fixed value of 0.99 is used, but exposing the parameter can allow for control of the degree of anti-aliasing and highest frequency allowed when resampling.

Kaiser window support

  • Currently, only sinc window is supported. Kaiser window will support an additional shape parameter (β) that controls the smoothness and impulse width of the filter, allowing for a wider range of resampling capabilities and signal quality.

Improve transforms.Resample speed

  • Improve speed of transforms.Resample by precomputing and caching the window kernel computation, which is the same for a given set of resampling parameters. After this change, we encourage users to use a single transforms.Resample class over functional.resample when calling resample multiple times with the same parameters, such as in cases of data loading.

Additional

warning for non-integer valued sampling rates

  • Sampling rates are currently being cast to ints during computation (see float sample_rate #891), as the resampling algorithm dependent on an integer ratio for the original / new frequency. We will issue a warning (error in the next release) and encourage users to manually convert their non-integer sampling rate parameters to integers that retain this same ratio rather than forcing integer inputs (ex/ using orig_freq=8, new_freq=1 instead of orig_freq=44100, new_freq=5512.5).
  • If there are situations requiring float sampling rates (that can not be manually converted to integers) that would benefit from approximate resampling and removing the int requirement, please leave a comment below so that we are aware of these use cases

Deprecate and remove resample_waveform from kaldi compliance

  • resample_waveform in kaldi compliance simply calls functional.resample

Progress Tracker

cc @mthrok

@mthrok
Copy link
Collaborator

mthrok commented May 6, 2021

cc @small-yellow-duck @faroit @adefossez Please let us know if you have an opinion on resampling.

@small-yellow-duck
Copy link

Earlier discussion here:
#908

  • some discussion about how to test the module against librosa

Earlier PR here:
#911

  • uses pytorch conv ops to do faster resampling
  • includes Kaiser window support and handles the edge case where the audio signal is very short

@mthrok
Copy link
Collaborator

mthrok commented May 20, 2021

@small-yellow-duck So we landed #1509. We appreciate if you can try it and give us a feedback.

@mthrok mthrok added this to the v0.9 milestone May 25, 2021
@mthrok mthrok mentioned this issue May 25, 2021
14 tasks
@mthrok mthrok removed this from the v0.9 milestone May 25, 2021
@mthrok
Copy link
Collaborator

mthrok commented May 26, 2021

We can add resmaple parameter to load function.

@mthrok
Copy link
Collaborator

mthrok commented May 29, 2021

@carolineechen Could you add test cases that verifies that the resulting waveform is same when the new sampling rate is same with the original sample rate?

@mthrok
Copy link
Collaborator

mthrok commented Jun 4, 2021

@carolineechen It looks like most of the compliance_kaldi_test.py can be safely removed. and so are assets/kaldi and its generation scripts.

When you remove resample_waveform function from kaldi module, can you remove these as well?

@nvssynthesis
Copy link

nvssynthesis commented Nov 11, 2024

I have a use case that could benefit from floating point orig_freq and new_freq. I am trying to fit periodic wavelengths (of any frequency under Nyquist) into a constant length of samples via resampling.
Sure, maybe the results could usually be 'good enough' with rounding, or maybe use fractions.Fraction(ratio).limit_denominator(max_denominator), to get decent results, but it seems like a feature that could be appreciated for similar cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants