Adding test for T.SlidingWindowCmn #1482

Merged
merged 5 commits into from
May 3, 2021
Changes from 4 commits
13 changes: 13 additions & 0 deletions test/torchaudio_unittest/transforms/autograd_test_impl.py
@@ -154,6 +154,19 @@ def test_vol(self, gain, gain_type):
        waveform = get_whitenoise(sample_rate=sample_rate, duration=0.05, n_channels=2)
        self.assert_grad(transform, [waveform])

    @parameterized.expand([
        ({'cmn_window': 600, 'min_cmn_window': 100, 'center': False, 'norm_vars': False}, ),
        ({'cmn_window': 600, 'min_cmn_window': 100, 'center': True, 'norm_vars': False}, ),
        ({'cmn_window': 600, 'min_cmn_window': 100, 'center': False, 'norm_vars': True}, ),
        ({'cmn_window': 600, 'min_cmn_window': 100, 'center': True, 'norm_vars': True}, ),
        ({'cmn_window': 500, 'min_cmn_window': 50, 'center': False, 'norm_vars': False}, ),
Collaborator

cmn_window=600 and min_cmn_window=100 look too large for an input of only 8000 * 0.05 == 400 samples (before the FFT is applied). Can you make them somewhat smaller than the number of frames along the time axis of the input tensor?
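
To make the sizing concern concrete, here is a rough sketch of how many time frames a 400-sample input yields, using the frame-count formula from the torch.stft documentation. The `n_fft` and `hop_length` values and the helper name `num_stft_frames` are hypothetical and chosen only for illustration; they are not taken from this PR.

```python
def num_stft_frames(n_samples, n_fft, hop_length, center=True):
    """Frame count per the torch.stft docs (hypothetical helper)."""
    if center:
        # With center=True the signal is padded by n_fft // 2 on both sides.
        return 1 + n_samples // hop_length
    return 1 + (n_samples - n_fft) // hop_length


sample_rate = 8000
duration = 0.05
n_samples = round(sample_rate * duration)  # 400 samples, as in the comment

# Assumed STFT settings, for illustration only.
n_fft, hop_length = 64, 16
n_frames = num_stft_frames(n_samples, n_fft, hop_length)

# Window sizes from the diff, compared against the available frames.
for cmn_window in (600, 500):
    print(cmn_window, "exceeds frame count:", cmn_window > n_frames)
```

Under these assumed settings the time axis is far shorter than either window in the diff, so smaller `cmn_window` and `min_cmn_window` values would be needed for the sliding window to move across the input.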

Contributor Author

Thanks @mthrok, I will try to implement your suggestions.

    ])
    def test_sliding_window_cmn(self, kwargs):
        sample_rate = 8000
        transform = T.SlidingWindowCmn(**kwargs)
        waveform = get_whitenoise(sample_rate=sample_rate, duration=0.05, n_channels=2)
Collaborator

The input to SlidingWindowCmn is supposed to be a spectrogram.
This has been fixed in the master documentation: https://pytorch.org/audio/master/functional.html#torchaudio.functional.sliding_window_cmn

Can you use get_spectrogram, then swap the last two axes so that the Tensor dimensions are [..., time, freq]?

Contributor Author

From the torch.stft documentation, I found that it returns a tensor of shape (* × N × T), so do you suggest calling transpose(-2, -1) on the output?
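
A minimal sketch of the shape handling discussed in this thread, under stated assumptions: plain `torch.stft` stands in for the test helper `get_spectrogram`, `torch.randn` stands in for `get_whitenoise`, and the `n_fft` and `hop_length` values are hypothetical. It only illustrates the axis order, not the final test in this PR.

```python
import torch

# Assumed STFT settings, for illustration only.
n_fft, hop_length = 64, 16

# Stand-in for get_whitenoise: 2 channels, 8000 * 0.05 = 400 samples.
waveform = torch.randn(2, 400, dtype=torch.float64)

# torch.stft returns (..., freq, time), i.e. (* x N x T) in the docs.
spec = torch.stft(
    waveform,
    n_fft=n_fft,
    hop_length=hop_length,
    window=torch.hann_window(n_fft, dtype=torch.float64),
    return_complex=True,
)
power = spec.abs().pow(2)

# Swap the last two axes so the layout becomes (..., time, freq).
features = power.transpose(-2, -1)
print(spec.shape, features.shape)
```

A tensor laid out like `features` could then be passed to the transform, with window sizes kept below the number of time frames.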

        self.assert_grad(transform, [waveform])

    @unittest.expectedFailure
    def test_timestretch_zeros_fail(self):
        """Test that ``T.TimeStretch`` fails gradcheck at 0