Add TorchScript-based SoX I/O backend #726

Merged: 1 commit, Jul 1, 2020

Conversation

mthrok
Collaborator

@mthrok mthrok commented Jun 17, 2020

This PR (and its dependent PRs) adds a new "sox_io" backend.

The new "sox_io" backend has the following advantages;

Note: The current binary distribution of torchaudio does not contain ogg/vorbis codecs. To handle these files, you need to build torchaudio from source. Refer to the README for instructions. #750

@mthrok mthrok force-pushed the sox_io_backend branch 7 times, most recently from 54fbcc2 to cb85a45 Compare June 18, 2020 14:57
@mthrok mthrok changed the title Add TorchScript-able SoX I/O backend Add TorchScript-based SoX I/O backend Jun 18, 2020
@mthrok mthrok force-pushed the sox_io_backend branch 12 times, most recently from f85d969 to 167cc55 Compare June 18, 2020 20:57
mthrok added a commit that referenced this pull request Jun 18, 2020
This is part of a series of PRs to add the new "sox_io" backend. #726

This PR adds the `SignalInfo` structure, which is the data exchange interface between Python and C++ in the upcoming TorchScript-based SoX I/O backend.
For cases where the C++ extension is not available (i.e. Windows), this PR also adds a dummy class and module that are substituted in its place.
This logic is implemented in the `torchaudio.extension` module.
@mthrok mthrok force-pushed the sox_io_backend branch 2 times, most recently from 4bf7016 to bfee816 Compare June 19, 2020 12:12
@mthrok mthrok force-pushed the sox_io_backend branch 9 times, most recently from d33a6ff to 3ffc88e Compare June 25, 2020 23:08
mthrok added a commit that referenced this pull request Jun 25, 2020
This is part of a series of PRs to add the new "sox_io" backend (#726) and depends on #718 and #728.

This PR adds a `load` function to the "sox_io" backend, which is tested on the following audio formats:
 - `wav`
 - `mp3`
 - `flac`
 - `ogg/vorbis` *

By default, the "sox_io" backend returns a Tensor with `float32` dtype and shape `[channel, time]`. The samples are normalized to fit in the range `[-1.0, 1.0]`.

Unlike the existing "sox" backend, the new `load` function handles WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer), passing `normalize=False` makes the function return an integer Tensor whose samples span the full range of the corresponding dtype: `int32` for 32-bit PCM, `int16` for 16-bit PCM, and `uint8` for 8-bit PCM. This behavior follows [scipy.io.wavfile.read](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html). The `normalize` parameter has no effect for other formats; for those, the `load` function always returns normalized values as a `float32` Tensor.
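
To make the `normalize` semantics concrete, here is a small pure-Python sketch. It does not call torchaudio; the function `load_samples` and the scaling constants are assumptions chosen for illustration (full-scale division, with 8-bit unsigned samples centered at 128), and the backend's exact conversion may differ.

```python
# Assumed full-scale divisors per integer PCM dtype (illustrative only).
_FULL_SCALE = {"int32": 2 ** 31, "int16": 2 ** 15, "uint8": 2 ** 7}


def load_samples(raw, dtype, normalize=True):
    """Hypothetical helper mimicking the described behavior for integer WAV.

    Returns (samples, dtype_label). The label is only a string here; real
    code would return a Tensor of that dtype.
    """
    if dtype not in _FULL_SCALE:
        raise ValueError(f"unsupported integer dtype: {dtype}")
    if not normalize:
        # Keep the samples in the native integer range of the file.
        return list(raw), dtype
    scale = _FULL_SCALE[dtype]
    # 8-bit PCM is unsigned, so shift it to be centered on zero first.
    offset = 128 if dtype == "uint8" else 0
    return [(s - offset) / scale for s in raw], "float32"


normalized, label = load_samples([-32768, 0, 16384], "int16")
raw_values, raw_label = load_samples([-32768, 0, 16384], "int16", normalize=False)
```

Under these assumptions, `normalized` holds values in `[-1.0, 1.0]` labeled `float32`, while `raw_values` keeps the original 16-bit integers labeled `int16`.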

__* Note__ The current binary distribution of torchaudio does not contain the `ogg/vorbis` and `opus` codecs. To handle these files, one needs to build torchaudio from source with the proper codecs installed on the system.

__Note 2__ As of this PR, `scipy` is a required module for running the tests.
@codecov

codecov bot commented Jun 25, 2020

Codecov Report

Merging #726 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #726      +/-   ##
==========================================
+ Coverage   89.14%   89.16%   +0.02%     
==========================================
  Files          32       32              
  Lines        2561     2566       +5     
==========================================
+ Hits         2283     2288       +5     
  Misses        278      278              
| Impacted Files | Coverage Δ |
|---|---|
| torchaudio/backend/utils.py | 89.13% <100.00%> (+1.32%) ⬆️ |


@mthrok mthrok force-pushed the sox_io_backend branch 3 times, most recently from 8ae9ebd to a46c34f Compare June 30, 2020 01:46
mthrok added a commit that referenced this pull request Jul 1, 2020
This is part of a series of PRs to add the new "sox_io" backend (#726) and depends on #718, #728 and #731.

This PR adds a `save` function to the "sox_io" backend, which can save a Tensor to a file in the following audio formats:
 - `wav`
 - `mp3`
 - `flac`
 - `ogg/vorbis`
@mthrok mthrok marked this pull request as ready for review July 1, 2020 21:16
@mthrok mthrok requested a review from vincentqb July 1, 2020 21:16
Comment on lines +33 to +34
if backend == 'sox_io':
    continue
Contributor

@vincentqb vincentqb Jul 1, 2020

Why is sox_io made a special case and skipped here?

Collaborator Author

The test cases in this class depend on the global state left by the previously run test, which breaks the principle of unit testing, and including sox_io breaks them.

Contributor

@vincentqb vincentqb left a comment

LGTM

@mthrok mthrok merged commit 4b583ea into pytorch:master Jul 1, 2020
@mthrok
Collaborator Author

mthrok commented Jul 1, 2020

thanks!
