Backport patch for torchaudio, cleanups #27

h-vetinari · 2022-07-25T20:32:39Z

CC @mmcauliffe, since I saw the other patch was from you - would appreciate your input on this.


conda-forge-linter · 2022-07-25T20:32:43Z

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

mmcauliffe · 2022-07-25T20:51:33Z

recipe/patches/0003-Fix-fileobj-I-O-un-deterministic-behavior.patch

+> ⚠️ the upstream has a fix https://sourceforge.net/p/sox/code/ci/bb38934e11035c8fab141f70dabda3afdd17da36/,
+> but this fix seems to make `is_seekable` return `true` for the in-memory file
+> object case (need to verify this), which is the opposite of the behavior we
+> want in torchaudio implementation.


What's the issue for torchaudio with seekable in-memory file objects?

Hey @mmcauliffe, thanks for the quick response!

> What's the issue for torchaudio with seekable in-memory file objects?

We'd have to ask @mthrok who did the debugging/digging/fixing in pytorch/audio#1297 ff.

It's described in the upstream patch: the original implementation, `fstat(fileno((FILE*)ft->fp), &st)`, did not check the return value of `fileno`. The `fstat` call fails, but the failure is ignored, so the `st.st_mode` attribute is never initialized and holds a random value. That is what caused the inconsistent behavior.

I think the patch from the upstream fixes this properly and elegantly, while my patch makes the code return the opposite result, which makes it impossible for torchaudio to adopt the upstream libsox codebase.

So I do not recommend applying my patch. I plan to get rid of this in torchaudio by migrating the in-memory decoding/encoding capability to FFmpeg.

I do not know the policy of conda-forge about BC-breaking, but once you apply my patch, you will have technical debt hard to fix.
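To make the failure mode described in this comment concrete, here is a minimal C sketch of a defensive check. It is not the libsox source and not either patch: `stream_is_regular_file` is a hypothetical helper, and the sketch assumes a POSIX system where `fileno` returns a negative value and `fstat` returns non-zero on failure.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fmemopen() */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper, NOT the libsox implementation: reports whether a
   stdio stream is backed by a regular file, treating every failure as "no". */
static int stream_is_regular_file(FILE *fp)
{
    struct stat st;
    int fd;

    assert(fp != NULL);   /* caller must pass an open stream */

    fd = fileno(fp);
    if (fd < 0)
        return 0;         /* no descriptor behind the stream */

    if (fstat(fd, &st) != 0)
        return 0;         /* fstat failed, so st must not be read */

    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

With POSIX `fmemopen` standing in for an in-memory file object, `fileno` is expected to fail on common libc implementations, so the helper answers 0 every time instead of reading an uninitialized `st.st_mode`.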

Thanks for the response!

> I think the patch from the upstream fixes this properly and elegantly, while my patch makes the code return the opposite result, which makes it impossible for torchaudio to adopt the upstream libsox codebase.

I don't understand the need for divergence (if their solution is elegant, why couldn't it work for torchaudio? If it doesn't work for torchaudio, how can it be fixed properly?)

> I do not know the policy of conda-forge about BC-breaking, but once you apply my patch, you will have technical debt hard to fix.

Not sure what you mean with BC, maybe binary compatibility? We shouldn't break except where upstream does so as well (e.g. major releases). Thanks for the heads up. I might have to bite the bullet and take the vendored sox for torchaudio then until you get rid of it (IIUC)...

> I don't understand the need for divergence (if their solution is elegant, why couldn't it work for torchaudio? If it doesn't work for torchaudio, how can it be fixed properly?)

I debugged the issue myself and created a patch tailored to torchaudio's needs. After a while I realized that upstream had a fix, which would not work for torchaudio.

> Not sure what you mean with BC

Applying the patch from upstream will fix the issue of is_seekable() returning a random value, and it will return false for in-memory audio data. If you apply my patch, then is_seekable() will return true for in-memory audio data. If you apply my patch and torchaudio later gets rid of it, libsox from condo-forge will be the only build with the custom behavior, which I think is something you will want to get rid of at some point. Getting rid of my patch at that point would be a backward-incompatible change, which would disrupt the user experience.
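The divergence described in this comment can be illustrated with two hypothetical policies in C. Neither function is taken from the upstream fix or from the torchaudio patch; they only reproduce the two outcomes named here (false versus true for in-memory data), assuming a POSIX system and using `fmemopen` as a stand-in for an in-memory file object.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fmemopen() */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Policy A (illustrative only): seekable means "backed by a regular file".
   A stream without a usable descriptor is reported as not seekable. */
static int seekable_if_regular_file(FILE *fp)
{
    struct stat st;
    int fd;

    assert(fp != NULL);
    fd = fileno(fp);
    return fd >= 0 && fstat(fd, &st) == 0 && S_ISREG(st.st_mode);
}

/* Policy B (illustrative only): seekable means "a no-op fseek succeeds".
   This also answers yes for streams that live purely in memory. */
static int seekable_if_fseek_works(FILE *fp)
{
    assert(fp != NULL);
    return fseek(fp, 0L, SEEK_CUR) == 0;
}
```

For a stream opened with `fmemopen`, policy A is expected to answer 0 and policy B to answer 1, while both agree on an ordinary file on disk. Code written against one answer would see different behavior if the library later switched to the other, which is the backward-compatibility concern raised here.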

well, so far we don't have torchaudio at all, so there's no user experience to break yet.

And IIUC, you're going to keep the torchaudio-specific behaviour also when you switch to ffmpeg as a backend?

However, the sox builds in conda-forge (with an "a" 😉) are not just used by torchaudio, so we have to keep that in mind.

This leaves three options:

1. do nothing --> users stay exposed to the hard-to-debug behaviour
2. use the torchaudio fix --> should be an improvement, but they might get used to the "wrong" behaviour (assuming sox ever does another release, last one was 7.5 years ago...)
3. use the upstream fix --> incompatible with torchaudio, but compatible with possible future sox releases (also means we have to vendor lots of stuff into torchaudio, which is also unappealing).

They all have their pros & cons TBH. Not sure if anyone else from @conda-forge/sox has an opinion on this?

mmcauliffe · 2022-07-25T21:10:48Z

Ok, yeah, not a huge deal for the difference between upstream and the patch here, mostly just curious. I don't think it would impact any of my use cases, since MFA just uses file objects on disk. But yeah, looks good to me!

h-vetinari added 9 commits July 25, 2022 21:48

normalize patches

9c1b1bb

backport patch necessary for torchaudio

3bbace0

bump build number

3de0413

MNT: Re-rendered with conda-build 3.21.9, conda-smithy 3.21.0, and conda-forge-pinning 2022.07.25.18.25.33

4adfce5

actually, take patch from torchaudio repo

346e989

clean up bld.bat

022c42a

use environment variables directly

3d0239e

clean up build.sh

c49ed08

use build folder

973d5ca

h-vetinari requested review from 183amir, alexbw and sdvillal as code owners July 25, 2022 20:32

h-vetinari mentioned this pull request Jul 25, 2022

WIP: package torchaudio conda-forge/staged-recipes#17082

Closed

6 tasks

mmcauliffe reviewed Jul 25, 2022


h-vetinari mentioned this pull request Jul 25, 2022

RFC: The future of Kaldi compliance module pytorch/audio#1269

Open

build only shared libs

39f5ca8

mmcauliffe approved these changes Jul 25, 2022


h-vetinari added 2 commits July 25, 2022 23:34

clean up tests

37fdbbf

the magic sauce to get a .lib

5506a3f


h-vetinari commented Jul 25, 2022

conda-forge-linter commented Jul 25, 2022

mmcauliffe Jul 25, 2022

h-vetinari Jul 25, 2022

mthrok Jul 25, 2022

h-vetinari Jul 25, 2022

mthrok Jul 25, 2022

h-vetinari Jul 25, 2022

h-vetinari Jul 25, 2022

mmcauliffe commented Jul 25, 2022
