-1. sometimes flips to 1. when reading FLAC format #274

gbeckers · 2020-07-28T18:53:44Z

I noticed when converting WAV PCM_24 to FLAC PCM_24 (the same goes for PCM_16) that values of -1. in the WAV file are sometimes flipped to 1. when reading the FLAC file. This was in a long sound file, but I managed to reproduce the issue with the code and the sound fragment pasted below. The issue seems to occur when reading the file, because when I open the FLAC file in Audacity it is OK, with no sign flip. This is on Windows, pysoundfile version 0.9.0, libsndfile version 1.0.27.

import numpy as np
import soundfile as sf
print(sf.__version__)
print(sf.__libsndfile_version__)

ar  = np.array([[ 0.99999976,  0.90140069],
       [ 0.99999988,  0.9933517 ],
       [ 0.99999988,  0.99999988],
       [ 0.99999976,  0.99999976],
       [ 0.99999988,  0.99999976],
       [ 0.99999976,  0.99999988],
       [ 0.99999976,  0.99999976],
       [ 0.99999988,  0.99999988],
       [ 0.99999976,  0.99999988],
       [ 0.99999988,  0.99999988],
       [ 0.99999988,  0.99999988],
       [ 0.99999976,  0.99999988],
       [ 0.93490779,  0.96048272],
       [ 0.58039367,  0.87977421],
       [ 0.43389213,  0.85307777],
       [ 0.37074029,  0.88655353],
       [ 0.07136083,  0.94096255],
       [-0.50475669,  0.9776206 ],
       [-0.99999988,  0.99565017],
       [-1.        ,  0.99999988]], dtype='float64')

# sf.write() returns None, so there is nothing to assign
sf.write('test.wav', ar, samplerate=44100, format='WAV', subtype='PCM_16')
sf.write('test.flac', ar, samplerate=44100, format='FLAC', subtype='PCM_16')
print('wav', sf.read('test.wav')[0][-1])
print('flac', sf.read('test.flac')[0][-1]) # -1 flipped to 1!

gbeckers · 2020-07-28T18:56:42Z

Forgot to append the output of the code on my machine:

0.9.0
1.0.27
wav [-1. 0.99996948]
flac [1. 0.99996948]

gbeckers · 2020-07-28T19:59:07Z

Sorry, I see that this was already reported in #265, and that the cause is in libsndfile. I hope this gets fixed, because converting from WAV to FLAC is probably done a lot for lossless compression.

Edit: it even seems to be related to FLAC itself (see https://sourceforge.net/p/flac/bugs/476/). However, this bug does not appear to have a high priority, which I find very surprising. It introduces large artifacts in audio files without warning. I would warn people not to use FLAC for important applications. The fact that it is seen as lossless may lead people to delete their WAV files after converting (I almost did this, glad I verified first...).

bastibe · 2020-07-29T07:14:37Z

Thank you for reporting this error! This is troubling indeed. Let's hope they publish a fix soon.

gbeckers · 2020-07-29T11:55:58Z

In the meantime, for those who want to avoid being bitten by this bug: when reading FLAC data in SoundFile, don't use the default dtype='float64' (which causes the buggy normalization to [-1,1) in the underlying libsndfile). Read it with dtype='int32', which avoids normalization and the bug.
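
A minimal sketch of this workaround, assuming that the int32 samples returned by libsndfile fill the full 32-bit range, so that dividing by 2**31 maps them to [-1, 1). The helper names int32_to_float and read_flac_as_float are made up for illustration and are not part of SoundFile:

```python
def int32_to_float(samples):
    """Scale full-range int32 samples to floats in [-1.0, 1.0).

    Works on NumPy arrays (as returned by sf.read) and on plain ints.
    Assumption: the integers use the whole 32-bit range.
    """
    return samples / 2.0**31


def read_flac_as_float(path):
    """Read a sound file as int32 and normalize in Python instead."""
    # Imported here so int32_to_float() can be used without SoundFile installed.
    import soundfile as sf

    # dtype='int32' avoids the float normalization done inside libsndfile.
    data, samplerate = sf.read(path, dtype='int32')
    return int32_to_float(data), samplerate
```

It could be used as data, samplerate = read_flac_as_float('test.flac'). Keeping the raw int32 data is the safer choice if exact sample values matter, since the division is only there for code that expects floats.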

Having said that, I just tried to verify the above in practice by creating 16+ hours of FLAC data from WAV PCM_24 and comparing the two. I indeed find no numeric differences when using 'int32', whereas 'float64' does lead to normalization sign-flip errors. But I do get "RuntimeError: Internal psf_fseek() failed" in one of the FLAC files when reading somewhere in the middle, whereas reading before or after this spot is fine. I can play the file fine in VLC or open it in Audacity.

mgeier · 2020-08-18T12:28:02Z

This (or a similar) problem has also been mentioned in the libsndfile-devel mailing list on Jun 1, 2020, 12:29 PM. I didn't find a mailing list archive, so I'm pasting the text here:

Hello,

in a recent project I started using flac files read and written with libsndfile.
We soon encountered problems whenever the samples stored are clipped by libsndfile.
After looking into the source code src/flac.c I discovered that the
clipping code is a copy paste of the code used for other formats, however in the flac format case
it does not work because the array type for flac is always int32_t.

Here an example

f2flac16_clip_array (const float *src, int32_t *dest, int count, int normalize)
{       float normfact, scaled_value ;

        normfact = normalize ? (8.0 * 0x1000) : 1.0 ;

        while (--count >= 0)
        {       scaled_value = src [count] * normfact ;
cut ...
                if (CPU_CLIPS_NEGATIVE == 0 && scaled_value <= (-8.0 * 0x1000))
                {       dest [count] = 0x8000 ;
                        continue ;
                        } ;
                dest [count] = lrintf (scaled_value) ;
                } ;

This line

dest [count] = 0x8000 ;

would produce a negative value if dest would be int16_t
but here dest is int32_t and so the result is positive.
Subsequently, flac does not handle these out of bounds values correctly
and after storing and retrieving the file in float format with libsndfile
we received float values of +10 for values that were initially slightly
below -1.

In case this may be of interest, for my Python interface I have created a patch here

https://github.com/roebel/conda_packages/blob/master/pysndfile/flac.c.patch

Best

-- 
Axel Roebel

Here's a link to the mentioned patch: https://github.com/roebel/conda_packages/blob/master/pysndfile/flac.c.patch
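
The type mismatch described in the quoted mail can be illustrated without libsndfile. This is only a sketch of how the constant 0x8000 is interpreted at two integer widths, using Python's ctypes module; it does not reproduce the actual FLAC encode/decode path, and the variable names are chosen here for illustration:

```python
import ctypes

CLIP_VALUE = 0x8000  # the constant assigned to dest[count] in the quoted code

# Interpreted as a 16-bit signed integer, the bit pattern wraps to the minimum value.
as_int16 = ctypes.c_int16(CLIP_VALUE).value
# Stored in a 32-bit signed integer, the same constant stays positive.
as_int32 = ctypes.c_int32(CLIP_VALUE).value

print(as_int16)  # -32768
print(as_int32)  # 32768

# Dividing by 0x8000 (the 16-bit scale factor from the quoted code)
# shows the sign difference in normalized form.
print(as_int16 / 0x8000)  # -1.0
print(as_int32 / 0x8000)  # 1.0
```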

gbeckers mentioned this issue Jul 29, 2020

Unable to complete round trip tests for a FLAC File using libsnd library libsndfile/libsndfile#504

Closed

mgeier mentioned this issue Aug 5, 2020

unable to write FLAC files with SoundFile bastibe/libsndfile-binaries#2

Open

carolineechen mentioned this issue Jun 23, 2021

[BC-Breaking] Default to PCM_16 for flac on soundfile backend pytorch/audio#1604

Merged

RetroCirce mentioned this issue Feb 14, 2023

RuntimeError: Internal psf_fseek() failed. LAION-AI/CLAP#61

Closed

rlangman mentioned this issue Sep 21, 2023

[TTS] Read audio as int32 to avoid flac read errors NVIDIA/NeMo#7477

Merged

twdragon mentioned this issue Apr 24, 2024

soundfile FLAC bitflip issue dscripka/openWakeWord#163

Open

tompollard mentioned this issue Jul 10, 2024

Broken with libFLAC 1.4.0 through 1.4.2 ("internal psf_fseek() failed") MIT-LCP/wfdb-python#488

Open

m1cha3lya1r mentioned this issue Nov 20, 2024

flac returning ValueError: array is too big #361

Open
