This repository has been archived by the owner on Dec 3, 2024. It is now read-only.

Dataset synthesis step failing #8

Open
A-Telfer opened this issue Apr 30, 2023 · 8 comments

@A-Telfer

A-Telfer commented Apr 30, 2023

Running the dataset synthesis step

python microsoft_dns/noisyspeech_synthesizer.cfg -root ./

... result in this error

  File "microsoft_dns/noisyspeech_synthesizer.cfg", line 35
    audioformat: *.wav

@kazi-m22

kazi-m22 commented May 1, 2023

Were you able to solve the issue?

@A-Telfer
Author

A-Telfer commented May 1, 2023

I think it was a typo in the docs and we were supposed to run the similarly named .py file, but that raises some other errors. Still debugging.
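
For context on why the interpreter rejects that line: the .cfg file holds settings rather than Python source, so `audioformat: *.wav` cannot be parsed as a Python statement. A minimal sketch, assuming the file uses INI-style sections (the section name `noisy_speech` and the `sampling_rate` value are illustrative and not copied from the repository), shows the same line being read as configuration with the standard-library `configparser`:

```python
import configparser

# Illustrative stand-in for the .cfg contents. Only the `audioformat: *.wav`
# line comes from the traceback above; the section name and the other key
# are assumptions made for this example.
SAMPLE_CFG = """
[noisy_speech]
sampling_rate: 16000
audioformat: *.wav
"""


def read_settings(text: str, section: str = "noisy_speech") -> dict:
    """Parse INI-style text and return one section as a plain dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[section])


def is_valid_python(text: str) -> bool:
    """Return True if the text compiles as Python source, False on SyntaxError."""
    try:
        compile(text, "<cfg>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(read_settings(SAMPLE_CFG))    # settings are readable as configuration
    print(is_valid_python(SAMPLE_CFG))  # but the same text is not valid Python
```

In other words, the .py script is the thing to run, and the .cfg file is presumably read by it as input.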

@BujSet

BujSet commented May 3, 2023

I've had a few issues getting the dataset built as well. This is what I have had to do so far:

  1. I modified the origin paths in the noisyspeech_synthesizer.py file (these lines), since the default download script extracts the raw files to /microsoft_dns/datasets_fullband/datasets_fullband/.
  2. I have multiple versions of Python in my environment, so for me the command to run is
    python noisyspeech_synthesizer.py -root ./
  3. However, when I first ran that command, I got many versioning errors. After a little digging, it seems that the latest version of librosa is not compatible with the latest version of numpy. I had to downgrade numpy from 1.24.3 to 1.23.5 and librosa from 0.10.0 to 0.8.1.
  4. After that, running the command in step 2 generates the training and validation files (I think). This is currently in progress for me, but I'll follow up if this completes without error.

@A-Telfer
Author

A-Telfer commented May 4, 2023

Exactly the same here, except that rather than changing the paths I just moved the raw files up a level.
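
Moving the files up a level can be scripted. The following is only a sketch: it assumes the nested folder has the same name as its parent (as in datasets_fullband/datasets_fullband/ mentioned above), and `flatten_nested` is a hypothetical helper, not something from the repository.

```python
import shutil
from pathlib import Path


def flatten_nested(outer: Path) -> int:
    """Move everything in outer/<outer.name>/ up into outer/.

    Returns the number of entries moved, or 0 if there is no nested folder.
    Refuses to overwrite anything that already exists in `outer`.
    """
    inner = outer / outer.name
    if not inner.is_dir():
        return 0
    moved = 0
    for child in list(inner.iterdir()):
        target = outer / child.name
        if target.exists():
            raise FileExistsError(f"{target} already exists, not overwriting")
        shutil.move(str(child), str(target))
        moved += 1
    inner.rmdir()  # the nested folder is empty at this point
    return moved


if __name__ == "__main__":
    # Path taken from the comment above; adjust it to your own checkout.
    print(flatten_nested(Path("microsoft_dns/datasets_fullband")))
```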

I started running step 4 on my laptop and gave up waiting after 60,000+ or so, since there was no progress bar (the code uses while loops, so it's not immediately clear how long it would take), and I started thinking it might not work on a partial dataset download.

@BujSet

BujSet commented May 9, 2023

As an update, the synthesizer takes quite a while depending on your machine. I'm using an AMD Ryzen Threadripper processor with the raw files loaded onto a native SSD, and it took about 115 hours just to generate the training_set. It seems like the validation set will take roughly the same.
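
As a rough back-of-the-envelope check on that figure, the sketch below converts a total run time into seconds per file and estimates the time remaining from a partial count. The 115 hours and 60,000 files come from this thread; the helper names are hypothetical, and the estimate assumes a roughly constant rate, which may not hold in practice.

```python
def seconds_per_file(total_hours: float, n_files: int) -> float:
    """Average wall-clock seconds spent per synthesized file."""
    return total_hours * 3600 / n_files


def remaining_hours(files_done: int, elapsed_hours: float, n_files: int) -> float:
    """Linear estimate of the hours left, assuming a constant rate."""
    if files_done <= 0:
        raise ValueError("need at least one finished file to estimate a rate")
    hours_per_file = elapsed_hours / files_done
    return hours_per_file * (n_files - files_done)


if __name__ == "__main__":
    # Numbers reported above: about 115 hours for 60,000 training files.
    print(seconds_per_file(115, 60_000))
    # Example: halfway through after 57.5 hours.
    print(remaining_hours(30_000, 57.5, 60_000))
```

With the numbers reported above, this works out to roughly 6.9 seconds per file.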

@A-Telfer I'm not sure what you mean by "no progress bar," as I periodically saw output from the synthesizer script indicating when it had to retry synthesizing some of the audio files:

Number of files to be synthesized: 60000
Start idx: 0
Stop idx: 59999
Generating synthesized data in ./
Warning: File #5 has unexpected clipping, returning without writing audio to disk
Warning: File #29 has unexpected clipping, returning without writing audio to disk
...
Warning: File #1114 has unexpected clipping, returning without writing audio to disk
Warning: File #1130 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #1151 has unexpected clipping, returning without writing audio to disk
Warning: File #1164 has unexpected clipping, returning without writing audio to disk
...
Warning: File #29071 has unexpected clipping, returning without writing audio to disk
Warning: File #29090 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #29107 has unexpected clipping, returning without writing audio to disk
Warning: File #29118 has unexpected clipping, returning without writing audio to disk
...
Warning: File #34891 has unexpected clipping, returning without writing audio to disk
Warning: File #34891 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #34919 has unexpected clipping, returning without writing audio to disk
Warning: File #34925 has unexpected clipping, returning without writing audio to disk
...
Warning: File #44648 has unexpected clipping, returning without writing audio to disk
Warning: File #44661 has unexpected clipping, returning without writing audio to disk
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Found exception
Input signal length=0 is too small to resample from 48000->16000
Trying again
Warning: File #44699 has unexpected clipping, returning without writing audio to disk
Warning: File #44711 has unexpected clipping, returning without writing audio to disk
...
Warning: File #59935 has unexpected clipping, returning without writing audio to disk
Warning: File #59962 has unexpected clipping, returning without writing audio to disk
Warning: File #59989 has unexpected clipping, returning without writing audio to disk
Warning: File #59991 has unexpected clipping, returning without writing audio to disk
Warning: File #59997 has unexpected clipping, returning without writing audio to disk
Of the 466391 clean speech files analyzed, 2.6% had clipping, and 46.8% had low activity (below 60.0% active percentage)
Of the 221062 noise files analyzed, 18.4% had clipping, and 0.0% had low activity (below 0.0% active percentage)
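
If the output is redirected to a file, a log like the one above can be summarized with a few lines of Python. This is a sketch that relies only on the two message formats visible in the log; the `summarize` helper is hypothetical, and treating the last reported file index as a progress hint is an assumption about how the script numbers its files.

```python
import re

# Patterns copied from the messages shown in the log above.
CLIPPING = re.compile(r"^Warning: File #(\d+) has unexpected clipping")
RESAMPLE_ERROR = re.compile(r"^Input signal length=0 is too small to resample")


def summarize(lines):
    """Count clipping warnings and resample retries in synthesizer output."""
    clipping = 0
    retries = 0
    last_index = None
    for raw in lines:
        line = raw.strip()
        match = CLIPPING.match(line)
        if match:
            clipping += 1
            last_index = int(match.group(1))
        elif RESAMPLE_ERROR.match(line):
            retries += 1
    return {
        "clipping_warnings": clipping,
        "resample_retries": retries,
        "last_clipped_index": last_index,  # rough hint of how far the run got
    }


if __name__ == "__main__":
    sample = [
        "Warning: File #5 has unexpected clipping, returning without writing audio to disk",
        "Found exception",
        "Input signal length=0 is too small to resample from 48000->16000",
        "Trying again",
        "Warning: File #29 has unexpected clipping, returning without writing audio to disk",
    ]
    print(summarize(sample))
```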

I believe that it should work on a partial dataset, based on the information given during the orientation. Is this not the output you saw @A-Telfer?

For me, the training_set completed with the following properties: 180,003 items, totalling 172.8 GB. Not entirely sure if this is the expected output of the synth script.

@bamsumit
Contributor

Hi all, I have fixed the typo in the readme. As you already noted, it should have been noisyspeech_synthesizer.py, not .cfg

Assuming you have downloaded your dataset in ./data/datasets_fullband/, the commands to execute are

python noisyspeech_synthesizer.py -root ./data/datasets_fullband/
python noisyspeech_synthesizer.py -root ./data/datasets_fullband/ -is_validation_set true

The synthesis does take a lot of time, and there is no progress bar in the script. A way to monitor the progress is:

ls -l data/datasets_fullband/training_set/clean/*.wav | wc -l
ls -l data/datasets_fullband/validation_set/clean/*.wav | wc -l

These print the number of samples generated so far in the training and validation sets, respectively. @BujSet 180,003 items looks correct. It's 60k audio samples each for clean, noise, and noisy.
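
The same check can be done from Python, which avoids shell glob limits on very large directories. A minimal sketch: the `clean` folder name comes from the commands above, while the `noise` and `noisy` folder names are assumptions based on the three sample types mentioned, and `count_wavs` is a hypothetical helper.

```python
from pathlib import Path


def count_wavs(root, split="training_set", subdirs=("clean", "noise", "noisy")):
    """Count the .wav files in each subfolder of root/split.

    Folders that do not exist yet are reported as 0.
    """
    counts = {}
    for name in subdirs:
        folder = Path(root) / split / name
        if folder.is_dir():
            counts[name] = sum(1 for _ in folder.glob("*.wav"))
        else:
            counts[name] = 0
    return counts


if __name__ == "__main__":
    root = "data/datasets_fullband"  # same location as in the commands above
    for split in ("training_set", "validation_set"):
        print(split, count_wavs(root, split))
```

Each count should approach 60,000 as the corresponding run nears completion.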

@daevem

daevem commented Jun 6, 2023

> I've had a few issues getting the dataset built as well. This is what I have had to do so far:
>
>   1. I modified the origin paths in the noisyspeech_synthesizer.py file (these lines), since the default download script extracts the raw files to /microsoft_dns/datasets_fullband/datasets_fullband/.
>   2. I have multiple versions of Python in my environment, so for me the command to run is
>     python noisyspeech_synthesizer.py -root ./
>   3. However, when I first ran that command, I got many versioning errors. After a little digging, it seems that the latest version of librosa is not compatible with the latest version of numpy. I had to downgrade numpy from 1.24.3 to 1.23.5 and librosa from 0.10.0 to 0.8.1.
>   4. After that, running the command in step 2 generates the training and validation files (I think). This is currently in progress for me, but I'll follow up if this completes without error.

I have librosa v0.10.0 and numpy v1.23.5 and it worked, but in microsoft_dns/noisyspeech_synthetizer_singleprocess.py line 90 I had to change librosa.resample(arg1, arg2, arg3) to librosa.resample(input_audio, orig_sr=fs_input, target_sr=fs_output).
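
This is consistent with the sample-rate arguments having become keyword-only in newer librosa releases. The sketch below does not import librosa; `resample_stub` is a hypothetical stand-in with a keyword-only signature (it just keeps every Nth sample and is not a real resampler), used only to show why a positional call fails while the keyword form works.

```python
def resample_stub(y, *, orig_sr, target_sr):
    """Hypothetical stand-in with keyword-only sample-rate arguments.

    It keeps every (orig_sr // target_sr)-th sample with no filtering,
    so it is only a placeholder for a real resampler.
    """
    return y[:: orig_sr // target_sr]


def try_positional(audio, fs_input, fs_output):
    """Return the error message from a positional call, or None if it worked."""
    try:
        resample_stub(audio, fs_input, fs_output)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    input_audio = list(range(9))
    # Positional sample rates are rejected by a keyword-only signature.
    print(try_positional(input_audio, 48000, 16000))
    # The keyword form has the same shape as the call quoted above:
    # librosa.resample(input_audio, orig_sr=fs_input, target_sr=fs_output)
    print(resample_stub(input_audio, orig_sr=48000, target_sr=16000))
```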

@bamsumit
Contributor

bamsumit commented Jun 6, 2023

@daevem thanks for posting this information. Perhaps the librosa interface changed at some point. A working combination we have for the current version of the code is:

librosa==0.9.2
numpy==1.23.3
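
To check an environment against those pins before starting a multi-day run, something like the following could be used. It is a sketch built on the standard-library `importlib.metadata`; the helper names are hypothetical, and the pins are simply the two versions listed above.

```python
from importlib import metadata

# Versions listed above as a known-working combination.
PINS = {"librosa": "0.9.2", "numpy": "1.23.3"}


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(installed, pins=PINS):
    """Return {name: (installed, wanted)} for packages that differ from the pins."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    print(mismatches(installed_versions(PINS)))  # empty dict means all pins match
```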
