Dataset synthesis step failing #8
Comments
Could you solve the issue? |
I think it was a typo in the docs and we were supposed to run the similarly named .py file, but there are some other errors with that. Still debugging |
I've had a few issues getting the dataset built as well. This is what I have had to do so far:
|
Exactly the same here, except that rather than renaming the path, I just moved the raw files up a level. I started running |
As an update, the synthesizer takes quite a while depending on your machine. I'm using an AMD Ryzen Threadripper processor with the raw files loaded onto a native SSD, and it took about 115 hrs just to generate the training_set. It seems the validation set will take roughly the same. @A-Telfer I'm not sure what you mean by "no progress bar," as I periodically saw output from the synthesizer script indicating when it had to retry synthesizing some of the audio files:
I believe that it should work on a partial dataset, based on the information given during the orientation. Is this not the output you saw @A-Telfer? For me, the training_set completed with the following properties: 180,003 items, totalling 172.8 GB. Not entirely sure if this is the expected output of the synth script. |
Hi all, I have fixed the typo in the readme. As you already noted, the readme pointed to the wrong file name. Assuming you have downloaded your dataset, the commands are:
python noisyspeech_synthesizer.py -root ./data/datasets_fullband/
python noisyspeech_synthesizer.py -root ./data/datasets_fullband/ -is_validation_set true
The synthesis does take a lot of time, and there is no progress bar in the script. A way to monitor the progress is:
These should print the number of samples generated so far in the training and validation sets, respectively. @BujSet 180,003 items looks correct. It's 60k audio samples each for clean, noise, and noisy. |
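Since the script has no progress bar, one rough way to watch progress is to count the audio files written so far. The sketch below is only an illustration, not the command from the readme: the `training_set` and `validation_set` directory names under the `-root` path and the `.wav` extension are assumptions, so adjust them to match your layout.

```python
from pathlib import Path


def count_audio_files(directory, pattern="*.wav"):
    """Count files matching `pattern` anywhere under `directory` (0 if it does not exist)."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for p in directory.rglob(pattern) if p.is_file())


if __name__ == "__main__":
    # Hypothetical output locations; change these to wherever your run writes.
    root = Path("./data/datasets_fullband")
    for name in ("training_set", "validation_set"):
        print(name, count_audio_files(root / name))
```

With 60k samples each for clean, noise, and noisy, a finished set should report roughly 180k files, so the printed count gives a sense of how far along the run is.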
I have librosa v0.10.0 and numpy v1.23.5 and it worked, but in |
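When comparing setups like this, it can help to print the installed versions before running the synthesizer. This is a small sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and it makes no claim about which versions the script actually requires.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["librosa", "numpy"]).items():
        print(f"{name}: {ver or 'not installed'}")
```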
@daevem thanks for putting up this information. Perhaps the
|
Running the dataset synthesis step
... results in this error