This repository has been archived by the owner on May 28, 2019. It is now read-only.

Recreating blizzard baseline #55

Open
ghostcow opened this issue May 8, 2018 · 0 comments
ghostcow commented May 8, 2018

Hi all,

I'm trying to recreate the blizzard baseline model myself, via the following steps:

  1. Downloaded the Blizzard 2011 data, then trimmed the wav files using librosa.effects.trim with top_db=15
  2. Ran extract_feats.py to extract features. Split off 1000 random samples for the validation set.
  3. Trained a model using the following training scheme:
python train.py --data data/nancy_orig_feat --noise 1 --expName nancy_init --seq-len 10 --max-seq-len 1600 --nspk 1 --lr 1e-5 --epochs 10 --visualize && \
python train.py --data data/nancy_orig_feat --noise 1 --expName nancy --seq-len 1000 --max-seq-len 1000 --nspk 1 --lr 1e-4 --epochs 90 --visualize --checkpoint checkpoints/nancy_init/bestmodel.pth
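For reference, the trimming in step 1 can be sketched roughly as below. This is a minimal NumPy approximation of the top_db thresholding that librosa.effects.trim performs; the frame/hop sizes and the exact dB reference here are assumptions for illustration, not librosa's internals, so exact trim points may differ slightly from the library's:

```python
import numpy as np

def trim_silence(y, top_db=15, frame_length=2048, hop_length=512):
    """Drop leading/trailing frames whose RMS energy falls more than
    `top_db` below the signal's peak RMS (roughly what
    librosa.effects.trim does)."""
    # frame-wise RMS energy
    n_frames = max(1, 1 + (len(y) - frame_length) // hop_length)
    rms = np.array([
        np.sqrt(np.mean(y[i * hop_length:i * hop_length + frame_length] ** 2))
        for i in range(n_frames)
    ])
    # dB relative to the peak frame; keep frames above -top_db
    db = 20 * np.log10(np.maximum(rms, 1e-10) / max(rms.max(), 1e-10))
    keep = np.nonzero(db > -top_db)[0]
    if len(keep) == 0:
        return y[:0]
    start = keep[0] * hop_length
    end = min(len(y), keep[-1] * hop_length + frame_length)
    return y[start:end]

# example: half a second of silence, a 1 s tone, half a second of silence
sr = 16000
sig = np.concatenate([
    np.zeros(sr // 2),
    0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr),
    np.zeros(sr // 2),
])
trimmed = trim_silence(sig, top_db=15)
print(len(sig), len(trimmed))  # trimmed is noticeably shorter
```

In practice the actual call was librosa.effects.trim(y, top_db=15), which returns the trimmed signal plus the (start, end) sample indices.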

Note: this scheme was devised by looking at your published args.pth, because the scheme in your README.md did not converge.

The result is a model clearly inferior to your uploaded pretrained one. I've attached 4 samples demonstrating the issue. These are the sentences used:

"Generative adversarial network or variational auto-encoder.",
"Basilar membrane and otolaryngology are not auto-correlations.",
"He has read the whole thing.",
"He reads books."

The samples: samples.zip

What could be wrong? Please help me recreate your baseline.

Thanks
