This code is an implementation of AutoVC. The algorithm is based on the following paper:
- Qian, K., Zhang, Y., Chang, S., Yang, X., & Hasegawa-Johnson, M. (2019). AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss. ICML 2019.

The official code and demo that I referred to are as follows:

Additional references:
The following packages are required:
- torch >= 1.5.0
- tensorboardX >= 2.0
- librosa >= 0.7.2
- matplotlib >= 3.1.3

Optional, for viewing the loss flow in TensorBoard:
- tensorboard >= 2.2.2
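As a quick environment check, the installed versions can be printed with a small script. This is only a sanity-check sketch; it reports versions but does not enforce the minimums above.

```python
# Print the versions of the packages listed above so they can be compared
# against the minimum requirements by eye.
import torch
import tensorboardX
import librosa
import matplotlib

for name, module in [('torch', torch), ('tensorboardX', tensorboardX),
                     ('librosa', librosa), ('matplotlib', matplotlib)]:
    print(f'{name}: {module.__version__}')
```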
- The currently uploaded code is compatible with the following datasets.
- An 'O' mark to the left of a dataset name indicates that the dataset was actually used for the uploaded result.
| Used | Dataset | Dataset address |
|---|---|---|
| O | VCTK | https://datashare.is.ed.ac.uk/handle/10283/2651 |
| O | LibriTTS | https://openslr.org/60/ |
| X | CMU Arctic | http://www.festvox.org/cmu_arctic/index.html |
| X | VoxCeleb1 | http://www.robots.ox.ac.uk/~vgg/data/voxceleb/ |
| X | VoxCeleb2 | http://www.robots.ox.ac.uk/~vgg/data/voxceleb/ |
Before proceeding, please set the pattern, inference, and checkpoint paths in 'Hyper_Parameter.yaml' according to your environment.
- Sound
  - Setting the basic sound parameters.
- Content_Encoder
  - Setting the parameters of the content encoder.
- Style_Encoder
  - Setting the parameters of the style encoder.
  - The encoder is a pre-trained speaker embedding model.
  - All parameters must match the pre-trained speaker embedding.
- Decoder
  - Setting the parameters of the decoder.
- Postnet
  - Setting the parameters of the convolution postnet.
- WaveNet
  - Setting the parameters of the vocoder.
  - This implementation uses a pre-trained Parallel WaveGAN model.
  - If the checkpoint path is `null`, the model does not export wav files (see the sketch after this list).
  - If the checkpoint path is not `null`, all parameters must match the pre-trained Parallel WaveGAN model.
- Train
  - Setting the parameters of training.
  - When the number of speakers in your training dataset is small, I recommend increasing `Train_Pattern/Accumulated_Dataset_Epoch`.
- Inference_Path
  - Setting the inference path.
- Checkpoint_Path
  - Setting the checkpoint path.
- Log_Path
  - Setting the TensorBoard log path.
- Device
  - Setting which GPU device is used in a multi-GPU environment.
  - If using only the CPU, set this to '-1'.
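The sketch below shows one way these settings might be read and applied: it loads `Hyper_Parameter.yaml`, maps `Device` to a torch device (treating '-1' as CPU-only), and reproduces the null-checkpoint behavior of the `WaveNet` group. The key layout used here (top-level `Device`, `Inference_Path`, `Checkpoint_Path`, `Log_Path`, and a `Checkpoint_Path` nested under `WaveNet`) is an assumption for illustration and may differ from the actual file.

```python
import os
import yaml
import torch

# Minimal sketch. The key layout below is assumed for illustration;
# the actual Hyper_Parameter.yaml in this repository may nest keys differently.
with open('Hyper_Parameter.yaml') as f:
    hp = yaml.safe_load(f)

# Treat Device: -1 as CPU-only by hiding every GPU from CUDA.
os.environ['CUDA_VISIBLE_DEVICES'] = str(hp['Device'])
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Running on:     ', device)

print('Inference path: ', hp['Inference_Path'])
print('Checkpoint path:', hp['Checkpoint_Path'])
print('Log path:       ', hp['Log_Path'])

# If the vocoder checkpoint path is null, no wav files are exported.
vocoder_checkpoint = hp.get('WaveNet', {}).get('Checkpoint_Path')
if vocoder_checkpoint is None:
    print('No Parallel WaveGAN checkpoint set: no wav export.')
else:
    print('Wav files will be rendered with the vocoder at', vocoder_checkpoint)
```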
python Pattern_Generate.py [parameters]
At least one of the datasets must be used.
- -vctk
  - Set the path of VCTK. VCTK's patterns are generated.
- -vc1
  - Set the path of VoxCeleb1. VoxCeleb1's patterns are generated.
- -vc2
  - Set the path of VoxCeleb2. VoxCeleb2's patterns are generated.
- -libri
  - Set the path of LibriTTS. LibriTTS's patterns are generated.
- -cmua
  - Set the path of CMU Arctic. CMU Arctic's patterns are generated.
- -vc1t
  - Set the path of the VoxCeleb1 test set. VoxCeleb1's patterns are generated for evaluation.
- -mw
  - The number of threads used to create the patterns.
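For example, with placeholder dataset paths (the paths below are illustrative, not taken from this repository), VCTK and LibriTTS patterns could be generated using 8 threads:

python Pattern_Generate.py -vctk /path/to/VCTK -libri /path/to/LibriTTS -mw 8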
python Train.py -s <int>
- -s <int>
  - The resume step parameter.
  - Default is 0.
  - When this parameter is 0, the model tries to find the latest checkpoint in the checkpoint path.
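For example, the first command below resumes from the latest checkpoint in the checkpoint path; passing a specific step (100000 is only a placeholder) presumably resumes from the checkpoint saved at that step:

python Train.py -s 0
python Train.py -s 100000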
- Currently training....
- Please refer to the demo site:
- Currently training....
- This is the checkpoint after ? steps with a batch size of 2 (? epochs).
- Checkpoint link
- Hyperparameter link