Scripts to process Festcat, Google TTS and Common Voice data so that it is compatible with training modern TTS architectures.
Requirements: sox, ffmpeg
- Clone this repository:
git clone git@github.com:projecte-aina/festcat-process.git
- Open the shell script festcat_processing.sh and modify the following variables:
export SPEAKER_NAME=... # Speaker name or ID
export EXTRACT_PATH=... # Absolute path to the script "extract_festcat.py"
export WAVS_PATH=... # Path to where the "upc_ca_(speaker_id)_raw" folder is located. (It must end with /)
export UTTERANCE_PATH=... # Path to where the "upc_ca_(speaker_id)_utt" folder is located. (It must end with /)
- Run the shell script festcat_processing.sh from the directory where "upc_ca_(speaker_id)_raw" and "upc_ca_(speaker_id)_utt" are located.
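Before running the script, it can help to confirm that the paths follow the rules above. The following is a minimal sketch, not part of the repository: the helper names `ends_with_slash` and `check_festcat_layout` are hypothetical, and it assumes that the speaker ID replaces "(speaker_id)" literally in the folder names.

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check for the Festcat variables.
# Helper names are hypothetical; they are not provided by the repository.

# Succeeds only if the given path ends with "/".
ends_with_slash() {
    case "$1" in
        */) return 0 ;;
        *)  return 1 ;;
    esac
}

# Checks the trailing slashes and that both Festcat folders exist.
# Arguments: speaker ID, WAVS_PATH, UTTERANCE_PATH
# Assumption: folders are named upc_ca_<speaker ID>_raw and upc_ca_<speaker ID>_utt.
check_festcat_layout() {
    local speaker="$1" wavs_path="$2" utt_path="$3"
    local status=0
    ends_with_slash "$wavs_path" || { echo "WAVS_PATH must end with /" >&2; status=1; }
    ends_with_slash "$utt_path" || { echo "UTTERANCE_PATH must end with /" >&2; status=1; }
    [ -d "${wavs_path}upc_ca_${speaker}_raw" ] || { echo "missing folder: ${wavs_path}upc_ca_${speaker}_raw" >&2; status=1; }
    [ -d "${utt_path}upc_ca_${speaker}_utt" ] || { echo "missing folder: ${utt_path}upc_ca_${speaker}_utt" >&2; status=1; }
    return "$status"
}
```

Once the variables are exported, a call such as `check_festcat_layout "$SPEAKER_NAME" "$WAVS_PATH" "$UTTERANCE_PATH"` returns a non-zero status and prints the failing checks to stderr if anything is off.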
- Clone this repository:
git clone git@github.com:projecte-aina/festcat-process.git
- Open the shell script google_tts_processing.sh and modify the following variables:
export SPEAKER_NAME=... # Speaker name or ID
export EXTRACT_PATH=... # Absolute path to the script "extract_google_tts.py"
export WAVS_PATH=... # Path to where the "ca_es_(speaker_id)" folder is located. (It must end with /)
export TSV_PATH=... # Path to where the "line_index_(speaker_id).tsv" file is located. (It must end with /)
- Run the shell script google_tts_processing.sh from the directory where "ca_es_(speaker_id)" and "line_index_(speaker_id).tsv" are located.
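A quick consistency check between the index file and the audio folder can catch missing recordings early. The sketch below rests on two assumptions that are not stated in this README: that the first tab-separated column of "line_index_(speaker_id).tsv" is the file ID, and that each recording is stored as `<file ID>.wav` directly inside the "ca_es_(speaker_id)" folder. The function name `list_missing_wavs` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list the IDs in the index file that have no matching .wav file.
# Assumptions (not confirmed by the repository):
#   - column 1 of the TSV is the file ID, without extension
#   - audio files are named <ID>.wav inside the given folder
# Arguments: path to the TSV file, path to the wav folder (ending with /)
list_missing_wavs() {
    local tsv="$1" wav_dir="$2" id rest
    # The "|| [ -n "$id" ]" part keeps a last line that has no trailing newline.
    while IFS="$(printf '\t')" read -r id rest || [ -n "$id" ]; do
        [ -n "$id" ] || continue
        [ -f "${wav_dir}${id}.wav" ] || printf '%s\n' "$id"
    done < "$tsv"
}
```

An empty output means every indexed utterance has an audio file under the assumed naming scheme.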
- Clone this repository:
git clone git@github.com:projecte-aina/festcat-process.git
- Open the shell script common_voice_processing.sh and modify the following SBATCH directives and variables:
#SBATCH --job-name=... # Set a name for the Job
#SBATCH -D .
#SBATCH --output=.../common_voice.out # Path to ".out" log
#SBATCH --error=.../common_voice.err # Path to ".err" log
#SBATCH --gres=gpu:0
#SBATCH --nodes=1
#SBATCH -c 30
#SBATCH --time=2-0:00:00
export EXTRACT_PATH=".../extract_common_voice.py" # Absolute path to the script "extract_common_voice.py".
export TSV_PATH=".../validated.tsv" # Absolute path to the file "validated.tsv".
export CV_PATH=".../ca/" # Path to where the "clips" folder is located. (It must end with /)
export PROCESS_AUDIO="True" # "True" to process audio files, "False" otherwise.
export N_P="50" # Number of processes when processing audio files.
export SUMMARY="True" # If set to "True" it outputs .tsv files with a summary of the dataset.
export SPEAKERS_ID="[id_1, ..., id_n]" # List of speaker names or IDs to process. If no list is passed, all speakers are processed.
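- Run the shell script common_voice_processing.sh (the #SBATCH directives suggest it is meant to be submitted to a Slurm cluster, for example with sbatch).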
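When choosing values for SPEAKERS_ID, it may be useful to see how many clips each speaker has. The sketch below is independent of the repository scripts: it assumes that "validated.tsv" starts with a header row containing a "client_id" column, as in Common Voice releases, and the function name `count_clips_per_speaker` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count clips per speaker in a Common Voice TSV file.
# Assumption: the first row is a header that contains a "client_id" column.
# Output: one "<count><TAB><client_id>" line per speaker, largest count first.
count_clips_per_speaker() {
    local tsv="$1"
    awk -F '\t' '
        NR == 1 {
            # Locate the client_id column by name rather than by position.
            for (i = 1; i <= NF; i++) if ($i == "client_id") col = i
            if (!col) { print "no client_id column found" > "/dev/stderr"; exit 1 }
            next
        }
        { count[$col]++ }
        END { for (id in count) printf "%d\t%s\n", count[id], id }
    ' "$tsv" | sort -k1,1nr -k2,2
}
```

For example, `count_clips_per_speaker "$TSV_PATH" | head` lists the speakers with the most validated clips, whose IDs can then be copied into SPEAKERS_ID.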