End to End Speech Recognition with Pykaldi/Deepspeech, webrtcvad, precise

Introduction

This repository is an end-to-end speech recognition toolkit built on open source libraries: PyKaldi, DeepSpeech, WebRTC VAD, and Mycroft Precise.

Installation

Install Anaconda

conda create --name speech python=3.6
conda activate speech

Install PyKaldi
```
conda install -c pykaldi pykaldi
```

Install Deepspeech

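pip3 install deepspeech # Use deepspeech-gpu instead if you want CUDA support
# Download the pre-trained model and scorer
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/deepspeech-0.7.4-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/deepspeech-0.7.4-models.scorer
# Download and extract the example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/audio-0.7.4.tar.gz
tar xvf audio-0.7.4.tar.gz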

Install WebrtcVAD
```
pip install webrtcvad
```

Install mycroft-precise

ARCH=x86_64
wget https://github.com/MycroftAI/precise-data/raw/dist/$ARCH/precise-engine.tar.gz
tar xvf precise-engine.tar.gz
sudo apt-get install portaudio19-dev
pip install pyaudio
pip install precise-runner

Steps

Download models

Download the Kaldi model from here
Update the hard-coded paths in main.py to point to your downloaded models (you might have to create online.conf and ivector_extractor.conf)
Precise models have to be trained yourself (data can be obtained here)

Train the WakeWord Detector using the following steps:

Activate your anaconda environment
Record your audio samples using
```
precise-collect
```
If you record by other means, convert the samples to 16 kHz, mono, 16-bit PCM WAV files:
```
ffmpeg -i input.mp3 -acodec pcm_s16le -ar 16000 -ac 1 output.wav
```
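
Because Precise expects 16 kHz, mono, 16-bit PCM input, it can help to check converted files before training. The following is a minimal sketch using only Python's standard `wave` module. The names `is_precise_ready` and `write_silence` are illustrative assumptions and are not part of Precise or this repository.

```python
import wave


def is_precise_ready(path):
    """Return True if the WAV file is 16 kHz, mono, 16-bit PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000   # 16 kHz sample rate
            and wav.getnchannels() == 1   # single channel
            and wav.getsampwidth() == 2   # 2 bytes per sample = 16-bit
        )


def write_silence(path, rate=16000, channels=1, seconds=1):
    """Hypothetical helper: write a silent 16-bit WAV file for demonstration."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * channels * seconds)
```

Running `is_precise_ready` over every file in the dataset folder before training is a cheap way to catch recordings that were never converted.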

Arrange the recordings in the following folder structure:

hey-computer/
|
+-- wake-word/
|   +-- hey-computer.00.wav
|   +-- hey-computer.01.wav
|   +-- hey-computer.02.wav
|   +-- hey-computer.03.wav
|   +-- hey-computer.04.wav
|   +-- hey-computer.05.wav
|   +-- hey-computer.06.wav
|   +-- hey-computer.07.wav
|   +-- hey-computer.08.wav
+-- not-wake-word/
+-- test/
    |
    +-- wake-word/
    |   +-- hey-computer.09.wav
    |   +-- hey-computer.10.wav
    |   +-- hey-computer.11.wav
    |   +-- hey-computer.12.wav
    +-- not-wake-word/
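
The layout above can also be created from a flat folder of recordings. The sketch below is one possible way to do that. The function name `build_dataset` and the 0.3 test fraction are assumptions, chosen only so that 13 recordings split 9/4 as in the example tree.

```python
import shutil
from pathlib import Path


def build_dataset(recordings_dir, dataset_dir, test_fraction=0.3):
    """Copy wake-word recordings into the train/test layout shown above.

    Returns a (train_count, test_count) tuple.
    """
    dataset = Path(dataset_dir)
    train_wake = dataset / "wake-word"
    test_wake = dataset / "test" / "wake-word"
    # Create all four folders, including the initially empty negative sets.
    for folder in (train_wake, dataset / "not-wake-word",
                   test_wake, dataset / "test" / "not-wake-word"):
        folder.mkdir(parents=True, exist_ok=True)

    wavs = sorted(Path(recordings_dir).glob("*.wav"))
    n_test = round(len(wavs) * test_fraction)
    split = len(wavs) - n_test
    # The first `split` files go to training, the remainder to the test set.
    for i, wav in enumerate(wavs):
        target = train_wake if i < split else test_wake
        shutil.copy2(wav, target / wav.name)
    return split, n_test
```

For example, `build_dataset("recordings", "hey-computer")` would populate `hey-computer/` from a hypothetical `recordings/` folder; the `not-wake-word` folders are left empty to be filled with negative samples.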

Once the data is ready, train the model for 60 epochs:

precise-train -e 60 hey-computer.net hey-computer/

You can evaluate the trained model using precise-test:

precise-test hey-computer.net hey-computer/

At this stage the accuracy will likely be low and the number of false activations high. To improve this, augment the data with background noise:

mkdir -p data/random
cd data/random
wget http://downloads.tuxfamily.org/pdsounds/pdsounds_march2009.7z
7z x pdsounds_march2009.7z # Install p7zip if not yet installed
cd ../../
SOURCE_DIR=data/random/mp3
DEST_DIR=data/random
for i in "$SOURCE_DIR"/*.mp3; do
   echo "Converting $i..."
   fn=${i##*/}
   ffmpeg -i "$i" -acodec pcm_s16le -ar 16000 -ac 1 -f wav "$DEST_DIR/${fn%.*}.wav"
done
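
If you prefer to drive the conversion from Python, the shell loop above can be sketched as follows. It assumes `ffmpeg` is on the PATH and reuses the same flags; the function names `ffmpeg_command` and `convert_all` are illustrative and not part of this repository.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(mp3_path, dest_dir):
    """Build the ffmpeg call that converts one MP3 to 16 kHz mono 16-bit WAV."""
    mp3_path = Path(mp3_path)
    wav_path = Path(dest_dir) / (mp3_path.stem + ".wav")
    return ["ffmpeg", "-i", str(mp3_path), "-acodec", "pcm_s16le",
            "-ar", "16000", "-ac", "1", "-f", "wav", str(wav_path)]


def convert_all(source_dir="data/random/mp3", dest_dir="data/random"):
    """Convert every MP3 in source_dir; requires ffmpeg to be installed."""
    for mp3 in sorted(Path(source_dir).glob("*.mp3")):
        print(f"Converting {mp3}...")
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(ffmpeg_command(mp3, dest_dir), check=True)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.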

Fine-tune your model with the augmented data

precise-train-incremental hey-computer.net hey-computer/

You can test the accuracy of your system using:

precise-test hey-computer.net hey-computer/

Convert your model to a TensorFlow model:
```
precise-convert hey-computer.net
```
To test the model from Python, use the sample_precise.py file: change the model path so that it points to your converted model, then run the script.
```
conda activate speech
python sample_precise.py
```
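
For orientation, a wake-word listener built on the `precise-runner` package typically looks like the sketch below. This is not the content of sample_precise.py. It is a minimal example that assumes the engine was extracted to `precise-engine/precise-engine` and the converted model is `hey-computer.pb`. The `ActivationCounter` class and `listen` function are hypothetical names introduced for illustration.

```python
from time import sleep


class ActivationCounter:
    """Hypothetical callback object that counts wake-word activations."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        print(f"Wake word detected ({self.count})")


def listen(model_path="hey-computer.pb",
           engine_path="precise-engine/precise-engine", seconds=30):
    """Listen on the microphone for `seconds` and return the activation count."""
    # Imported here so this module loads even without precise-runner installed.
    from precise_runner import PreciseEngine, PreciseRunner

    counter = ActivationCounter()
    engine = PreciseEngine(engine_path, model_path)
    runner = PreciseRunner(engine, on_activation=counter)
    runner.start()   # begins listening in the background
    sleep(seconds)   # keep the main thread alive while listening
    runner.stop()
    return counter.count
```

Calling `listen()` and saying the wake word a few times gives a quick sanity check of the model before wiring it into the full pipeline.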

Run the main code to test the pipeline

conda activate speech
python main.py

Using the API

The simplest way is to create a SpeechRecon object and call its run method.

The constructor takes a record argument, which can be set to True or False as required.

from main import SpeechRecon
speech_pipeline = SpeechRecon(record=False)
speech_pipeline.run()

Results

Authors

Prajwal Rao
