This repository contains a CNN architecture to classify 13 Indian languages from their spoken utterances. Short 2-second utterances have also been classified.
The model is not very complex but still performs well, giving an accuracy of around 70% for 5-second utterances and 66.7% for 2-second ones. Performance heatmaps, which help identify confused language pairs, are also shown below.
The dataset used in this repo is described in this paper.
- python3
- librosa (to generate mel spectrograms)
- numpy
- sklearn
- tensorflow-gpu 1.10.0
- keras 2.2.4
A single Nvidia GeForce GTX 1080 Ti GPU was used for training (a GPU is recommended); it was available through my institute's high-performance computing cluster, which uses Slurm as its workload manager.
- Make a folder for your dataset, inside which each language has its own sub-folder containing its wav files. Then change the value of the data_folder variable in all feature-extraction scripts.
- Then run feature extraction to generate mel spectrograms (a minimal sketch is given after this list).
- Then use the classifier notebook to train the CNN models. This notebook saves the trained model in hdf5 format.
- You can use the classify.py script to classify any utterance by typing the following command: `python3 classify.py path_file`, where path_file is the path to the file you want to classify.
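A minimal sketch of the feature-extraction step, assuming one sub-folder per language and spectrograms saved as .npy arrays; the folder names, sample rate, and number of mel bands below are assumptions and may differ from the repo's actual scripts:

```python
import os
import numpy as np
import librosa

data_folder = "dataset"      # one sub-folder per language, containing wav files (assumed layout)
output_folder = "features"   # where the spectrograms are written (hypothetical path)

for language in os.listdir(data_folder):
    lang_dir = os.path.join(data_folder, language)
    if not os.path.isdir(lang_dir):
        continue
    os.makedirs(os.path.join(output_folder, language), exist_ok=True)
    for fname in os.listdir(lang_dir):
        if not fname.endswith(".wav"):
            continue
        # load the utterance and compute a log mel spectrogram (parameters are assumptions)
        y, sr = librosa.load(os.path.join(lang_dir, fname), sr=16000)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
        log_mel = librosa.power_to_db(mel)
        np.save(os.path.join(output_folder, language, fname.replace(".wav", ".npy")), log_mel)
```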
Mel spectrogram = a spectrogram with its frequency axis mapped to the mel scale. See
Since a mel spectrogram is a 2-dimensional signal, we can use CNNs just as we do for image classification.
I have used two convolution + pooling + dropout blocks, followed by 2 dense layers and a softmax output; all other activations are ReLU. For a detailed description of the model, please see the model summary in the classification notebook.
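A minimal Keras sketch of this layout (not the exact model): the filter counts, kernel sizes, dense-layer widths, and input shape below are assumptions, so the notebook's model summary remains the authoritative description.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

num_languages = 13
input_shape = (64, 157, 1)   # (mel bands, time frames, channels) -- assumed, depends on feature extraction

model = Sequential([
    # first convolution + pooling + dropout block
    Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    # second convolution + pooling + dropout block
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    # two dense layers followed by the softmax output over the 13 languages
    Flatten(),
    Dense(256, activation="relu"),
    Dense(128, activation="relu"),
    Dense(num_languages, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
# after training, the notebook stores the model in hdf5 format, e.g. model.save("model.hdf5")
```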
We can clearly see in the heatmap below that Gujarati-Marathi is the most confused language pair for our classifier.
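As a rough illustration, such a heatmap can be produced from the test-set confusion matrix with sklearn and matplotlib; model, X_test, y_test, and languages are placeholder names rather than the repo's actual variables:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

y_pred = np.argmax(model.predict(X_test), axis=1)   # predicted class indices
y_true = np.argmax(y_test, axis=1)                  # true class indices (labels assumed one-hot)
cm = confusion_matrix(y_true, y_pred)

plt.figure(figsize=(8, 8))
plt.imshow(cm, cmap="hot", interpolation="nearest")  # bright off-diagonal cells mark confused pairs
plt.colorbar()
plt.xticks(range(len(languages)), languages, rotation=90)
plt.yticks(range(len(languages)), languages)
plt.xlabel("Predicted language")
plt.ylabel("True language")
plt.show()
```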