Deep learning approaches are steadily gaining popularity as an alternative to HMM-based models for speaker identification. Promising results have been obtained with Convolutional Neural Networks (CNNs) fed with raw speech samples or raw spectral features, although this methodology does not fully account for the temporal order in which speech is produced.
DNN-HMM (Deep Neural Network-Hidden Markov Model) is a methodology that combines the statistical modeling power of HMMs with the learning power of deep neural networks. While this technique has seen wide use in the speech recognition field, few studies have applied it to speaker identification tasks.
This study proposes a novel approach to the DNN-HMM methodology for text-independent speaker identification. It uses both convolutional and Long Short-Term Memory (LSTM) networks to extract high-level features from the entire audio and temporal features from each frame, which are then used to predict the emission probabilities of an HMM.
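In hybrid DNN-HMM systems, a common way to turn network outputs into emission scores is to divide the per-frame state posteriors by the state priors, which gives scaled likelihoods that can be plugged into standard HMM decoding. The following is a minimal pure-Python sketch of that step followed by Viterbi decoding. The function names, the two-state toy model, and all numbers are illustrative assumptions and are not taken from this repository, whose actual pipeline may differ.

```python
import math


def scaled_log_likelihoods(posteriors, priors):
    """Turn per-frame posteriors p(s | x_t) into scaled log-likelihoods.

    log p(s | x_t) - log p(s) equals log p(x_t | s) up to a per-frame constant,
    so it can stand in for the HMM emission score.
    """
    return [[math.log(p) - math.log(q) for p, q in zip(frame, priors)]
            for frame in posteriors]


def viterbi(log_emis, log_init, log_trans):
    """Return (best_state_path, best_log_score) for one utterance."""
    n_states = len(log_init)
    # delta[s]: best log score of any path ending in state s at the current frame
    delta = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    backptr = []
    for frame in log_emis[1:]:
        prev, delta, pointers = delta, [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best_prev)
            delta.append(prev[best_prev] + log_trans[best_prev][s] + frame[s])
        backptr.append(pointers)
    # Backtrack from the best final state to recover the state sequence.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy example (all values hypothetical): 4 frames, 2 HMM states.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]  # stand-in for DNN softmax output
priors = [0.5, 0.5]                                            # state priors, e.g. from alignments
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.3), math.log(0.7)]]

log_emis = scaled_log_likelihoods(posteriors, priors)
path, score = viterbi(log_emis, log_init, log_trans)
print(path)  # [0, 0, 1, 1]
```

Working in the log domain avoids numerical underflow on long utterances. The resulting score is a path log-likelihood up to a constant, which is the kind of quantity that could be compared across candidate models.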
Experiments conducted on the TIMIT dataset showed promising results, suggesting that, if properly tuned, the proposed non-sequential architecture may converge faster and perform better than other known methods.
Install the requirements with pip (you may need to run the command with sudo).
#PyPI
pip install -r requirements.txt
First, clone the GitHub repo:
git clone https://github.com/MattiaLimone/dnn-hmm.git
Additionally, you have to install the following library, which probably failed to install during the prerequisites step. It is just a copy of LPCTorch (https://github.com/yliess86/LPCTorch) with updated dependencies.
Use pip to install the dependency from our repository (you may need to run the command with sudo).
pip install https://github.com/Attornado/LPCTorch2/archive/refs/heads/master.zip
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
Distributed under the GNU General Public License v2.0. See LICENSE.txt for more information.
Mattia Limone [LinkedIn profile]
Andrea Terlizzi [Send an email]
Carmine Iannotti [LinkedIn profile]
Luca Strefezza