This project was completed during the spring term for my Machine Learning module, as part of my MSc in Data Science and AI at Queen Mary, University of London.
The NLP project consists of an ML pipeline that takes human voice recordings as input and classifies whether or not each recording belongs to a specific participant. It uses several classification models to do so, and it does appear to succeed at the classification task. There is room for improvement: the main change would be to build a less imbalanced dataset.
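To illustrate why the imbalance matters for a "this participant or not" task, here is a minimal sketch in plain Python. It is not taken from the project code: the label counts are made up, and the helper names `accuracy` and `balanced_accuracy` are hypothetical. It compares plain accuracy with balanced accuracy (the mean of per-class recall) for a baseline that always predicts the majority class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the samples whose true label is this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Assumed, made-up labels: 1 = target participant, 0 = anyone else.
y_true = [1] * 5 + [0] * 95
# A trivial baseline that always predicts "not the participant".
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

The baseline never identifies the participant, yet plain accuracy still looks strong. On an imbalanced dataset, a metric such as balanced accuracy or per-class recall is likely to give a more honest picture of how well the models perform.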
The dataset used for this project is available on Kaggle: https://www.kaggle.com/datasets/jesusrequena/mlend-london-sounds