Speech and Audio Processing
Carlos Lizarraga-Celaya edited this page Nov 7, 2024
Develop skills in applying deep learning techniques for speech recognition, synthesis, and audio-based applications.
1. Preprocessing and feature extraction from audio data
2. Implementing neural network architectures for speech tasks
3. Deploying speech-based models in production environments
1. Audio data preprocessing (resampling, normalization, windowing)
2. Feature extraction from audio signals (MFCCs, spectrograms, wav2vec embeddings)
3. Speech recognition using recurrent neural networks and Transformers
4. Text-to-speech and speech synthesis using generative models
5. Audio event detection and classification
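As a rough illustration of the first two topics, the sketch below peak-normalizes a signal, splits it into overlapping Hann-windowed frames, and computes a log-power spectrogram with NumPy only. The function names (`peak_normalize`, `frame_signal`, `log_spectrogram`), the 16 kHz sample rate, and the 25 ms / 10 ms frame and hop sizes are assumptions chosen for the example, not values prescribed by this page. Resampling is left out; in practice it would usually be delegated to a signal-processing library.

```python
import numpy as np


def peak_normalize(signal, eps=1e-12):
    """Scale the signal so its largest absolute sample is approximately 1.0."""
    return signal / (np.max(np.abs(signal)) + eps)


def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]


def log_spectrogram(signal, frame_len=400, hop_len=160, eps=1e-10):
    """Return a (n_frames, frame_len // 2 + 1) log-power spectrogram in dB."""
    frames = frame_signal(peak_normalize(signal), frame_len, hop_len)
    windowed = frames * np.hanning(frame_len)      # taper each frame to reduce leakage
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    return 10.0 * np.log10(power + eps)            # eps avoids log(0)


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.3 * np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)                       # 25 ms frames, 10 ms hop
peak_bin = int(np.argmax(spec.mean(axis=0)))       # strongest frequency bin on average
peak_hz = peak_bin * sample_rate / 400             # bin width = sample_rate / frame_len

print(spec.shape)   # (98, 201)
print(peak_hz)      # 440.0
```

A spectrogram like this is a common starting point for the later topics: MFCCs apply a mel filterbank and a discrete cosine transform on top of it, and many recognition and classification models take the (frames × frequency bins) matrix directly as input.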
- "Deep Learning for Audio, Signal, and Image Processing" by Sudhakar Kumawat et al.
- "Automatic Speech Recognition: A Deep Learning Approach" by Dong Yu and Li Deng
- Coursera course "Audio Signal Processing for Music Applications" by Universitat Pompeu Fabra