Speech is the most natural way of expressing ourselves as humans. It is only natural then to extend this communication medium to computer applications. We define speech emotion recognition (SER) systems as a collection of methodologies that process and classify speech signals to detect the embedded emotions.
A 1-D CNN model was built to work on the data in its audio format.
A 2-D CNN model was built to work on the data after converting it to mel-spectrogram format.
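
To make the difference between the two input formats concrete, here is a minimal, standard-library-only sketch of turning a 1-D waveform into a 2-D mel-spectrogram. It is not the project's actual preprocessing code: the sample rate, frame length, hop size, number of mel bands, and all function names below are illustrative assumptions, and a real pipeline would more likely rely on an audio library routine than on a hand-written DFT.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common (HTK-style) mel-scale formula.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(signal, frame_len, hop):
    # Split the waveform into overlapping frames (no padding).
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [signal[i * hop: i * hop + frame_len] for i in range(n_frames)]


def power_spectrum(frame):
    # Hann-windowed power spectrum via a naive DFT (fine for tiny frames).
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2.
    top = hz_to_mel(sr / 2)
    hz_pts = [mel_to_hz(i * top / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_freqs = [k * sr / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, center, hi = hz_pts[m - 1], hz_pts[m], hz_pts[m + 1]
        bank.append([max(0.0, min((f - lo) / (center - lo),
                                  (hi - f) / (hi - center)))
                     for f in bin_freqs])
    return bank


def mel_spectrogram(signal, sr, frame_len=64, hop=32, n_mels=8):
    # Returns a 2-D list indexed as [mel_band][frame].
    spectra = [power_spectrum(f) for f in frame_signal(signal, frame_len, hop)]
    bank = mel_filterbank(n_mels, frame_len, sr)
    return [[sum(w * p for w, p in zip(row, spec)) for spec in spectra]
            for row in bank]


# Hypothetical input: 256 samples of a 1 kHz tone at an assumed 8 kHz rate.
sr = 8000
waveform = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(256)]
mel = mel_spectrogram(waveform, sr)

print(len(waveform))           # 1-D input length: 256
print(len(mel), len(mel[0]))   # 2-D input shape: 8 mel bands x 7 frames
```

The 1-D model would consume something like `waveform` (one value per time step), while the 2-D model would consume something like `mel` (frequency bands by time frames), which is why the latter can be treated as an image.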
(Detailed project requirements can be found in the PR-assignment 3 PDF.)