This project focuses on detecting deepfake speech using a combination of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN).
Data Preparation
- Audio files are organized into a directory structure with 'data/fake' and 'data/real' subdirectories.
- The `generateDataCSV.py` script is used to generate CSV files for organizing the audio dataset into training, validation, and evaluation sets, as sketched below.
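For orientation, here is a minimal sketch of what such a script might do. The `.wav` extension, the 80/10/10 split, and the `filepath`/`label` column names are assumptions for illustration, not confirmed details of `generateDataCSV.py`:

```python
import csv
import random
from pathlib import Path

# Collect (filepath, label) pairs from the two class directories.
rows = []
for label in ("real", "fake"):
    rows += [(str(p), label) for p in Path("data", label).glob("*.wav")]
random.shuffle(rows)

# Assumed 80/10/10 train/validate/evaluate split.
n = len(rows)
splits = {
    "train.csv": rows[: int(0.8 * n)],
    "validate.csv": rows[int(0.8 * n) : int(0.9 * n)],
    "evaluate.csv": rows[int(0.9 * n) :],
}

out_dir = Path("csvFilesReduced")
out_dir.mkdir(exist_ok=True)
for name, subset in splits.items():
    with open(out_dir / name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filepath", "label"])  # assumed column names
        writer.writerows(subset)
```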
Data Preprocessing
- The `train1.py` script preprocesses the audio files to extract MFCC features, as sketched below.
- MFCC features are saved to disk for future use.
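A minimal sketch of the MFCC extraction step, assuming librosa is used; the sample rate (16 kHz) and coefficient count (13) are illustrative defaults, not values taken from `train1.py`:

```python
import librosa
import numpy as np

# Load one clip; 16 kHz and 13 coefficients are assumed settings.
y, sr = librosa.load("data/real/example.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (n_mfcc, frames)

# Save as (frames, n_mfcc) so time is the leading axis for the RNN.
np.save("example_mfcc.npy", mfcc.T)
```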
Model Training
- The `train1.py` script defines a CNN-RNN model and trains it on the preprocessed data (an illustrative model definition follows below).
- The trained model is evaluated on the validation and test data.
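An illustrative CNN-RNN stack in Keras: convolutions capture local spectral patterns, and an LSTM models temporal structure across frames. Layer sizes and the exact topology here are assumptions; the authoritative definition lives in `train1.py`:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_rnn(time_steps: int, n_mfcc: int) -> tf.keras.Model:
    """Hypothetical CNN-RNN binary classifier over MFCC sequences."""
    model = models.Sequential([
        layers.Input(shape=(time_steps, n_mfcc)),
        # 1-D convolutions over time learn local spectral patterns.
        layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        # The recurrent layer summarizes longer-range temporal structure.
        layers.LSTM(64),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),  # 1 = fake, 0 = real (assumed)
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```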
Model Evaluation
- The `eval.py` script evaluates the trained model on the test data, as sketched below.
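A sketch of what the evaluation step boils down to, assuming a saved Keras model and test arrays on disk; the artifact names (`model.h5`, `X_test.npy`, `y_test.npy`) are placeholders, not paths taken from `eval.py`:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("model.h5")   # assumed model path
X_test = np.load("X_test.npy")                   # assumed feature array
y_test = np.load("y_test.npy")                   # assumed label array

loss, accuracy = model.evaluate(X_test, y_test)
print(f"test loss: {loss:.4f}  test accuracy: {accuracy:.4f}")
```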
Running the Application
- The `app.py` script uses the trained model to classify audio files and provides a web-based user interface built with Streamlit (see the sketch below).
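A condensed sketch of what `app.py` might look like. The model path, the 13-coefficient/300-frame input shape, and the preprocessing must match whatever was used at training time; all of those are assumptions here:

```python
import librosa
import numpy as np
import streamlit as st
import tensorflow as tf

MODEL_PATH = "model.h5"        # assumed model artifact
N_MFCC, MAX_FRAMES = 13, 300   # assumed input shape used at training time

@st.cache_resource
def load_model():
    return tf.keras.models.load_model(MODEL_PATH)

st.title("Deepfake Speech Detection")
uploaded = st.file_uploader("Upload a WAV file", type=["wav"])
if uploaded is not None:
    st.audio(uploaded)
    y, sr = librosa.load(uploaded, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T  # (frames, n_mfcc)
    # Pad or truncate to the fixed length the model expects.
    if mfcc.shape[0] < MAX_FRAMES:
        mfcc = np.pad(mfcc, ((0, MAX_FRAMES - mfcc.shape[0]), (0, 0)))
    else:
        mfcc = mfcc[:MAX_FRAMES]
    prob = float(load_model().predict(mfcc[np.newaxis, ...])[0][0])
    st.write(f"Estimated probability of being fake: {prob:.1%}")
```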
Place your audio dataset in the following directory structure:
```
/path/to/root/dataset/
├── data
│   ├── fake
│   └── real
├── generateDataCSV.py
├── train1.py
├── eval.py
└── app.py
```
- Run the `generateDataCSV.py` script to generate CSV files for organizing the audio dataset: `python generateDataCSV.py`
- The generated CSV files will be saved in the `csvFilesReduced` directory as `evaluate.csv`, `train.csv`, and `validate.csv`.
- Run the `train1.py` script to train the model: `python train1.py`
  - Alternatively, you may run the `trainProcessedSample.py` script to train the model; here the features of a large dataset have already been extracted: `python trainProcessedSample.py`
- Run the `eval.py` script to evaluate the trained model: `python eval.py`
- Run the `app.py` script to start the application: `streamlit run app.py`
- This is a basic model implementation. Feel free to modify and enhance it based on your requirements and dataset.