Deep Fake Counter, or "DFC", is a classifier that predicts whether an image was generated by AI or not.
Create the conda environment
conda create -y -n deepfakecounter python=3.10
Activate the environment
conda activate deepfakecounter
Install the dependencies listed in requirements.txt
pip install -r requirements.txt
Change directory into the src folder
cd src/
Open the config.py file. The default settings assume you're on a GPU. If you're not, change ACCELERATOR to 'cpu'. Then run the following
python main.py
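If you'd rather not edit the setting by hand, the GPU/CPU switch described above could be made automatic. This is a minimal sketch, not what config.py currently does: `pick_accelerator` is a hypothetical helper, and the "gpu" string is an assumption (the only value this README confirms is 'cpu').

```python
def pick_accelerator() -> str:
    """Return "gpu" when PyTorch can see a CUDA device, otherwise "cpu"."""
    try:
        # Imported lazily so the helper still works where PyTorch is absent.
        import torch
    except ImportError:
        return "cpu"
    # "gpu" is an assumed value; check what config.py actually expects.
    return "gpu" if torch.cuda.is_available() else "cpu"


# Hypothetical replacement for a hard-coded ACCELERATOR value in config.py.
ACCELERATOR = pick_accelerator()
print(ACCELERATOR)
```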
The best-performing of the 4 architectures I implemented is as follows:
- Tail
  - 4 convolutions where after each convolution, we do the following
    - BatchNorm
    - MaxPool
    - ReLU Activation
  - Flatten
- Head
  - 2 Linear Layers
    - ReLU Activation
  - Output Linear Layer
    - Sigmoid
You can find all the implementations in the src/models/cnn.py file. The model that is not commented out is the 4_conv_batch_3_linear architecture.
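For orientation, the layer list above can be sketched in PyTorch roughly as follows. This is not a copy of src/models/cnn.py: the class name, channel widths, kernel size, padding, hidden layer sizes, and the 32x32 RGB input are all assumptions chosen for illustration. Only the layer order comes from the description above.

```python
import torch
from torch import nn


class FourConvBatchThreeLinear(nn.Module):
    """Sketch of the 4_conv_batch_3_linear layout; all sizes are assumptions."""

    def __init__(self, in_channels: int = 3, image_size: int = 32):
        super().__init__()
        channels = [in_channels, 32, 64, 128, 256]  # assumed widths
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            # Order follows the list above: conv, BatchNorm, MaxPool, ReLU.
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.MaxPool2d(2),
                nn.ReLU(),
            ]
        self.tail = nn.Sequential(*blocks, nn.Flatten())
        side = image_size // 2 ** 4  # four 2x2 pools halve each side four times
        self.head = nn.Sequential(
            nn.Linear(channels[-1] * side * side, 128),  # assumed hidden size
            nn.ReLU(),
            nn.Linear(128, 64),  # assumed hidden size
            nn.ReLU(),
            nn.Linear(64, 1),  # output linear layer
            nn.Sigmoid(),  # squashes the single output into [0, 1]
        )

    def forward(self, x):
        return self.head(self.tail(x))


model = FourConvBatchThreeLinear().eval()
with torch.no_grad():
    probs = model(torch.zeros(2, 3, 32, 32))  # batch of two blank images
print(probs.shape)  # torch.Size([2, 1])
```

The single sigmoid output suits a binary real-versus-fake decision. Which class maps to 1 depends on how the labels are encoded in the training data.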
Architecture | Training accuracy | Test accuracy |
---|---|---|
2_conv_2_linear_paper | 0.944 | 0.927 |
3_conv_3_linear | 0.987 | 0.951 |
4_conv_batch_3_linear | 0.984 | 0.952 |
5_conv_batch_3_linear | 0.979 | 0.945 |
I was able to replicate the CIFAKE paper's test accuracy results. Their proposal was good, but I chose to use more convolutions and batch normalization, because telling fake pictures from real ones seems to require closer attention to finer details. When you're trying to figure out whether a picture of a human has been deep faked, the hair is often the easiest tell: in fake images it can look dull in texture, which is a dead giveaway. The same can be said about hands and about the background behind the subject. Identifying these small pattern differences is what I believe matters most for detecting fake images.
Digging for details that are too fine can also hurt: with more than 4 convolutions, performance started to drop (test accuracy of 0.945 with 5 convolutions versus 0.952 with 4). Four convolutions seems to be the sweet spot for identifying fakes.
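The comparison can be read straight off the results table. As a small sketch, the snippet below ranks the four architectures by test accuracy and shows the gap between training and test accuracy. It uses only the numbers reported above; `rank_by_test` is a hypothetical helper name.

```python
# (training accuracy, test accuracy) copied from the results table.
results = {
    "2_conv_2_linear_paper": (0.944, 0.927),
    "3_conv_3_linear": (0.987, 0.951),
    "4_conv_batch_3_linear": (0.984, 0.952),
    "5_conv_batch_3_linear": (0.979, 0.945),
}


def rank_by_test(results):
    """Return (name, train, test, gap) rows sorted by test accuracy, best first."""
    rows = [
        (name, train, test, round(train - test, 3))
        for name, (train, test) in results.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranked = rank_by_test(results)
for name, train, test, gap in ranked:
    print(f"{name:<24} train={train:.3f} test={test:.3f} gap={gap:.3f}")
```

Note that the 4-convolution model leads the 3-convolution model by only 0.001 on the test set, so the ranking between those two may not be stable across runs.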
Planned next steps:
- Expand the work to classify fake human photos. Currently, the model fails to classify fake human pictures correctly.
- Set up a Stable Diffusion model to generate more fake image data.
- Begin working on deep fake human photos instead of using the CIFAKE dataset.