This repository contains code for building a fake news detector. This project was developed during the Data Science for All Women's Summit 2020.
- Fake news is a particularly pressing problem today because many people rely on social media as their primary news source [1].
- Impact of fake news
- Our solution is to develop a model for classifying news articles as real or fake.
- Model training: Fake and real news dataset (Kaggle).
- External validation: Fake news dataset from 2016 (Kaggle).
- Exploratory data analysis & text preprocessing
- Baseline model 1: doc2vec embeddings & logistic regression
- Baseline model 2: recurrent neural network (RNN)
- Advanced model: BERT & transfer learning
- Model interpretability: LIME
- External validation
Please find our results in our project report or click on the following image to view our slides.
Iris Yoon
Rabiya Noori
Jerri Zhang
Renee G. Reynolds
Hannah Mei
1: Americans Who Mainly Get Their News on Social Media Are Less Engaged, Less Knowledgeable
2: A survey of fake news
3: False Rumor of Explosion at White House Causes Stocks to Briefly Plunge