Authors: Katerina Mantaraki, Alexios Papadopoulos Siountris, Sarkis Samouelian
- Download a Python version greater than 3.6
- Clone the repository
cd ai-review-classifier
- Download the Large Movie Review Dataset.
- Make sure the aclImdb folder is extracted into the repository directory. Only the imdb.vocab file is needed.
- Install the required packages:
numpy
pandas
tensorflow
scikit-learn
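Once the dataset is extracted, the vocabulary file can be loaded as a quick sanity check. The following is a minimal sketch and not part of the repository: the `load_vocab` helper is hypothetical, and the default path `aclImdb/imdb.vocab` assumes the folder was extracted into the repository root and that the file holds one token per line.

```python
from pathlib import Path


def load_vocab(path="aclImdb/imdb.vocab", max_words=None):
    """Read one token per line and map each token to its line index.

    The default path is an assumption: it presumes the aclImdb folder
    was extracted into the repository root.
    """
    vocab_file = Path(path)
    if not vocab_file.is_file():
        raise FileNotFoundError(f"{vocab_file} not found; extract aclImdb first")
    with vocab_file.open(encoding="utf-8") as handle:
        # Skip blank lines so they do not become vocabulary entries.
        tokens = [line.strip() for line in handle if line.strip()]
    if max_words is not None:
        tokens = tokens[:max_words]
    return {token: index for index, token in enumerate(tokens)}
```

Calling `load_vocab()` from the repository root should return a dictionary, and a `FileNotFoundError` indicates that the dataset is not where the sketch expects it.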
Our RNN model uses pre-trained fastText word embeddings. To install fastText and download the English vectors, run the following commands:
pip install fasttext
wget https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz
gzip -d cc.en.300.bin.gz
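After decompression, the vectors can be turned into an embedding matrix for a list of words. This is a sketch under stated assumptions, not the code used in rnn.py: the helper names `build_embedding_matrix` and `fasttext_lookup` are hypothetical, row 0 is reserved for padding by choice, and the model file is assumed to sit in the current directory. The only fastText calls used are `fasttext.load_model` and `get_word_vector`.

```python
import numpy as np


def build_embedding_matrix(words, get_vector, dim=300):
    """Stack one vector per word; row 0 is left as zeros for padding."""
    matrix = np.zeros((len(words) + 1, dim), dtype=np.float32)
    for row, word in enumerate(words, start=1):
        matrix[row] = get_vector(word)
    return matrix


def fasttext_lookup(model_path="cc.en.300.bin"):
    """Return a word -> vector function backed by the downloaded model.

    The default path is an assumption about where the file was unpacked.
    """
    import fasttext  # imported lazily so the helper above works without it

    model = fasttext.load_model(model_path)
    return model.get_word_vector
```

For example, `build_embedding_matrix(["good", "bad"], fasttext_lookup())` would give a matrix with three rows and 300 columns. Passing the lookup as a function keeps the matrix-building step independent of fastText, so it can be tried with any other word-to-vector function.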
To evaluate our custom implementations on development data, run the following programs with your Python version of choice:
random_forest.py
adaboost.py
logistic_regression.py
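The three programs listed above can also be run one after another from a small driver script. This is only a convenience sketch: `build_commands` and `run_all` are hypothetical helpers, and the sketch assumes the scripts sit in the current working directory and take no command-line arguments.

```python
import subprocess
import sys

# Script names are taken from the list above; they are assumed to sit in
# the current working directory and to need no arguments.
SCRIPTS = ["random_forest.py", "adaboost.py", "logistic_regression.py"]


def build_commands(scripts=SCRIPTS, interpreter=sys.executable):
    """Return one command (as an argument list) per script."""
    return [[interpreter, script] for script in scripts]


def run_all(commands):
    """Run each command in turn; check=True raises on the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_all(build_commands())` uses the same interpreter that runs the driver, which avoids mixing Python versions between the scripts.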
The program testing.py evaluates our custom classifiers on testing data and compares them to their respective scikit-learn classifiers.
To evaluate our RNN model on development and testing data, run rnn.py.