# cmpe130_group1_spam_classifier
Python Version: Python 3.6.1
The following Python packages are required:
beautifulsoup4==4.6.0
bs4==0.0.1
click==6.7
cycler==0.10.0
Flask==0.12.2
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
matplotlib==2.1.0
nltk==3.2.5
numpy==1.13.3
olefile==0.44
pandas==0.21.0
Pillow==4.3.0
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.3
six==1.11.0
Werkzeug==0.13
wordcloud==1.3.1
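
To confirm that an environment matches the pinned versions above, a small check can help. The following is a minimal sketch, not part of the project: the helper names `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical, and only three of the pins are copied in for brevity.

```python
def parse_pins(text):
    """Parse lines of the form 'name==version' into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
    except ImportError:
        # Fallback for older interpreters such as the Python 3.6.1 named above.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every package that is absent or differs."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# A few of the pins listed above; extend with the full list as needed.
PINS = parse_pins("""
Flask==0.12.2
nltk==3.2.5
wordcloud==1.3.1
""")
```

Calling `find_mismatches(PINS)` returns an empty dict when every listed package is installed at the pinned version.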
Jupyter Notebook
- Jupyter Notebook is required to run "Spam Detection.ipynb".
- The cells can be run one by one or all at once.
- Output is shown below each cell.
To run the Flask app:
- Clone the repo
- Install any required packages that are missing; the code may not run without them
- Set the FLASK_APP environment variable:
- if using Windows: "set FLASK_APP=app.py"
- if using Linux: "export FLASK_APP=app.py"
- Start the server: "flask run"
- Open a browser and go to http://localhost:5000/
- Once the web app opens:
- Enter a comment
- Click "Submit" to test the comment
- The result is displayed as "Spam" or "Ham": "Spam" means the comment is considered spam, and "Ham" means it is a legitimate comment.
- You may also view:
- Ham Word Cloud: displays words often used in legitimate comments.
- Spam Word Cloud: displays words often used in spam comments.
- Ham Word Frequency: displays the top 100 words used in legitimate comments and how many times each appeared.
- Spam Word Frequency: displays the top 100 words used in spam comments and how many times each appeared.
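
The word frequency views count how often each word appears in a set of comments and keep the top 100. A minimal sketch of that kind of count is shown below; the function name `top_words` and the simple regex tokenizer are assumptions for illustration, and the project itself may tokenize differently (for example with nltk, which is among the listed packages).

```python
import re
from collections import Counter


def top_words(comments, n=100):
    """Return the n most frequent words across comments as (word, count) pairs."""
    counts = Counter()
    for comment in comments:
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", comment.lower()))
    return counts.most_common(n)


# Made-up sample comments, used only to show the shape of the result.
sample = ["Check out my channel", "check my new video", "great song"]
top_two = top_words(sample, n=2)
```

Here `top_two` holds "check" and "my", each with a count of 2. Running the same function separately over the spam comments and the legitimate comments would give one table per class.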
The site can also be accessed at: https://youtube-spam-classifier.herokuapp.com/
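
For local runs, the Windows and Linux steps differ only in how FLASK_APP is set. The following is a hedged sketch of a cross-platform launcher that sets the variable in Python and then starts Flask through `python -m flask run`; the helper names `flask_run_command` and `launch` are hypothetical and not part of the repo.

```python
import os
import subprocess
import sys


def flask_run_command(app_file="app.py", base_env=None):
    """Build the command and environment for `flask run` without starting it."""
    # Copy the environment so the caller's mapping is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["FLASK_APP"] = app_file
    # `python -m flask run` is the module form of the `flask run` command.
    command = [sys.executable, "-m", "flask", "run"]
    return command, env


def launch(app_file="app.py"):
    """Start the development server; blocks until the server exits."""
    command, env = flask_run_command(app_file)
    return subprocess.call(command, env=env)
```

Calling `launch()` from the repo directory should behave like the manual steps, assuming Flask is installed in the same interpreter.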