This is a web application that classifies news articles into different categories such as sports, business, politics, tech, and entertainment. The app is built using Flask and Jinja and runs using Gunicorn. It utilizes a Scraper service to scrape articles using Beautiful Soup and a machine learning model built with scikit-learn to perform the classification.
-
Build the container:
docker build https://github.com/udaypat/News-Classifier.git -t news-classifier
-
Run the container:
docker run -p 8000:8000 news-classifier
To install it manually and run the News Article Classifier App on your local machine, follow these steps:
-
Clone the repository from GitHub:
git clone https://github.com/udaypat/News-Classifier.git
-
Navigate to the project directory:
cd News-Classifier/server
-
Install the required dependencies using pip:
pip install -r requirements.txt
-
To train the model again:
cd predictor
jupyter nbconvert --execute classify.ipynb
-
To start the Flask application:
cd app
gunicorn main:app
-
Access the app in your web browser at the address Gunicorn prints on startup (port 8000 by default):
http://localhost:8000
-
The ML model was trained on the BBC News dataset: https://storage.googleapis.com/dataset-uploader/bbc/bbc-text.csv
-
After removing stopwords, the accuracy of several classifiers was measured.
-
The trained model was serialized with pickle and is loaded by a Flask endpoint to make predictions.
-
Scraping was done using BeautifulSoup. The relevant tags were extracted from each page; if a page contained no such tags, its plain text was extracted instead.
-
Predictions were then made on the extracted text and stored in the SQLite database.
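The training steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of the notebook: the two classifiers compared, the TF-IDF features, the tiny inline corpus (a stand-in for bbc-text.csv), and the in-memory pickle round trip are all assumptions made for the example.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny stand-in corpus (assumption); the real data comes from bbc-text.csv.
train_texts = [
    "the striker scored twice in the cup final",
    "the team won the league match after extra time",
    "the new smartphone ships with a faster processor",
    "the software update fixes a browser security bug",
]
train_labels = ["sport", "sport", "tech", "tech"]
test_texts = [
    "the goalkeeper saved a penalty in the match",
    "the laptop has a new processor",
]
test_labels = ["sport", "tech"]

# Candidate classifiers (assumption: the source does not name the ones compared).
candidates = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores = {}
models = {}
for name, clf in candidates.items():
    # stop_words="english" drops common English stopwords before vectorizing.
    model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    model.fit(train_texts, train_labels)
    scores[name] = model.score(test_texts, test_labels)  # accuracy on held-out texts
    models[name] = model

# Keep the most accurate model and round-trip it through pickle.
best_name = max(scores, key=scores.get)
blob = pickle.dumps(models[best_name])
restored = pickle.loads(blob)
prediction = restored.predict(["the match ended with a late goal"])[0]
print(scores, best_name, prediction)
```

In a real setup the pickled bytes would be written to a file that the Flask endpoint loads at startup; only unpickle files you created yourself, since pickle can execute arbitrary code on load.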
-
Open the News Article Classifier App in your web browser.
-
On the homepage, you will find a button that takes you to a page where you can enter the URL of a news article.
-
Enter the URL of the article you want to classify and click the "Classify" button.
-
The app will scrape the article using the Scraper service and pass it to the machine learning model for classification.
-
Once the classification is complete, the app will display the predicted category of the article on the result page.
-
The app stores previous results in a database and shows them as a list after each prediction.
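The flow described above (scrape, classify, store, list) might look something like the sketch below. The function names, the `predictions` table, the choice of `p` as the tag to extract, and the `classify` placeholder are all hypothetical and chosen for illustration; the actual app uses its pickled scikit-learn model and its own schema.

```python
import sqlite3

from bs4 import BeautifulSoup


def extract_text(html, tag="p"):
    """Return the text of all `tag` elements, or the page's plain text if none exist."""
    soup = BeautifulSoup(html, "html.parser")
    parts = [el.get_text(" ", strip=True) for el in soup.find_all(tag)]
    if parts:
        return " ".join(parts)
    # Fallback: no matching tags, so use whatever text the page contains.
    return soup.get_text(" ", strip=True)


def save_result(conn, url, category):
    """Store one prediction; the table name and columns are assumptions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT, category TEXT)"
    )
    conn.execute(
        "INSERT INTO predictions (url, category) VALUES (?, ?)", (url, category)
    )
    conn.commit()


def list_results(conn):
    """Return previous predictions, newest first."""
    rows = conn.execute("SELECT url, category FROM predictions ORDER BY id DESC")
    return rows.fetchall()


def classify(text):
    # Placeholder standing in for the pickled model (assumption).
    return "sport" if "match" in text else "tech"


html = "<html><body><p>The match ended late.</p><p>Fans cheered.</p></body></html>"
conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
text = extract_text(html)
save_result(conn, "https://example.com/article", classify(text))
print(list_results(conn))
```

The example parses an inline HTML string so it runs without network access; in the app, the HTML would come from fetching the URL the user submitted.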
This News Article Classifier App is designed to be hosted on Linux. To host the app on Linux, follow these steps:
-
Create a Droplet on DigitalOcean with your preferred specifications and operating system or use your own system.
-
SSH into your Droplet.
ssh root@ip
-
Update the package index and install pip:
sudo apt update
sudo apt install python3-pip
-
Clone the repository onto your Droplet:
git clone https://github.com/udaypat/News-Classifier.git
-
Navigate to the project directory:
cd
-
Install the required dependencies using pip:
pip install -r requirements.txt
-
Install Gunicorn if not already installed:
pip install gunicorn
-
Start the Gunicorn server:
gunicorn main:app
-
Install Nginx, then enable and start it:
sudo apt install nginx
sudo systemctl enable nginx
sudo systemctl start nginx
-
Access the app in your web browser using the IP address or domain name of your DigitalOcean Droplet.
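For the Gunicorn step above, the server settings can be kept in a Python config file instead of command-line flags. The sketch below is one possible `gunicorn.conf.py`, not a file from this repository; the bind address and worker count are assumptions. Binding to 127.0.0.1 assumes Nginx has been configured separately to forward requests to that address, which the steps above do not cover.

```python
# gunicorn.conf.py (hypothetical file; values are assumptions)
import multiprocessing

# Listen on localhost only, assuming Nginx forwards public traffic here.
bind = "127.0.0.1:8000"

# Common starting point from the Gunicorn docs: (2 x CPU cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1
```

With this file in the directory that contains `main.py`, the server could be started with `gunicorn -c gunicorn.conf.py main:app`.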