We need a corpus of tagged tweets, so we built this project to obtain one quickly.
The first step is to clone this repository.
This project requires MongoDB to be installed and running on port 27017.
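Before continuing, it can help to confirm that something is actually listening on that port. The following is a minimal sketch using only the Python standard library; the function name `mongo_port_is_open` is hypothetical and not part of this project, and a successful TCP connection only shows that the port is open, not that the listener is really MongoDB.

```python
import socket


def mongo_port_is_open(host: str = "127.0.0.1", port: int = 27017,
                       timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if mongo_port_is_open():
        print("Something is listening on 127.0.0.1:27017")
    else:
        print("Nothing is listening on 127.0.0.1:27017; is MongoDB running?")
```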
You will need to create a virtualenv with python3.7:
virtualenv --python=`which python3.7` venv
source venv/bin/activate
Then install requirements:
pip install -r requirements.txt
This project installs itself with its own setup.py; you just need to execute:
pip install -e .
We need to install Node.js in order to build the frontend code. You can use the nvm utility to manage the Node installation. Currently we are using 12.16:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.3/install.sh | bash
NODE_VERSION=12.16.2
nvm install $NODE_VERSION
nvm use $NODE_VERSION
Now it is time to install the dependencies and compile the frontend:
cd web
npm install
npm run build
You have now generated the distributable static code. The last command will print some warnings; don't worry about them. Now we need to run our application to serve these static files.
We are using uvicorn to serve our application. Just execute:
uvicorn tweet_tagger.main:api --debug
Now you can tag your tweets at http://127.0.0.1:8000
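If the page does not load, a quick way to tell whether the server is answering at all is to request that address from Python. This is only a sketch using the standard library; the helper name `api_is_up` is hypothetical, and any HTTP response (even an error status) is treated as "the server is up".

```python
import urllib.error
import urllib.request


def api_is_up(url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if the server answers the request with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    print("server reachable" if api_is_up() else "server not reachable")
```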
In order to test or analyze the API you can load:
There you can read the endpoint documentation, but at the moment no data has been imported, so it's time to do that!
The app uses MongoDB, so you need to install it. You can find the installation documentation on its own webpage: mongodb.com
Once you have done that, it's time to download the data and import it into the database.
We used the GetOldTweets3 module to download tweets. With this command you will download up to 100 tweets posted within 10 km of Seville, related to Coronavirus and Holy Week, in Spanish:
GetOldTweets3 --querysearch "coronaviru+semana+santa" --near "Sevilla" --within 10km --maxtweets 100 --lang es
The default output CSV file name is output_got.csv. Suppose you downloaded the tweets into your ~/data/ folder.
Now use this script to import them into MongoDB:
python bin/import_tweets.py --csv-path ~/data/output_got.csv
This simple script uses the MongoDB and CSV path settings defined in the tweet_tagger.settings module.
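To give an idea of what such an import involves, here is a minimal sketch. It is not the code of bin/import_tweets.py: the function names are hypothetical, the CSV is assumed to be comma-delimited with a header row, and `collection` is assumed to be any object with an `insert_many` method, such as a pymongo collection.

```python
import csv
import os
from typing import Any, Dict, List


def read_tweets(csv_path: str) -> List[Dict[str, Any]]:
    """Read a CSV file with a header row into a list of dicts, one per tweet."""
    # expanduser lets callers pass paths such as ~/data/output_got.csv
    with open(os.path.expanduser(csv_path), newline="", encoding="utf-8") as handle:
        return [dict(row) for row in csv.DictReader(handle)]


def import_tweets(csv_path: str, collection) -> int:
    """Insert every CSV row into `collection` and return the number of rows."""
    tweets = read_tweets(csv_path)
    if tweets:  # pymongo's insert_many rejects an empty list
        collection.insert_many(tweets)
    return len(tweets)


# Hypothetical usage with pymongo (database and collection names are made up):
# from pymongo import MongoClient
# collection = MongoClient("mongodb://127.0.0.1:27017")["tweet_tagger"]["tweets"]
# import_tweets("~/data/output_got.csv", collection)
```

Passing the collection in as a parameter keeps the CSV parsing independent of the database, so it can be tried out without a running MongoDB.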
The built frontend code is served by our FastAPI server, which must be running.