covid-hate-speech

See https://github.com/VeenaArv/covid-hate-speech for all code.

All Java code, which processes tweets, was written by Veena Arvind; the Python code, which processes Weibo posts, was written by Yusu Qian.

For processing tweets:

- data_ingest.md describes how to process the input.
- ETL code is under src/us/:
  - writable: contains custom Writable objects that store information about tweets.
  - utils: contains utilities used by the Mappers and Reducers, as well as pipeline counters.
  - main class: PipelineMain.java, which consists of 3 jobs (extractTweets, annotateTweets, writeToParquet).
  - Check schema.jsonl for the Parquet schema.
- Impala code is in output.md.
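As a rough illustration of what the custom Writable objects mentioned above do, here is a minimal sketch of the write/readFields serialization pattern. It uses only java.io so it runs without Hadoop on the classpath; a real Writable would additionally declare `implements org.apache.hadoop.io.Writable`. The class name TweetRecord and its fields are hypothetical and are not taken from the repository.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical tweet record; the fields are illustrative assumptions, not the repo's schema. */
class TweetRecord {
    long id;
    String text;
    int retweetCount;

    // Hadoop creates Writables reflectively, so a no-argument constructor is needed.
    TweetRecord() {}

    TweetRecord(long id, String text, int retweetCount) {
        this.id = id;
        this.text = text;
        this.retweetCount = retweetCount;
    }

    // Same signature as Writable.write(DataOutput): serialize every field.
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(text);
        out.writeInt(retweetCount);
    }

    // Same signature as Writable.readFields(DataInput): read fields in the order written.
    public void readFields(DataInput in) throws IOException {
        id = in.readLong();
        text = in.readUTF();
        retweetCount = in.readInt();
    }

    // Helper for this sketch only: serialize to an in-memory byte array.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for this sketch only: rebuild a record from serialized bytes.
    static TweetRecord fromBytes(byte[] bytes) {
        try {
            TweetRecord record = new TweetRecord();
            record.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TweetRecord original = new TweetRecord(42L, "example tweet text", 3);
        TweetRecord copy = fromBytes(original.toBytes());
        // prints: 42 | example tweet text | 3
        System.out.println(copy.id + " | " + copy.text + " | " + copy.retweetCount);
    }
}
```

The round trip in main mirrors what happens when a record is passed between a Mapper and a Reducer: the fields are written to a byte stream and read back in the same order.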

** CovidGovernmentResponseWritable.java and CSVToCovidGovernmentResponse.java are not used to produce any output and are not joined with tweets.

Data Sources:

- Kerchner, Daniel; Wrubel, Laura, 2020, "Coronavirus Tweet Ids", https://doi.org/10.7910/DVN/LW0BTB, Harvard Dataverse, V7
- Hale, Thomas, Sam Webster, Anna Petherick, Toby Phillips, and Beatriz Kira (2020). Oxford COVID-19 Government Response Tracker, Blavatnik School of Government.

Commands to Build and Run Project.

1. Follow the steps in data_ingest.md to load data into HDFS. Alternatively, a sample is provided at sample.jsonl. Load it into HDFS:

   hdfs dfs -put sample.jsonl covid/data

2. Clone this repo:

   git clone https://github.com/VeenaArv/covid-hate-speech

3. Run mvn package to build the jars.
4. Copy lib and covid-hate-speech-1.0-SNAPSHOT.jar to HDFS.
5. Run hadoop with covid-hate-speech-1.0-SNAPSHOT.jar.
6. Output resides in covid/parquet/tweets/out and can be loaded into Impala using output.md.
