GitHub

#Insight Data Science

Run.sh:

run.sh compiles the two C++ programs (average_degree.cpp and tweets_cleaned.cpp) and then executes them back to back. The compilation is done with compiler optimization "-O3". So the run time of the features cannot be called as the runtime of run.sh as it includes both the compilation and execution.

Dependencies:

The code is written in basic C/C++ with almost no prebuilt libraries, so there are no known dependencies.

Feature 1: tweets_cleaned.cpp

###Input: The program reads the file "tweets.txt" located at a relative location of "../tweet_input/tweets.txt". The program expects a new tweet on new line in the JSON format, in which length of text is variable.

###Functionality: The program reads a tweet and extracts text and timestamp from the it.

It removes all the unicodes from the text.
It also removes all the escape characters, if it is not a unicode to make tweet readable.
It counts number of tweets which had unicodes in them.

###Output: The program writes to a file "ft1.txt" located at "../tweet_output/". The format of output is: text (timestamp: xxx) At the end of the file, it prints out the number of tweets which required unicode cleaning.

Feature 2: average_degree.cpp:

###Input: The program reads the file "tweets.txt" located at a relative location of "../tweet_input/tweets.txt". The program expects a new tweet on new line in the JSON format, in which length of text is variable.

###Functionality: This program extracts time stamp and cleans text from the tweet just like the feature 1 again. Then it calculates the rolling average for 60 seconds of history.

It builds a graph out of 60 seconds worth of tweets such that there is an edge between A and B iff there was a tweet in which A and B tagged simulataneously.
It prints out the average degree of vertex, i.e. sum of degrees of all the vertices/number of vertices in graph.

###Output: The program writes to a file "ft2.txt" located at "../tweet_output/". The program writes the average value after reading every tweet.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
src		src
tweet_input		tweet_input
tweet_output		tweet_output
README.md		README.md
run.sh		run.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Run.sh:

Dependencies:

Feature 1: tweets_cleaned.cpp

Feature 2: average_degree.cpp:

About

Releases

Packages

Languages

rachitgarg91/InsightDataScience

Folders and files

Latest commit

History

Repository files navigation

Run.sh:

Dependencies:

Feature 1: tweets_cleaned.cpp

Feature 2: average_degree.cpp:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages