A raw dataset of over 16,000 tweets (both sarcastic and non-sarcastic) for researchers who want to work on sarcasm detection in Hindi.
Number of Sarcastic tweets: 6051
Number of Non-Sarcastic Tweets: 10128
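The two classes are not balanced, which matters when training a classifier. The snippet below is a small sketch that derives the class shares from the counts above. The inverse-frequency weights are only an illustrative assumption, not something shipped with the dataset.

```python
# Counts taken from the figures listed above.
counts = {"sarcastic": 6051, "non_sarcastic": 10128}

total = sum(counts.values())

# Share of each class in the whole dataset.
shares = {label: n / total for label, n in counts.items()}

# Inverse-frequency class weights (illustrative assumption only):
# the rarer class gets a weight above 1, the more common class below 1.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

print("Total tweets:", total)
for label in counts:
    print(f"{label}: share={shares[label]:.3f}, weight={weights[label]:.3f}")
```

Roughly 37% of the tweets are sarcastic, so a model that always predicts "non-sarcastic" would already be right about 63% of the time. Keep that baseline in mind when reporting accuracy.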
These tweets were extracted using tweet-scraping code from the GitHub repository of Mr. Griffin Leow.
The code was tweaked to extract tweets written in native Hindi and carrying specific hashtags. The dataset covers tweets from 01-01-2012 to 23-06-2020.
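The filtering idea described above (native Hindi, specific hashtags, fixed date range) can be sketched roughly as follows. This is not the scraping script itself. The function names, the way a tweet is represented, and the example hashtag are all hypothetical, and "native Hindi" is approximated here by checking for characters in the Unicode Devanagari block (U+0900 to U+097F).

```python
from datetime import date

# Date range stated for the dataset.
START, END = date(2012, 1, 1), date(2020, 6, 23)


def contains_devanagari(text):
    """Return True if any character falls in the Devanagari block (U+0900 to U+097F)."""
    return any("\u0900" <= ch <= "\u097F" for ch in text)


def keep_tweet(text, posted_on, hashtags, wanted_hashtags):
    """Hypothetical filter: Devanagari text, date in range, at least one wanted hashtag."""
    in_range = START <= posted_on <= END
    has_tag = any(tag in wanted_hashtags for tag in hashtags)
    return contains_devanagari(text) and in_range and has_tag


# Example hashtag chosen purely for illustration.
wanted = {"#कटाक्ष"}

print(keep_tweet("वाह, क्या बात है! #कटाक्ष", date(2019, 5, 1), ["#कटाक्ष"], wanted))
print(keep_tweet("Nice weather today", date(2019, 5, 1), ["#कटाक्ष"], wanted))
```

A Devanagari check is a cheap approximation: it also matches other languages written in the same script, such as Marathi, so a language-detection step may still be needed.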
Method #1: Run the scrap_tweets_in_Hindi-v1.py file in IDLE or Jupyter Notebook to re-scrape the tweets from Twitter.
Method #2: Just download the Jupyter Notebook and run all the cells!
P.S. If you are around, I won't mind if you star the repository! Thanks ;-)