PINYON implements a community detection algorithm that takes into account the context and meaning of the entities mentioned in a post. This allows PINYON to accurately identify semantically related posts across various contexts.
To use PINYON, we first need to pre-process the corpus of social media posts (tweets). The TweetsCOV19 (May 2020) dataset can be downloaded using this link.
After downloading the tweets dataset, we need to execute the three scripts in the tweets_process directory.
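The step above could be automated with a small helper such as the sketch below. It is an illustration only: it assumes the three scripts are Python files and that running them in alphabetical order matches the intended order, neither of which is confirmed here. The function name run_scripts_in_order is hypothetical and not part of PINYON.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts_in_order(directory):
    """Run every .py file in `directory`, sorted by file name.

    Assumption: alphabetical order is the intended execution order.
    Stops at the first script that exits with a non-zero status.
    """
    scripts = sorted(Path(directory).glob("*.py"))
    for script in scripts:
        # Run each script from inside its own directory so relative paths resolve.
        # check=True raises CalledProcessError if the script fails.
        subprocess.run([sys.executable, script.name], cwd=script.parent, check=True)
    return [script.name for script in scripts]
```

Calling run_scripts_in_order("tweets_process") from the repository root would then run the scripts one after another. If the scripts must run in a specific order, run them manually in that order instead.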
Once all the scripts have finished executing, we need to obtain the original text of the tweets. This can be done using Hydrator.
The embeddings for each KG need to be downloaded and placed in the embedding/data/ directory.
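Before running the approach, it may help to confirm that the embedding files actually landed in the expected directory. The sketch below only lists what is in embedding/data/; it makes no assumption about the file names or formats of the embeddings, and the function name list_embedding_files is hypothetical.

```python
from pathlib import Path


def list_embedding_files(data_dir="embedding/data"):
    """Return the sorted names of the files found in the embedding directory.

    Raises FileNotFoundError if the directory is missing or contains no files.
    """
    directory = Path(data_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"Embedding directory not found: {directory}")
    files = sorted(p.name for p in directory.iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"No embedding files found in {directory}")
    return files
```

Printing the result of list_embedding_files() from the repository root gives a quick check that one embedding per KG has been downloaded.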
Now that we have all the necessary data, we can run the PINYON approach against the three KGs (UMLS, Wikidata, and DBpedia). For example, to run the approach against UMLS, please use the following:
python3 run_umls.py
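To run all three KGs in one go, a small wrapper along the following lines could be used. This is a sketch under assumptions: only run_umls.py is named above, so the script names run_wikidata.py and run_dbpedia.py are guesses and should be replaced with the actual file names in the repository. Scripts that are not found are skipped rather than treated as errors.

```python
import subprocess
import sys
from pathlib import Path

# Only run_umls.py is confirmed by this README.
# The other two script names are assumptions; adjust them to the real file names.
KG_SCRIPTS = {
    "UMLS": "run_umls.py",
    "Wikidata": "run_wikidata.py",
    "DBpedia": "run_dbpedia.py",
}


def run_all_kgs(repo_dir=".", scripts=KG_SCRIPTS):
    """Run each KG script found in `repo_dir` and collect its exit code.

    Returns a dict mapping KG name to the exit code, or None if the script is missing.
    """
    results = {}
    for kg, script in scripts.items():
        if not (Path(repo_dir) / script).exists():
            results[kg] = None  # script not present, skipped
            continue
        completed = subprocess.run([sys.executable, script], cwd=repo_dir)
        results[kg] = completed.returncode
    return results
```

Calling run_all_kgs() from the repository root returns a summary such as which runs exited with code 0 and which scripts were skipped.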