SDM-TIB/PINYON

PINYON

PINYON implements a community detection algorithm that considers the context and meaning of the entities mentioned in a post. This allows it to accurately identify semantically related posts in various contexts.

Using PINYON

Pre-processing

To use PINYON, we first need to pre-process the corpus of social media posts (tweets). The TweetsCOV19 (May 2020) dataset can be downloaded using this link.

After downloading the tweets dataset, we need to execute the three scripts in the tweets_process directory.

Once all the scripts have finished executing, we need to obtain the original text of the tweets. This can be done using Hydrator.
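Hydrator takes a plain text file with one tweet ID per line. The scripts in tweets_process may already produce such a file; if they do not, a minimal sketch along the following lines could build one. It assumes a tab-separated input whose first column holds the tweet ID. The function name `write_tweet_ids` and the column position are illustrative assumptions, not part of PINYON.

```python
import csv


def write_tweet_ids(tsv_path, out_path, id_column=0):
    """Write one tweet ID per line to out_path and return how many were written.

    Assumption: tsv_path is tab-separated and id_column (default 0) holds the
    tweet ID. Rows that are too short or whose ID is not numeric are skipped.
    """
    count = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) <= id_column:
                continue  # malformed or empty row
            tweet_id = row[id_column].strip()
            if tweet_id.isdigit():
                dst.write(tweet_id + "\n")
                count += 1
    return count
```

Calling `write_tweet_ids("tweets.tsv", "tweet_ids.txt")` with your own file names would produce a file that can be loaded into Hydrator as a new dataset.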

Embeddings download

The embeddings for each knowledge graph (KG) need to be downloaded and placed in the embedding/data/ directory:

DBpedia

Wikidata

UMLS
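Before running the approach, it can help to confirm that the downloaded embeddings actually ended up in embedding/data/. The sketch below only lists the files found there. The README does not name the embedding files, so the check makes no assumption about file names; the helper `list_embedding_files` is hypothetical and not part of PINYON.

```python
from pathlib import Path


def list_embedding_files(data_dir="embedding/data"):
    """Return the sorted names of regular files in data_dir.

    Returns an empty list when the directory does not exist, so the caller
    can treat "missing" and "empty" the same way.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())
```

An empty result from `list_embedding_files()` suggests that the embeddings have not been downloaded yet or were placed in a different directory.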

The PINYON SCD Approach

Now that we have all the necessary data, we can run the PINYON approach against the three KGs (UMLS, Wikidata, and DBpedia). For example, to run the approach against UMLS, use the following command:

python3 run_umls.py
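To run the approach against several KGs in one go, a small wrapper such as the following could be used. Only run_umls.py is named in this README; the script names `run_wikidata.py` and `run_dbpedia.py` are guesses by analogy and should be replaced with the actual file names in the repository. The wrapper skips any script it cannot find instead of failing.

```python
import subprocess
import sys
from pathlib import Path

# Only run_umls.py is confirmed by the README; the other two names are
# assumptions and may differ in the actual repository.
SCRIPTS = {
    "UMLS": "run_umls.py",
    "Wikidata": "run_wikidata.py",
    "DBpedia": "run_dbpedia.py",
}


def run_available(scripts=SCRIPTS, repo_dir="."):
    """Run each script that exists in repo_dir and collect exit codes.

    Returns a dict mapping KG name to the script's exit code, or to None
    when the script file was not found and therefore not run.
    """
    results = {}
    for kg, script in scripts.items():
        if not (Path(repo_dir) / script).is_file():
            results[kg] = None  # script missing, nothing executed
            continue
        completed = subprocess.run([sys.executable, script], cwd=repo_dir)
        results[kg] = completed.returncode
    return results
```

Calling `run_available()` from the repository root runs the scripts one after another with the same Python interpreter; a value of 0 in the returned dict means the corresponding script exited normally.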
