Entities

Benjamin De Boe edited this page Feb 17, 2020 · 4 revisions

iKnow's primary function is to identify phrase boundaries that define Entities, entirely based on the syntactic structure of the sentences, rather than relying on an upfront dictionary or pre-trained model. This makes iKnow well-suited for initial exploration of a new corpus.

iKnow Entities are not Named Entities in the NER (Named Entity Recognition) sense. They are the word groups that need to be considered together because, taken in their entirety, they represent a concept or relationship as the author of the text phrased it. The following examples show why this phrase level matters for fully capturing what the author meant:

| iKnow Entity | Meaning |
| --- | --- |
| Dopamine | small molecule |
| Dopamine receptor | drug target |
| Dopamine receptor antagonist | chemical drug |
| Dopamine receptor gene | gene, molecular sequence |
| Dopamine receptor gene mutation | physiological process |
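
To make the point concrete, here is a minimal Python sketch that contrasts looking up a full phrase with looking only at its first word. The entity-to-meaning pairs are copied from the examples above. The names `ENTITY_MEANINGS` and `meanings_by_first_word` are hypothetical helpers made up for this illustration; they are not part of iKnow.

```python
# Entity-to-meaning pairs copied from the examples above.
ENTITY_MEANINGS = {
    "Dopamine": "small molecule",
    "Dopamine receptor": "drug target",
    "Dopamine receptor antagonist": "chemical drug",
    "Dopamine receptor gene": "gene, molecular sequence",
    "Dopamine receptor gene mutation": "physiological process",
}


def meanings_by_first_word(entities):
    """Group meanings by the first word only, mimicking a single-keyword view."""
    grouped = {}
    for phrase, meaning in entities.items():
        first_word = phrase.split()[0]
        grouped.setdefault(first_word, []).append(meaning)
    return grouped


# The full phrase identifies one meaning.
print(ENTITY_MEANINGS["Dopamine receptor gene"])

# The first word alone lumps all five meanings together.
print(meanings_by_first_word(ENTITY_MEANINGS))
```

Keyed on the full phrase, each entity keeps its own meaning. Keyed on "Dopamine" alone, the five distinct meanings can no longer be told apart.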

iKnow labels every entity with a simple role: either Concept (usually corresponding to noun phrases in part-of-speech terms) or Relation (verbs, prepositions, ...). Typical stop words that carry little meaning of their own are categorized as PathRelevant (e.g. pronouns) or NonRelevant, depending on whether they play a role in the sentence structure or are merely filler.

In the following sample sentence, we've highlighted Concepts, Relations and PathRelevants separately.

Belgian geuze is well-known across the continent for its delicate balance.
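
One way to make the roles in this sentence explicit is to write the annotation out as data. The Python sketch below is a hand-made reading based on the role descriptions above, not output from the iKnow engine. The segmentation, the `ANNOTATED` list and the bracket notation in `render` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical hand annotation of the sample sentence (an assumption,
# not iKnow engine output): noun phrases as Concept, verbs and
# prepositions as Relation, the pronoun as PathRelevant, the article
# as NonRelevant.
ANNOTATED = [
    ("Belgian geuze", "Concept"),
    ("is well-known across", "Relation"),
    ("the", "NonRelevant"),
    ("continent", "Concept"),
    ("for", "Relation"),
    ("its", "PathRelevant"),
    ("delicate balance", "Concept"),
]

# Made-up plain-text notation standing in for the highlighting.
MARKS = {
    "Concept": "[{}]",
    "Relation": "<{}>",
    "PathRelevant": "({})",
    "NonRelevant": "{}",
}


def group_by_role(annotated):
    """Collect entity texts per role, preserving sentence order."""
    groups = defaultdict(list)
    for text, role in annotated:
        groups[role].append(text)
    return dict(groups)


def render(annotated):
    """Rebuild the sentence with each entity wrapped in its role marker."""
    return " ".join(MARKS[role].format(text) for text, role in annotated) + "."


print(render(ANNOTATED))
print(group_by_role(ANNOTATED))
```

Under this reading the sentence has three Concepts ("Belgian geuze", "continent", "delicate balance") linked by two Relations, with "its" kept as PathRelevant because it ties "delicate balance" back to the subject.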