# Using A Custom Knowledge Base
This section explains the steps you need to take if you want to use ELEVANT to evaluate linking results for linkers and benchmarks that link to a custom knowledge base or ontology.
Note that some features are not available when using a custom knowledge base. For example, some error categories like metonyms, demonyms (which might not even make sense for your knowledge base) and rare errors cannot be evaluated separately.
Instead of downloading the data files using the `make download_all` command, perform the following steps within the Docker container to set up ELEVANT for your custom KB:
- Remove all subdirectories in `evaluation-results/` and all contents of the `benchmarks/` directory:

  ```
  rm -r evaluation-results/*
  rm benchmarks/*
  ```

  The evaluation results and benchmarks contained in these folders are by default targeted at Wikidata / Wikipedia / DBpedia.
- Run the Python script `scripts/extract_custom_mappings.py` to extract the necessary name and type mappings from your KB. For this script to work, your KB must be in the Turtle (ttl) format.

  ```
  python3 scripts/extract_custom_mappings.py <custom_kb_in_ttl_format> --name_predicate <predicate_for_entity_name> --type_predicate <predicate_for_entity_type>
  ```

  By default, the predicate used to extract the entity name is `http://www.w3.org/2004/02/skos/core#prefLabel` and the predicate used to extract the entity type is `http://www.w3.org/2000/01/rdf-schema#subClassOf`.

  This will create three TSV files in `<data_directory>/custom_mappings/`:
:-
entity_to_name.tsv
where the first column contains the entity URI and the second column contains the entity name, e.g.http://emmo.info/emmo/domain/fatigue#EMMO_c502dbc5-3a11-50c7-baf5-f3ef9c4fe636 Glass Transition Temperature
  - `entity_to_types.tsv`, where the first column contains the entity URI and all further columns contain URIs of types of this entity, e.g.

    ```
    http://emmo.info/emmo/domain/fatigue#EMMO_6e8610b1-1717-53ff-a2ac-3d48950773fc	http://emmo.info/emmo/domain/fatigue#EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28	http://emmo.info/emmo/domain/fatigue#EMMO_cfe4071d-224e-5ae9-abe5-083dc57ee6f9
    ```
  - `whitelist_types.tsv`, where the first column contains the URI of an entity type and the second column contains the name for the entity type, e.g.

    ```
    http://emmo.info/emmo/domain/fatigue#EMMO_15a16e99-19cb-5d5e-84d0-b74029837f28	Mechanical Property
    ```
  In the web app you will then be able to see evaluation results for each of these whitelist types individually. You can manually filter the set of whitelist types (this is especially important if you have a lot of entity types, e.g. more than 50, because the web app becomes cluttered otherwise), but then make sure that the `entity_to_types.tsv` file only contains types that are included in this whitelist, e.g. as in the sketch below.
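  The following is a minimal sketch of such a filtering step using only the Python standard library, assuming the mapping files described above; this script is not part of ELEVANT:

  ```python
  import csv

  # Read the (manually trimmed) whitelist of type URIs.
  with open("whitelist_types.tsv", encoding="utf-8") as f:
      whitelist = {row[0] for row in csv.reader(f, delimiter="\t") if row}

  # Keep only whitelisted types in the entity-to-types mapping.
  with open("entity_to_types.tsv", encoding="utf-8") as f_in, \
          open("entity_to_types.filtered.tsv", "w", encoding="utf-8", newline="") as f_out:
      writer = csv.writer(f_out, delimiter="\t")
      for row in csv.reader(f_in, delimiter="\t"):
          if row:
              entity_uri, types = row[0], [t for t in row[1:] if t in whitelist]
              if types:
                  writer.writerow([entity_uri] + types)
  ```

  Replace `entity_to_types.tsv` with the filtered file afterwards.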
  If you don't have your knowledge base in Turtle format or can't use the script for other reasons, it is enough to create the three TSV files mentioned above yourself and move them to a directory `<data_directory>/custom_mappings/`. A real-world example of how to do this is given in Using A Custom Knowledge Base Example; a rough sketch is also shown below.
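  As an illustration only, this is what such a manual extraction could look like for a Turtle KB using `rdflib` (the input file name is hypothetical, and the predicates mirror the script's defaults described above):

  ```python
  from collections import defaultdict
  from rdflib import Graph, URIRef

  # Predicates mirroring the defaults of extract_custom_mappings.py.
  NAME_PREDICATE = URIRef("http://www.w3.org/2004/02/skos/core#prefLabel")
  TYPE_PREDICATE = URIRef("http://www.w3.org/2000/01/rdf-schema#subClassOf")

  graph = Graph()
  graph.parse("custom_kb.ttl", format="turtle")  # hypothetical input file

  entity_to_name = {}
  entity_to_types = defaultdict(list)
  for subj, _, name in graph.triples((None, NAME_PREDICATE, None)):
      entity_to_name[str(subj)] = str(name)
  for subj, _, typ in graph.triples((None, TYPE_PREDICATE, None)):
      entity_to_types[str(subj)].append(str(typ))

  with open("entity_to_name.tsv", "w", encoding="utf-8") as f:
      for uri, name in entity_to_name.items():
          f.write(f"{uri}\t{name}\n")

  with open("entity_to_types.tsv", "w", encoding="utf-8") as f:
      for uri, types in entity_to_types.items():
          f.write("\t".join([uri] + types) + "\n")

  # whitelist_types.tsv maps each type URI to a readable name; here we
  # fall back to the URI itself if the KB provides no name for the type.
  with open("whitelist_types.tsv", "w", encoding="utf-8") as f:
      all_types = {t for types in entity_to_types.values() for t in types}
      for t in sorted(all_types):
          f.write(f"{t}\t{entity_to_name.get(t, t)}\n")
  ```

  Move the resulting files to `<data_directory>/custom_mappings/` as described above.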
- To add a benchmark that links mentions to your custom knowledge base, run the `add_benchmark.py` script with the option `-c` (for custom KB). The supported benchmark formats for custom KB benchmarks are `nif` and `simple-jsonl`. E.g.

  ```
  python3 add_benchmark.py <benchmark_name> -bfile <benchmark_file> -bformat <nif|simple-jsonl> -c
  ```

  See How To Add A Benchmark for more detailed information on adding a benchmark and the benchmark formats.
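  For orientation, the snippet below sketches a one-mention benchmark in the generic NIF annotation model (a context string, character offsets and an `itsrdf:taIdentRef` link to a custom-KB entity), built with `rdflib`. The document URI and file name are made up, and How To Add A Benchmark remains the authoritative reference for the exact structure ELEVANT expects:

  ```python
  from rdflib import Graph, Literal, Namespace, URIRef
  from rdflib.namespace import RDF, XSD

  NIF = Namespace("http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#")
  ITSRDF = Namespace("http://www.w3.org/2005/11/its/rdf#")

  g = Graph()
  g.bind("nif", NIF)
  g.bind("itsrdf", ITSRDF)

  text = "The glass transition temperature was measured."
  base = "http://example.org/benchmark/doc1"  # hypothetical document URI

  # The full document text as a NIF context.
  context = URIRef(f"{base}#char=0,{len(text)}")
  g.add((context, RDF.type, NIF.Context))
  g.add((context, NIF.isString, Literal(text)))

  # One annotated mention, linked to an entity of the custom KB.
  mention = URIRef(f"{base}#char=4,32")
  g.add((mention, RDF.type, NIF.Phrase))
  g.add((mention, NIF.referenceContext, context))
  g.add((mention, NIF.anchorOf, Literal(text[4:32])))
  g.add((mention, NIF.beginIndex, Literal(4, datatype=XSD.nonNegativeInteger)))
  g.add((mention, NIF.endIndex, Literal(32, datatype=XSD.nonNegativeInteger)))
  g.add((mention, ITSRDF.taIdentRef, URIRef(
      "http://emmo.info/emmo/domain/fatigue#EMMO_c502dbc5-3a11-50c7-baf5-f3ef9c4fe636")))

  g.serialize("my_benchmark.ttl", format="turtle")  # hypothetical output file
  ```

  The same NIF pattern is also one of the accepted formats for linking results in the next step.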
- To add linking results for such a benchmark to ELEVANT, run the `link_benchmark.py` script with the option `-c`. The supported linking result formats for custom KB linking results are `nif` and `simple-jsonl`. E.g.

  ```
  python3 link_benchmark.py <experiment_name> -pfile <linking_results_file> -pformat <nif|simple-jsonl> -b <benchmark_name> -c
  ```

  See How To Add An Experiment for more detailed information on adding linking results to ELEVANT and the linking result formats.
- To evaluate the linking results, run the `evaluate.py` script with the option `-c`, e.g.

  ```
  python3 evaluate.py <linking_result_file> -c
  ```

  where `<linking_result_file>` is the file generated in the previous step. See Evaluating Linking Results for more detailed information.
- Before you start the web app for the first time, run the following command in the `evaluation-webapp` directory:

  ```
  ln -s <data-directory>/custom_mappings/whitelist_types.tsv whitelist_types.tsv
  ```
- You can then start the web app with

  ```
  make start_webapp
  ```

  and inspect your linking results at http://0.0.0.0:8000/.