For each dataset, we prepare the following original data:
- class file class.json: the seen and unseen class information in each dataset, including WordNet ID and literal name;
- attribute file attribute.txt: attribute information, including custom ID and literal name; attribute_hierarchy.owl & attribute_group.json: the categorization information of class attributes;
- attribute annotation file
  - AwA: binaryAtt_splits.mat and att_splits.mat, download from here and put it into the folder ori_data/AwA;
  - ImNet-A/O: class_attribute.json
- ConceptNet (5.7): the full set can be downloaded from here; its English subset can be extracted by running conceptnet_en_subgraph.py
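
The English-subset extraction can be pictured with the following minimal sketch. It is not the code of conceptnet_en_subgraph.py, whose actual logic may differ. It assumes the ConceptNet assertions dump is tab-separated with the fields edge URI, relation, start node, end node, and JSON metadata, and that English concepts are those whose URI starts with /c/en/. The function names and the two sample lines are illustrative only.

```python
import io


def is_english(node_uri):
    """True for concept URIs in the English namespace (/c/en/...)."""
    return node_uri.startswith("/c/en/")


def english_edges(lines):
    """Yield (relation, start, end) for edges whose endpoints are both English.

    Assumed line layout (tab-separated):
    edge URI, relation, start node, end node, JSON metadata.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        relation, start, end = fields[1], fields[2], fields[3]
        if is_english(start) and is_english(end):
            yield relation, start, end


# Two made-up sample lines: one English-to-English edge, one with a French start node.
sample = io.StringIO(
    "/a/[/r/IsA/,/c/en/zebra/,/c/en/animal/]\t/r/IsA\t/c/en/zebra\t/c/en/animal\t{}\n"
    "/a/[/r/IsA/,/c/fr/zebre/,/c/en/animal/]\t/r/IsA\t/c/fr/zebre\t/c/en/animal\t{}\n"
)
subset = list(english_edges(sample))
print(subset)  # [('/r/IsA', '/c/en/zebra', '/c/en/animal')]
```

Requiring both endpoints to be English is one possible choice; a looser filter could keep edges with at least one English endpoint.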

For each dataset, we run the following scripts to construct the KG:
- class_hierarchy.py: build hierarchical structure of classes as the backbone of KG, output class_hierarchy_triples.txt
- attribute_hierarchy.py: build hierarchical structure of attributes, output attribute_hierarchy_triples.txt
- class_attribute_awa.py & class_attribute_imagenet.py: build attribute triples, output class_attribute_triples.txt
- literal.py: add literal information of nodes in graph, output literals.txt
- Link to ConceptNet
  - conceptnet_alignment_extraction.py: align classes and attributes to ConceptNet entities and extract their one-hop neighbor subgraph;
  - conceptnet_entity_alignment.py: save the aligned pairs, output sameAs_triples.txt;
  - conceptnet_repeat_check.py: remove repeated triples, output conceptnet_triples_filter.txt
- Set disjointness semantics
  - disjointness_classes.py: disjointness between different classes, output disjoint_cls_cls_triples.txt
  - disjointness_cls_atts.py: disjointness between classes and attributes, output disjoint_cls_att_triples.txt
- Run output2CSV.py to save the KGs to CSV files. Note that we set different parameters to output KGs with different semantic settings. Taking AwA as an example:
  - generate the KG with all semantics by running: python output2CSV.py --dataset AwA --all
  - generate the KG with semantics of class hierarchy by running: python output2CSV.py --dataset AwA --cls_hie
  - generate the KG with semantics of class hierarchy and attribute hierarchy by running: python output2CSV.py --dataset AwA --cls_hie --att_hie
- Run output2RDF.py to save the KGs as Turtle and XML files.
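
To illustrate how different semantic settings lead to different KGs, here is a minimal sketch of merging the triples of the selected semantics and writing them to CSV. It is not the implementation of output2CSV.py. The keys cls_hie and att_hie mirror the command-line flags shown above, while the function names, the CSV column names, and the toy triples are assumptions made for illustration.

```python
import csv
import io


def select_triples(triples_by_semantic, enabled):
    """Merge the triples of the enabled semantics.

    Exact duplicates are dropped and first-seen order is kept.
    """
    seen = set()
    merged = []
    for name in enabled:
        for triple in triples_by_semantic[name]:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def write_csv(triples, handle):
    """Write (head, relation, tail) rows; the header names are hypothetical."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["head", "relation", "tail"])
    writer.writerows(triples)


# Toy triples, made up for this example; the last cls_hie entry is a deliberate duplicate.
toy = {
    "cls_hie": [
        ("zebra", "subClassOf", "equine"),
        ("equine", "subClassOf", "mammal"),
        ("zebra", "subClassOf", "equine"),
    ],
    "att_hie": [
        ("stripes", "subClassOf", "pattern"),
    ],
}

# Analogous to passing --cls_hie --att_hie: both semantics are included.
kg = select_triples(toy, ["cls_hie", "att_hie"])
buffer = io.StringIO()
write_csv(kg, buffer)
print(buffer.getvalue())
```

Passing only ["cls_hie"] as the enabled list would yield the class-hierarchy-only KG, in the same spirit as running the script with --cls_hie alone.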