jctoledo edited this page Feb 3, 2012 · 5 revisions

The linked data that forms part of Bio2RDF adheres to a simple set of modeling patterns that permit the different datasets to interoperate seamlessly.

==Identifiers==

===Entities===

The first step of the RDFization process is to use a consistent identifier scheme so that data can be syntactically integrated across the Bio2RDF network. Bio2RDF identifiers are given by the following URI pattern:

''namespace'':''identifier''

where the ''namespace'' is a short name, listed in our dataset registry, that uniquely identifies the source (dataset or database). The ''identifier'' is the (alpha)numeric string assigned to identify that entity. For instance, the gene identified by the number 15275 in the NCBI Entrez Gene database (namespace = geneid) has the following identifier:

''geneid'':''15275''
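The identifier pattern can be sketched as a small helper. This is only an illustration, not part of the Bio2RDF tooling: the function names `bio2rdf_id` and `bio2rdf_uri` are hypothetical, the namespace check is a simplifying assumption, and the base address `http://bio2rdf.org/` is assumed rather than taken from this page.

```python
import re

# Assumed base address for resolvable Bio2RDF URIs (not stated on this page).
BASE = "http://bio2rdf.org/"

# Assumption for illustration: namespaces are short lowercase names.
_NAMESPACE_RE = re.compile(r"^[a-z][a-z0-9_.-]*$")


def bio2rdf_id(namespace: str, identifier: str) -> str:
    """Return the 'namespace:identifier' form of a Bio2RDF identifier."""
    identifier = identifier.strip()
    if not _NAMESPACE_RE.match(namespace):
        raise ValueError(f"unexpected namespace: {namespace!r}")
    if not identifier:
        raise ValueError("identifier must not be empty")
    return f"{namespace}:{identifier}"


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Return the full URI by prefixing the assumed base address."""
    return BASE + bio2rdf_id(namespace, identifier)


print(bio2rdf_id("geneid", "15275"))
print(bio2rdf_uri("geneid", "15275"))
```

Keeping the short form and the full URI in separate functions makes it easy to swap the base address if the assumption above does not hold for your deployment.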

===Vocabulary===

The Bio2RDF URI scheme applies not only to data entries but also to the vocabulary (types and relations) used to describe these entries:

''namespace''_term:''term''

For example, the gene identified by geneid:15275 is a kind of Gene, as defined by Entrez Gene:

''geneid''_term:''Gene''
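A minimal sketch of the vocabulary pattern follows. The helper names `bio2rdf_term` and `type_triple` are hypothetical and exist only for this example; the output uses the prefixed form shown on this page rather than full URIs.

```python
def bio2rdf_term(namespace: str, term: str) -> str:
    """Return the 'namespace_term:term' form used for types and relations."""
    if not namespace or not term:
        raise ValueError("namespace and term must not be empty")
    return f"{namespace}_term:{term}"


def type_triple(namespace: str, identifier: str, type_name: str) -> str:
    """Return one Turtle statement typing an entry with a source-defined type."""
    subject = f"{namespace}:{identifier}"
    return f"{subject} rdf:type {bio2rdf_term(namespace, type_name)} ."


print(type_triple("geneid", "15275", "Gene"))
```

Because the vocabulary namespace is derived from the dataset namespace, the type of an entry always points back to the source that defined it.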

==Descriptions==

===Minimum Annotations===

Each resource should contain the following annotations:

dc:title a human-readable title as it appears in the source data.

dc:identifier a string that contains the identifier using the pattern namespace:identifier.

rdfs:label a Bio2RDF-generated label containing the title followed by the identifier, "title [ns:id]". By convention, most RDF browsers use it to render the name of a resource instead of its URI.

Taken together:

geneid:15275 rdfs:label "Hk1 [geneid:15275]" ;
    dc:title "Hk1" ;
    dc:identifier "geneid:15275" ;
    rdf:type geneid_term:Gene .
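The minimum annotations can be generated mechanically from a namespace, an identifier, a title, and a type. The sketch below is an illustration only: `minimum_annotations` and `escape_literal` are hypothetical names, and the escaping covers just backslashes and double quotes, which is an assumption about what titles may contain.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def minimum_annotations(namespace: str, identifier: str,
                        title: str, type_name: str) -> str:
    """Return the minimum annotations for one resource as a Turtle snippet."""
    curie = f"{namespace}:{identifier}"
    safe_title = escape_literal(title)
    lines = [
        # Label convention from this page: "title [ns:id]"
        f'{curie} rdfs:label "{safe_title} [{curie}]" ;',
        f'    dc:title "{safe_title}" ;',
        f'    dc:identifier "{curie}" ;',
        f'    rdf:type {namespace}_term:{type_name} .',
    ]
    return "\n".join(lines)


print(minimum_annotations("geneid", "15275", "Hk1", "Gene"))
```

The snippet omits prefix declarations; a complete Turtle file would also need `@prefix` lines for each namespace used.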

===Datasets, Records and Entities===

We recognize at least three kinds of entities in biological information resources: physical entities, records, and datasets.

  1. Record

Records are information objects that contain a set of statements, primarily about the subject.

namespace_record:identifier bio2rdf_term:has-primary-subject namespace:identifier .
namespace:identifier bio2rdf_term:is-described-by namespace_record:identifier .

  2. Dataset

Datasets are collections of records.

bio2rdf_dataset: bio2rdf_term:has-item namespace_record:identifier .

Since datasets can be versioned, we describe each version as part of its parent dataset:

bio2rdf_dataset:namespace.version dc:hasVersion "13" ;
    dc:partOf bio2rdf_dataset:namespace .
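The record and dataset statements can be sketched as a function that returns triples for one entry. This is an illustration under stated assumptions: `record_triples` is a hypothetical name, the dataset is assumed to be named `bio2rdf_dataset:` followed by the namespace, and the versioned dataset name is assumed to be formed by appending a dot and the version string.

```python
def record_triples(namespace: str, identifier: str, version: str = None):
    """Return (subject, predicate, object) triples linking entity, record, dataset."""
    entity = f"{namespace}:{identifier}"
    record = f"{namespace}_record:{identifier}"
    # Assumption: the dataset is named after the namespace.
    dataset = f"bio2rdf_dataset:{namespace}"

    triples = [
        (record, "bio2rdf_term:has-primary-subject", entity),
        (entity, "bio2rdf_term:is-described-by", record),
        (dataset, "bio2rdf_term:has-item", record),
    ]
    if version is not None:
        # Assumption: versioned dataset name is '<dataset>.<version>'.
        versioned = f"{dataset}.{version}"
        triples.append((versioned, "dc:hasVersion", f'"{version}"'))
        triples.append((versioned, "dc:partOf", dataset))
    return triples


for s, p, o in record_triples("geneid", "15275", version="13"):
    print(f"{s} {p} {o} .")
```

Returning plain tuples keeps the sketch independent of any RDF library; the same triples could be handed to a serializer of your choice.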

==Mappings==

This section describes how to create mappings from your dataset-specific vocabulary to SIO.


==Scripts==

:Category:Scripts


==Loading==

Loading the RDF database