Linked Environment Data
Since 2010 several projects of the German Federal Environment Agency (FEA) have been contributing to the creation of a public data network based on Linked Data. This effort was started with the Environmental Specimen Bank (ESB) and the Semantic Network Service (SNS), with further information systems being considered for inclusion. It is part of an international collaboration effort with the Ecoinformatics Initiative. innoQ is instrumental in enabling the implementation using the Linked Data approach.
Linking environmental data and terminology has been of interest to the FEA since the 1990s, with several projects having been conducted in this area (UMPLIS, UDK, GEIN, SNS, PortalU). However, the existing implementations share two shortcomings:
- Only data containers (databases, information systems, complex web pages) have been linked rather than individual records.
- There has been no common access to a shared data structure, so references have only been meaningful in the context of the host system.
It is these shortcomings that the Linked Data approach is meant to overcome.
The Environmental Specimen Bank (ESB) records the accumulation of (harmful) substances in test subjects at certain locations and times. However, the ESB itself is not responsible for comprehensively describing all relevant elements, so it should reference specialized information instead. For substances, such data is provided by GSBL; for species, by EUNIS; for locations and times, by SNS's geo thesaurus and environmental chronicle, respectively. The environmental thesaurus (UMTHES) provides an overarching envelope, which is in turn linked with the international GEMET.
Each record in the ESB can link directly to the information from those specialized systems. Ideally, those systems provide a back-reference, enabling two-way navigation.
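The two-way navigation described above can be illustrated with a minimal sketch. All URIs and property names below are hypothetical placeholders, not identifiers used by the ESB, GSBL or EUNIS; the point is only that a back-reference is the same link read from the object's side.

```python
from collections import defaultdict

# Hypothetical URIs, for illustration only; none of them is a real identifier.
ESB_RECORD = "http://example.org/esb/record/42"
SUBSTANCE = "http://example.org/gsbl/substance/lead"
SPECIES = "http://example.org/eunis/species/herring-gull"

# Outgoing links of one ESB record as (subject, predicate, object) triples.
links = [
    (ESB_RECORD, "http://example.org/esb/def/analyte", SUBSTANCE),
    (ESB_RECORD, "http://example.org/esb/def/specimenType", SPECIES),
]


def back_references(triples):
    """Index triples by object, so a target system can list who links to it."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[obj].append((subject, predicate))
    return dict(index)


incoming = back_references(links)
print(incoming[SUBSTANCE])
```

A specialized system that keeps such an index could render "referenced by" links on its own pages without storing the ESB records themselves.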
In addition to the information systems mentioned so far, there are numerous specialized systems operated independently of governmental agencies, e.g. Chemical Entities of Biological Interest (ChEBI) or GeoNames. Whether those should be referenced is merely a matter of policy - the technical opportunity exists.
Each participating system needs to represent its data in Resource Description Framework (RDF) format so that references can be cross-linked. Based on this, individual models (RDF schema or "vocabulary") are described and applied. These are roughly comparable to object-relational models, but exceed them in expressiveness. There is already a large number of established RDF vocabularies, which can - and should - be used, combined and extended.
- The ESB's data model can be represented using the Data Cube vocabulary (previously SCOVO). Some extensions are required to represent the domain-specific dimensions (sample type, analyte, location).
- UMTHES's RDF model is an application of the Simple Knowledge Organization System (SKOS).
- The environmental chronicle's RDF model is an extension of the Event Ontology.
- The geo thesaurus's RDF model is based on the GeoNames Ontology and the WGS84 Geo Positioning vocabulary.
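As a rough illustration of the first point, a single measurement could be expressed as a Data Cube observation along the following lines. This is a sketch under assumptions: `qb:Observation` and `qb:dataSet` are terms of the Data Cube vocabulary, while the `http://example.org/esb/` namespace, the dimension properties (`sampleType`, `analyte`, `location`, `concentration`) and the measured value are invented for this example and do not reflect the actual ESB model.

```python
QB = "http://purl.org/linked-data/cube#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/esb/"  # hypothetical namespace for ESB extensions


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype):
    """Build a typed N-Triples literal."""
    return f'"{value}"^^<{datatype}>'


def observation_triples(obs_id, sample_type, analyte, location, value):
    """Describe one measurement as a qb:Observation with three dimensions."""
    subject = iri(f"{EX}observation/{obs_id}")
    return [
        (subject, iri(RDF + "type"), iri(QB + "Observation")),
        (subject, iri(QB + "dataSet"), iri(EX + "dataset/main")),
        # Domain-specific dimensions (hypothetical property names).
        (subject, iri(EX + "def/sampleType"), iri(sample_type)),
        (subject, iri(EX + "def/analyte"), iri(analyte)),
        (subject, iri(EX + "def/location"), iri(location)),
        # The measured value (made-up number, for illustration only).
        (subject, iri(EX + "def/concentration"), literal(value, XSD + "decimal")),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = observation_triples(
    "1",
    "http://example.org/eunis/species/herring-gull",
    "http://example.org/gsbl/substance/lead",
    "http://example.org/sns/geo/some-place",
    "0.12",
)
print(to_ntriples(triples))
```

The dimension values are URIs pointing into the other systems, which is what makes each observation a link into EUNIS, GSBL or the geo thesaurus rather than a self-contained record.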
It does not appear efficient for each participating information system to implement Linked Data mechanisms on its own. Instead, the FEA will implement a dedicated Linked Data server as a shared proxy, which dereferences all URIs, redirects to the individual systems' HTML representation if necessary and also provides a SPARQL endpoint.
Each participating system then only has to provide the respective RDF representation of its own data sets, and notify the Linked Data server of any modifications.
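How such a shared proxy might decide between serving RDF and redirecting to a system's HTML page can be sketched as follows. The text does not specify the mechanism, so this assumes content negotiation on the `Accept` header and a `303 See Other` redirect, a common Linked Data convention. The registry, host names and function names are hypothetical, and q-values in the `Accept` header are ignored for brevity.

```python
# Hypothetical registry: URI path prefix -> base URL of the owning system's HTML pages.
HTML_BASES = {
    "/esb/": "http://esb.example.org/html/",
    "/umthes/": "http://umthes.example.org/html/",
}

# RDF serializations this sketch is willing to serve.
RDF_TYPES = ("text/turtle", "application/rdf+xml")


def requested_rdf_type(accept_header):
    """Return the first RDF media type listed in the Accept header, or None."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RDF_TYPES:
            return media_type
    return None


def dereference(path, accept_header):
    """Return (status, headers) for a request to the shared Linked Data server."""
    for prefix, html_base in HTML_BASES.items():
        if path.startswith(prefix):
            rdf_type = requested_rdf_type(accept_header)
            if rdf_type is not None:
                # The proxy serves the RDF it received from the owning system.
                return 200, {"Content-Type": rdf_type}
            # Otherwise send the client to the owning system's HTML page.
            return 303, {"Location": html_base + path[len(prefix):]}
    return 404, {}


print(dereference("/esb/record/42", "text/turtle"))
print(dereference("/esb/record/42", "text/html"))
```

With this division of labor, a participating system only needs to keep its HTML pages reachable and deliver its RDF to the proxy; URI handling stays in one place.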
Based on this, further visualization services can be implemented, e.g. as is being tested in the US government's Data.gov project.
- innoQ designs and implements Linked Data technologies for both the Semantic Network Service (SNS) and the Environmental Specimen Bank.
- We maintain and coordinate development of the iQvoc open source framework.
- We contribute to the coordination and expansion of the Linked Data initiative through active participation in the W3C eGov Interest Group, the Ecoinformatics Initiative as well as via numerous workshops and conferences.