A library for mapping DataCite XML to Ruby objects, based on xml-mapping and xml-mapping_extensions. Full API documentation is available on RubyDoc.info.
Supports Datacite 4.0; backward-compatible with Datacite 3.1.
The core of the Datacite::Mapping library is the Resource class, corresponding to the root <resource/> element in a Datacite document.
To create a Resource object from XML, use Resource.parse_xml or Resource.load_from_file, depending on the data source:
XML source | Method to use |
---|---|
file path | Resource.load_from_file |
String | Resource.parse_xml |
IO | Resource.parse_xml |
REXML::Document | Resource.parse_xml |
REXML::Element | Resource.parse_xml |
Example:
require 'datacite/mapping'
include Datacite::Mapping
resource = Resource.load_from_file('datacite-example-full-v4.0.xml')
# => #<Datacite::Mapping::Resource:0x007f97689e87a0 …
abstract = resource.descriptions.find { |d| d.type == DescriptionType::ABSTRACT }
# => #<Datacite::Mapping::Description:0x007f976aafa330 …
abstract.value
# => "XML example of all DataCite Metadata Schema v4.0 properties."
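The source-to-method table above can also be read as a small dispatcher. The following is only a sketch: reader_method_for is a hypothetical helper, not part of Datacite::Mapping, and the rule that a string starting with '<' is XML content rather than a file path is an assumption made for illustration.

```ruby
require 'rexml/document'

# Hypothetical helper, not part of Datacite::Mapping: returns the name of the
# Resource method that the table pairs with the given XML source.
def reader_method_for(source)
  case source
  when String
    # Assumption: a string starting with '<' is XML content; anything else
    # is treated as a file path.
    source.lstrip.start_with?('<') ? :parse_xml : :load_from_file
  when IO, REXML::Element
    # REXML::Document is a subclass of REXML::Element, so this covers both.
    :parse_xml
  else
    raise ArgumentError, "unsupported XML source: #{source.class}"
  end
end

reader_method_for('<resource/>')                    # => :parse_xml
reader_method_for('datacite-example-full-v4.0.xml') # => :load_from_file
```

With the gem loaded, the returned name could be passed on as Resource.public_send(reader_method_for(source), source).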
Note that Datacite::Mapping uses the TypesafeEnum gem to represent controlled vocabularies such as ResourceTypeGeneral and DescriptionType.
In general, a Resource object must be provided with all required attributes on initialization.
resource = Resource.new(
  identifier: Identifier.new(value: '10.5555/12345678'),
  creators: [
    Creator.new(
      name: 'Josiah Carberry',
      identifier: NameIdentifier.new(
        scheme: 'ORCID',
        scheme_uri: URI('http://orcid.org/'),
        value: '0000-0002-1825-0097'
      ),
      affiliations: [
        'Department of Psychoceramics, Brown University'
      ]
    )
  ],
  titles: [
    Title.new(value: 'Toward a Unified Theory of High-Energy Metaphysics: Silly String Theory')
  ],
  publisher: 'Journal of Psychoceramics',
  publication_year: 2008
)
# => #<Datacite::Mapping::Resource:0x007f9768958fb0 …
To create XML from a Resource object, use Resource.write_xml, Resource.save_to_file, or Resource.save_to_xml, depending on the destination:
XML destination | Method to use |
---|---|
XML string | Resource.write_xml |
file path | Resource.save_to_file |
REXML::Element | Resource.save_to_xml |
Example:
resource.write_xml
# => "<resource xsi:schemaLocation='http://datacite.org/schema/kernel-4 …
To set a prefix for the Datacite namespace, use Resource.namespace_prefix=:
resource.namespace_prefix = 'dcs'
resource.write_xml
# => "<dcs:resource xmlns:dcs='http://datacite.org/schema/kernel-4' …
In general, Datacite::Mapping is lax on read: it accepts Datacite 3, Datacite 4, or a mix of the two, and (mostly for historical reasons involving bad data its authors needed to parse) allows some deviations from the schema. By default it writes Datacite 4, but it can write Datacite 3 if you pass an optional mapping argument to any of the writer methods:
resource.write_xml(mapping: :datacite_3) # note schema URL below
# => "<resource xsi:schemaLocation='http://datacite.org/schema/kernel-3 …
When using the :datacite_3 mapping, the Datacite 4 <geoLocationPolygon/> and <fundingReference/> elements, which are not supported in Datacite 3, will be dropped, with a warning. Any <relatedIdentifier/> elements of type IGSN will be converted to Handle identifiers with prefix 10273 (the prefix of the IGSN resolver).
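To make the IGSN conversion concrete, here is a minimal sketch of what rewriting an IGSN as a Handle under prefix 10273 might look like. It is not the library's implementation: igsn_to_handle and IGSN_HANDLE_PREFIX are hypothetical names, and stripping an optional leading "IGSN:" label is an assumption. Only the prefix/suffix Handle form and the prefix 10273 come from the description above.

```ruby
# Prefix of the IGSN resolver in the Handle system (from the text above).
IGSN_HANDLE_PREFIX = '10273'

# Hypothetical helper, not part of Datacite::Mapping: rewrites an IGSN value
# as a Handle of the form "<prefix>/<suffix>".
def igsn_to_handle(igsn)
  # Assumption: an optional, case-insensitive "IGSN:" label may precede the value.
  suffix = igsn.strip.sub(/\Aigsn:\s*/i, '')
  raise ArgumentError, 'empty IGSN' if suffix.empty?

  "#{IGSN_HANDLE_PREFIX}/#{suffix}"
end

igsn_to_handle('IGSN:SSH000SUA') # => "10273/SSH000SUA"
igsn_to_handle('SSH000SUA')      # => "10273/SSH000SUA"
```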
Datacite::Mapping is released under an MIT license. When submitting a pull request, please make sure the Rubocop style checks pass and that the unit tests pass with 100% coverage; you can check these individually with bundle exec rubocop and bundle exec rake coverage, or run the default rake task, which includes both, with bundle exec rake.