Home
Welcome to the Spotlight OAI-PMH MODS harvester. Given a URL, a set, and a mapping file, the harvester imports sets of MODS or Solr metadata into Spotlight. The mapping file is a YAML file that maps Spotlight fields to MODS paths or Solr paths. The harvester relies on two gems: mods and ruby-oai.
A mapping entry for MODS metadata takes this form:
`- spotlight-field: xxx (field names should be lowercase, separated with dashes except for the suffix: firstpart-secondpart_ssim or _tesim)
  multivalue-breaks: "yes" (optional; use this to split out multiple values so each can be broken on or faceted on individually, e.g. subjects)
  default-value: xxx (optional)
  delimiter: xxx (optional, what to separate all path values with. Defaults to a space)
  mods:
    - path: xxx (repeatable; all path fields will be concatenated)
      delimiter: xxx (optional)
      attribute: xxx (optional)
      attribute-value: xxx (optional but paired with attribute)
      mods-path: xxx (optional)
      mods-value: xxx (optional)
      subpaths: (optional)
        - subpath: xxx
        - subpath: xxx
  xpath:
    - xpath-value: xxx (repeatable)
      xpath-namespace-prefix: xxx (optional)
      xpath-namespace-def: xxx (optional)`
NOTE: The spotlight-field name comes from the custom metadata field that you first add to Spotlight. The label is lowercased and its spaces become dashes: 'Creation Date' becomes creation-date_ssim/tesim, and 'CAPITOL Name' becomes capitol-name_ssim/tesim.
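The naming rule in the note can be sketched in a few lines of Ruby. The helper `spotlight_field_name` is hypothetical and not part of the harvester; it assumes the only transformations are lowercasing and turning whitespace into dashes, which is what the two examples above suggest.

```ruby
# Hypothetical helper (not part of the harvester): derive the mapping-file
# field name from a Spotlight custom field label.
# Assumption: the label is only lowercased and whitespace becomes dashes.
def spotlight_field_name(label, suffix: 'tesim')
  "#{label.strip.downcase.gsub(/\s+/, '-')}_#{suffix}"
end

puts spotlight_field_name('Creation Date', suffix: 'ssim')
puts spotlight_field_name('CAPITOL Name')
```

Labels containing punctuation or other special characters may be handled differently by Spotlight, so check the generated field name in your exhibit before relying on it.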
Some working examples show how the fields are used.
Most basic:
`- spotlight-field: unique-id_tesim
  mods:
    - path: recordInfo/recordIdentifier`
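To check that a mapping file parses the way you expect, it can be loaded with Ruby's standard `yaml` library. This is a minimal sketch, not the harvester's own loading code. The indentation in the embedded YAML is an assumption about how the keys nest: `mods` is a key of the field entry and holds a list of path entries.

```ruby
require 'yaml'

# Assumed nesting: each list item is one field entry; `mods` holds a list
# of path entries belonging to that field.
MAPPING_YAML = <<~YAML
  - spotlight-field: unique-id_tesim
    mods:
      - path: recordInfo/recordIdentifier
YAML

mapping = YAML.safe_load(MAPPING_YAML)

# Print each Spotlight field with the MODS paths that feed it.
mapping.each do |entry|
  field = entry['spotlight-field']
  paths = entry.fetch('mods', []).map { |path_entry| path_entry['path'] }
  puts "#{field} <- #{paths.join(', ')}"
end
```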
Use of attributes. This gets the start date, for example 1788:
`- spotlight-field: start-date_tesim
  mods:
    - path: originInfo/dateCreated
      attribute: point
      attribute-value: start`
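The attribute filter can be pictured as follows. This sketch models the `dateCreated` nodes as plain Ruby structs instead of parsed MODS XML, and `select_by_attribute` is a hypothetical name, not a harvester method. The end date 1790 is made up for illustration.

```ruby
# Stand-in for a MODS node: its text plus its XML attributes.
DateNode = Struct.new(:text, :attributes)

# Illustrative data: two dateCreated nodes that differ in their `point`.
dates = [
  DateNode.new('1788', { 'point' => 'start' }),
  DateNode.new('1790', { 'point' => 'end' })
]

# Keep only the nodes whose attribute equals the requested value.
def select_by_attribute(nodes, attribute, value)
  nodes.select { |node| node.attributes[attribute] == value }
end

start_dates = select_by_attribute(dates, 'point', 'start').map(&:text)
puts start_dates.inspect
```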
Use of subpaths:
`- spotlight-field: subjects_ssim
  delimiter: "|"
  mods:
    - path: subject
      delimiter: "--"
      subpaths:
        - subpath: name/namePart
        - subpath: topic
        - subpath: geographic
        - subpath: genre`
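The two delimiters in this example work at different levels. The sketch below assumes that the path-level delimiter ("--") joins the subpath values found under one subject, and the field-level delimiter ("|") joins the subjects themselves. The sample subjects and the helper `join_subjects` are invented for illustration.

```ruby
# Each inner array stands for the subpath values found under one <subject>
# node. The values are made up for illustration.
subjects = [
  ['Adams, John', 'Correspondence'],
  ['Boston (Mass.)', 'Maps']
]

# Join subpath values with the path delimiter, then join the resulting
# subjects with the field delimiter.
def join_subjects(subjects, path_delimiter: '--', field_delimiter: '|')
  subjects.map { |parts| parts.join(path_delimiter) }.join(field_delimiter)
end

joined = join_subjects(subjects)
puts joined

# Assumption: with multivalue-breaks set to "yes", the combined string is
# split back on the field delimiter so each subject is indexed separately.
puts joined.split('|').inspect
```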
Use of mods-path to get the creator:
`- spotlight-field: creator_tesim
  mods:
    - path: plain_name
      delimiter: " , "
      mods-path: role/roleTerm
      mods-value: creator
      subpaths:
        - subpath: namePart`
Lastly, an exclamation mark can be used in the values (attribute-value or mods-value) or in the attribute itself to exclude matches. The first example gives the date without an attribute of 'point':
`- spotlight-field: date_tesim
  mods:
    - path: originInfo/dateCreated
      attribute: '!point'
      attribute-value: (note: this key is required to exist but is left blank because we don't need an actual value)`
The second example gives any name without a role of creator:
`- spotlight-field: contributer_tesim
  delimiter: " , "
  mods:
    - path: plain_name
      delimiter: " , "
      mods-path: role/roleTerm
      mods-value: '!creator'
      subpaths:
        - subpath: namePart`
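The exclusion rule can be read as "a leading exclamation mark inverts the comparison". The sketch below illustrates that reading with a hypothetical `value_matches?` helper and invented names; it is not the harvester's implementation.

```ruby
# Hypothetical matcher: a leading "!" on the expected value inverts the test.
def value_matches?(actual, expected)
  if expected.start_with?('!')
    actual != expected[1..-1]
  else
    actual == expected
  end
end

# Illustrative name records with their role terms.
names = [
  { name: 'Doe, Jane', role: 'creator' },
  { name: 'Roe, Richard', role: 'contributor' }
]

creators     = names.select { |n| value_matches?(n[:role], 'creator') }
contributors = names.select { |n| value_matches?(n[:role], '!creator') }

puts creators.map { |n| n[:name] }.inspect
puts contributors.map { |n| n[:name] }.inspect
```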
A sample mapping file used for our virtual collections can be found here: vc_mapping.yml
A mapping entry for Solr metadata takes this form:
`- spotlight-field: xxx (field names should be separated with dashes except for the suffix: firstpart-secondpart_ssim or _tesim)
  multivalue-breaks: "yes" (optional; use this to split out multiple values so each can be broken on and faceted on individually, e.g. subjects)
  default-value: xxx (optional)
  delimiter: xxx (optional, what to separate all path values with. Defaults to a space)
  solr-field:
    - field-name: xxx (repeatable; all path fields will be concatenated)`
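As a rough illustration of the concatenation described for `field-name`, the sketch below joins the values of several Solr fields with the delimiter, defaulting to a space. The Solr document, its field names, and the helper `concat_solr_fields` are all hypothetical.

```ruby
# Illustrative Solr document; field names and values are made up.
solr_doc = {
  'title_tesim'    => ['Map of Boston'],
  'subtitle_tesim' => ['Surveyed 1788']
}

# Collect the values of each listed field, in order, and join them.
# Fields missing from the document contribute nothing.
def concat_solr_fields(doc, field_names, delimiter: ' ')
  field_names.flat_map { |name| Array(doc[name]) }.join(delimiter)
end

puts concat_solr_fields(solr_doc, %w[title_tesim subtitle_tesim])
puts concat_solr_fields(solr_doc, %w[title_tesim missing_field], delimiter: '|')
```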