Indices spec
For each index, certain information is required for proper display to the user. This page details the requirements for each index and specifications deriving from those.
Each index requires its items to have, for each instance:
- A link to the inscription
- The identifier of the inscription
- The number of the text part containing the instance
- The line number containing the instance
- An indicator of whether it is partially or completely restored
Do specific indices have further requirements? IOSPE has tei:num indexed alongside whether it is a simple value, an at least value, or an at most value, but I don't see any use made of this on the front end.
Each item needs to have its language indexed.
In addition to the actual index, each index needs some or all of: title, introduction/preamble, notes, and index-specific table headings.
IOSPE's Solr index is large and cumbersome to work with (in terms of the time taken to index and the need for a special script), and EFES's approach is designed to avoid that. Rather than having a single document for each instance of an index term, every instance for an item is grouped as multiple values within a single doc. This requires encoding all of the information for an instance into a single value (easily doable for identifier, text part number, line number, and restoration state). It does, however, preclude faceting on these indices.
This approach also requires operating on all of the inscriptions at once.
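The grouping described above can be sketched roughly as follows. This is an illustration only, not the EFES implementation (which does this work in XSLT): the `Instance` class, the `index_item_name` field, and the "1"/"0" encoding of the restoration flag are assumptions made for the example. Only the `index_instance_location` field name and the "#"-separated layout come from this page.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instance:
    """One occurrence of an index term (hypothetical structure)."""
    term: str
    subdir: str                  # subdirectory of content/xml
    path: str                    # document path without the .xml extension
    text_parts: Tuple[int, ...]  # text part numbers, outermost first
    line: str                    # line number containing the instance
    restored: bool               # partially or completely restored


def encode_location(inst: Instance) -> str:
    """Pack everything needed to link to an instance into one string."""
    parts = ".".join(str(p) for p in inst.text_parts)
    flag = "1" if inst.restored else "0"  # assumed Boolean encoding
    return "#".join([inst.subdir, inst.path, parts, inst.line, flag])


def group_by_term(instances: List[Instance]) -> List[Dict[str, object]]:
    """Build one doc per term, with every instance as a value of one field."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for inst in instances:
        grouped[inst.term].append(encode_location(inst))
    return [
        {"index_item_name": term, "index_instance_location": locations}
        for term, locations in grouped.items()
    ]
```

Because all instances of a term must be collected before its doc can be written, the function needs the full list of instances up front, which mirrors the requirement to operate on all inscriptions at once.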
See https://github.com/EpiDoc/EFES/issues/32
As noted in the section above, faceting is not available, because the Solr index groups all instances of the same term into a single doc.
Indices are specified in TEI XML files in content/xml/indices, one file per type of XML document to which the indices defined therein apply. For EpiDoc files (which live in content/xml/epidoc/), the indices are defined in content/xml/indices/epidoc.xml.
Each index is defined within the tei:body, in an IDed tei:div with a tei:head, an optional tei:div[@type='notes'] and an optional tei:div[@type='headings']. The heading and notes are rendered into HTML in the display of the index. The headings, if specified (in a tei:list), provide the explicit headings for the index table.
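To make the structure of an index definition concrete, here is a minimal sketch that reads one with Python's standard library. The sample document is invented for illustration: the "persons" ID, the text content, and the use of tei:item inside the tei:list are assumptions, and the TEI header is omitted for brevity. EFES itself processes these files with XSLT, not Python.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, trimmed index definition file (no teiHeader).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div xml:id="persons">
        <head>Persons</head>
        <div type="notes"><p>Names are listed in their attested form.</p></div>
        <div type="headings">
          <list>
            <item>Name</item>
            <item>References</item>
          </list>
        </div>
      </div>
    </body>
  </text>
</TEI>"""


def text_of(element):
    """Return an element's text content with whitespace normalised."""
    return " ".join("".join(element.itertext()).split())


def read_index_definitions(xml_text):
    """Collect the title, notes and table headings of each defined index."""
    root = ET.fromstring(xml_text)
    definitions = []
    for div in root.findall(".//tei:body/tei:div", NS):
        head = div.find("tei:head", NS)
        notes = div.find("tei:div[@type='notes']", NS)  # optional
        headings = div.findall(
            "tei:div[@type='headings']/tei:list/tei:item", NS  # optional
        )
        definitions.append({
            "id": div.get(XML_ID),
            "title": text_of(head) if head is not None else None,
            "notes": text_of(notes) if notes is not None else None,
            "headings": [text_of(h) for h in headings],
        })
    return definitions
```

An index that omits the notes or headings divs simply yields `None` or an empty list for those keys.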
Solr indexing is done through the usual Kiln process, with various map:match elements defined in sitemaps/solr.xmap that break the process down into useful pieces. solr.xmap#local-solr-add-indices handles a single index file, creating a document that XIncludes Cocoon URLs to solr.xmap#local-solr-add-index, which is responsible for creating the Solr doc for a specific index within an index file. This makes use of index-specific XSLT in stylesheets/solr. These XSLT follow a common pattern of looping over groups of nodes that share an index term, and creating a doc for each.
The Solr field index_instance_location keeps track of the various pieces of information needed to render a title and link to a document containing the index item. It uses a string with multiple parts separated by "#" to do this. Those parts are: subdirectory of content/xml containing the document; the path to the document, relative to that subdirectory (and without the file extension, which is assumed to be .xml); the text part numbers in descending hierarchical sequence, separated by "."; the line number of the instance; and a Boolean marker for whether the instance is restored or not. This string is then parsed on the display side to create a rendering that can mimic that used in IOSPE.
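As an illustration of the string layout just described, the following sketch splits an index_instance_location value back into its five parts. It is not the EFES display code, which does this in XSLT. The `InstanceLocation` type, the function names, and the assumption that the Boolean marker is written as "1" or "0" are hypothetical.

```python
from typing import NamedTuple, Tuple


class InstanceLocation(NamedTuple):
    subdir: str                  # subdirectory of content/xml
    path: str                    # document path, without the .xml extension
    text_parts: Tuple[str, ...]  # text part numbers, outermost first
    line: str                    # line number of the instance
    restored: bool               # whether the instance is restored


def parse_location(value: str) -> InstanceLocation:
    """Split a '#'-separated location string into its five parts."""
    fields = value.split("#")
    if len(fields) != 5:
        raise ValueError(f"expected 5 '#'-separated parts, got {len(fields)}")
    subdir, path, parts, line, flag = fields
    text_parts = tuple(parts.split(".")) if parts else ()
    # Assumption: the Boolean marker is serialised as "1" (restored) or "0".
    return InstanceLocation(subdir, path, text_parts, line, flag == "1")


def source_file(location: InstanceLocation) -> str:
    """Rebuild the document path, re-adding the assumed .xml extension."""
    return f"content/xml/{location.subdir}/{location.path}.xml"
```

Keeping the text parts as a tuple preserves their hierarchical order, so a display layer can join them however the target rendering requires.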
The templating for HTML display of an index is designed to allow for a lot of customisation of individual indices. Most index-specific templates will simply inherit from a more general one (e.g., index-epidoc.xml). The XSLT stylesheets/tei/indices-epidoc.xsl and stylesheets/tei/indices.xsl are responsible for displaying the index. Any custom fields for an index will require adding an xsl:template matching on the Solr arr or str for the field, and an xsl:apply-templates in the template matching on result/doc to that.