Create named requirements for mets root xmlns and schemaLocation #408

Open
koit opened this issue Apr 19, 2019 · 12 comments
Assignees
Labels
enhancement: Issues that are an enhancement needed to be evaluated and action decided
Proof Read: Issues to be dealt with in a final proof read.
Solved?: Have this issue been handled? Often used in conjunction with the label "help wanted"
v2-metsRoot: Issues to be picked up under mets root review

Comments

@koit
Contributor

koit commented Apr 19, 2019

The current version of section 5.3.1, "Use of the METS root element (element mets)", states:

In addition to its attributes the METS root element mets MUST define all relevant namespaces and XML schema locations used in the package employing the @xmlns and @xsi:schemaLocation attributes.

When implementing and using XML schemas the physical location of any schemas needs to be considered accounting for potential unavailability of any resources required for validation that are hosted externally.

In case XML schemas have been included into the package (i.e. placed into the schemas folder) it is recommended to link to the schemas using the relative path of the schema file (i.e. schemas/mets.xsd).

These are concrete MUST rules, so they should have their own named requirements (and the lines above should be removed from the intro of Chapter 5.3.1.):

| ID | Name & Location | Description & usage | Cardinality & Level |
|---|---|---|---|
| CSIP7 | Namespace declarations<br>mets/@xmlns | All XML namespaces used in the METS.xml document MUST be declared in mets/@xmlns attributes. A valid CSIP METS.xml document needs at least the declarations for METS, CSIPExtensionMETS and XMLSchema-instance, and in most cases also xlink. | 3..n<br>MUST |
| CSIP8 | Schema locations<br>mets/@xsi:schemaLocation | The actual locations of XSD files for all used XML namespaces MUST be declared in the mets/@xsi:schemaLocation attribute.<br>The schema files are needed at the time of validation, so they should be stored at a location accessible throughout the lifetime of the IP, or included in the schemas folder. In the latter case, it is recommended to link to the schemas using the relative path (e.g. schemas/mets.xsd).<br>Note 1: It is assumed here that xsi has been declared as the prefix for XMLSchema-instance namespace, but this choice is not mandatory.<br>Note 2: The xsd file location for XMLSchema-instance does not have to be shown because of its special, built-in status. | 1..1<br>MUST |
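
To make the two proposed rules concrete, here is a minimal Python sketch of how a checker might test them. It is not part of the proposal or of any existing validator: the function name `check_mets_root`, the `REQUIRED` set and the sample document are made up for illustration, and the namespace URIs (in particular the CSIPExtensionMETS one) are assumptions that should be checked against the specification.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Assumed namespace URIs for the minimum set named in the proposed CSIP7.
REQUIRED = {
    "http://www.loc.gov/METS/",
    "https://DILCIS.eu/XML/METS/CSIPExtensionMETS",
    XSI,
}


def check_mets_root(xml_text):
    """Return (missing namespace declarations, namespaces without a schema location)."""
    parser = ET.XMLPullParser(events=("start-ns", "start"))
    parser.feed(xml_text)

    declared, root = set(), None
    for event, payload in parser.read_events():
        if event == "start-ns":
            # payload is a (prefix, uri) tuple; declarations on the root
            # element are reported before the root's own "start" event.
            declared.add(payload[1])
        else:
            root = payload  # first "start" event is the root element
            break
    if root is None:
        raise ValueError("no root element found")

    # xsi:schemaLocation holds whitespace-separated "namespace location" pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    located = set(tokens[0::2])

    missing_declarations = sorted(REQUIRED - declared)
    # Per Note 2, XMLSchema-instance itself needs no schema location.
    missing_locations = sorted(declared - located - {XSI})
    return missing_declarations, missing_locations


# Hypothetical root element: CSIPExtensionMETS is declared but has no location.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:csip="https://DILCIS.eu/XML/METS/CSIPExtensionMETS"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/METS/ schemas/mets.xsd"/>"""

print(check_mets_root(SAMPLE))
```

The sketch only looks at declarations on the root element and does not verify that the listed schema files actually exist or are reachable.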

Side effect: the numbers for current CSIP7 to CSIP112 will need to be incremented by 2.

Related to #228.

@koit koit added the enhancement Issues that are an enhancement needed to be evaluated and action decided label Apr 19, 2019
koit added a commit that referenced this issue Apr 19, 2019
Move xmlns and schemaLocation rules from chapter intro to named rules
CSIP7 and CSIP8. Issue #408.
@PhillipTommerholt
Contributor

I really like this solution to have the namespace and schema location requirements stand out along with the other requirements.

I am not sure about the renumbering of the other requirements though.
I think the requirement-IDs need to be stable.

@karinbredenberg
Contributor

We don't rename the IDs of the requirements at this stage.

@karinbredenberg
Contributor

karinbredenberg commented Apr 23, 2019

I'll look into this. But if we go down this route of adding the namespaces as requirements the whole XML-header also needs to be a requirement.

@carlwilson
Collaborator

carlwilson commented Apr 23, 2019

Just to say that I agree with @PhillipTommerholt & @karinbredenberg, we now need to resist the temptation to renumber requirements. Permanent IDs (and by extension URLs) for requirements are more important than sequentially numbered requirements. Once anything new arose we were always going to lose neat numbering.

@koit
Contributor Author

koit commented Apr 30, 2019

Sorry, I didn't know it was too late. My assumptions were:

  • stability of numbering is important
  • numbers can differ between versions (v.1 had no numbers, v.2 has some, v.3 may have others)
  • numbering for v.2.0 is not locked yet.

There will always be unpredictable additions, so we should come up with a long-term solution. We are wrestling with the classic problem of semantically loaded (or natural) IDs, which can be solved with surrogate (or synthetic) IDs, or with something in between. The semantic load in our case is the logical order of the rules.

I see three solutions:

  1. Natural IDs: Numbers have logical order, new requirements are inserted in the proper place, so the numbering will differ between versions;
  2. Semi-natural IDs: Numbers are grouped to chapters (e.g. CSIP1.1, CSIP1.2, CSIP1.3, CSIP2.1), new requirements are added to the end of the appropriate chapter. An ID once assigned stays the same permanently (unless the chapters are restructured);
  3. Surrogate IDs: IDs are completely meaningless. All IDs are permanent. A possible format is CSIP-xx, where x is [A-Z0-9] (e.g. CSIP-N2, CSIP-05, CSIP-AA, CSIP-Z9). Keeping it case-insensitive would give us a pool of 36^2 = 1296 IDs.
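
The pool size for option 3 can be checked with a short Python sketch. The helper name `surrogate_ids` and the simplified ASCII-only name pattern are made up for illustration, on the assumption that the IDs would also need to work as XML IDs; they are not part of the proposal.

```python
import itertools
import re
import string

ALPHABET = string.ascii_uppercase + string.digits  # [A-Z0-9], 36 symbols

# Simplified, ASCII-only approximation of an XML name without colons:
# starts with a letter or underscore, then letters, digits, '.', '-' or '_'.
XML_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")


def surrogate_ids(prefix="CSIP-", length=2):
    """Enumerate every ID of the form CSIP-xx over the alphabet above."""
    return [prefix + "".join(chars)
            for chars in itertools.product(ALPHABET, repeat=length)]


POOL = surrogate_ids()
print(len(POOL))                                  # 36 ** 2 = 1296
print(all(XML_NAME.match(i) for i in POOL))       # True: the prefix starts with a letter
```

Because the `CSIP-` prefix begins with a letter, suffixes such as `05` that start with a digit still give usable names under this simplified pattern.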

@karinbredenberg
Contributor

It's a decision made by the DILCIS Board. We have had decisions on this and will look up the result and post it here.

(The IDs are xml:IDs, which also restricts how they can be named.)

@PhillipTommerholt
Contributor

Nice sum up from the discussion we had in A3 last time, Koit 👍

@carlwilson
Collaborator

carlwilson commented May 1, 2019

Discussed both points with @karinbredenberg.

Regarding additional requirements. We're in danger of writing our own version of W3C XML requirements here. Would a better solution (or compromise depending on your POV) be to make explicit recommendations with references to authoritative documentation to address this issue?

Regarding requirement numbering: Under consideration, can see the strengths of keeping related requirements together. Time is against us here but that's got to be balanced with the reality that this is our only chance to make such a change. 👍 from me for changing but will depend on pragmatic reality.

@carlwilson carlwilson added this to the CSIP version 2.0 milestone May 1, 2019
@carlwilson
Collaborator

Do take into account the presence of CSIPExtensionMETS which isn't mandatory but IS required for a valid IP.

@karinbredenberg karinbredenberg added the v2-metsRoot Issues to be picked up under mets root review label May 8, 2019
@carlwilson
Collaborator

Am now wondering whether we should be explicit in the text that XML schema validation is part of any validation process. This would at least imply inclusion of the minimal schema set required for a "schema valid" XML document. This wouldn't necessarily require a new requirement, but some better explanatory text that highlights use of the main schema and vocab documents.

@karinbredenberg karinbredenberg added Solved? Have this issue been handled? Often used in conjunction with the label "help wanted" Proof Read Issues to be dealt with in a final proof read. labels May 10, 2019
@carlwilson carlwilson modified the milestones: CSIP version 2.0, CSIP v2.0.4 Apr 30, 2020
@carlwilson
Collaborator

carlwilson commented Jun 1, 2020

Section 5.3.1 is pretty explicit about namespacing and schema validation will take care of other parts. I'm not against adding specific requirements but only if they can be tested via schema/schematron. I'd suggest we bump this forward but with a specific aim of developing automated tests. If we can then we add the requirements, which won't break backward compatibility as the requirement currently exists, it's just not explicitly stated/enforced. A test may not be as easy to come up with as it appears: https://stackoverflow.com/questions/35467330/xpath-in-schematron-how-to-determine-if-an-xmlns-attribute-is-present-on-a-node
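
One reason such a test is harder than it looks: in the XPath data model, namespace declarations are namespace nodes rather than attributes, so an expression like `mets/@xmlns` does not select them. The sketch below is only an analogy using Python's standard `xml.etree.ElementTree`, not Schematron, and the sample root element is made up for illustration. It shows the same effect: the `xmlns` declarations are absent from the parsed element's attributes, while `xsi:schemaLocation` is an ordinary attribute.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical METS root element, for illustration only.
DOC = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/METS/ schemas/mets.xsd"/>"""

root = ET.fromstring(DOC)

# The default namespace is folded into the tag name.
print(root.tag)

# Only xsi:schemaLocation survives as an attribute; the two xmlns
# declarations are consumed by the parser and never appear here.
print(sorted(root.attrib))
```

So a schemaLocation rule can be tested with a plain attribute lookup, whereas a namespace declaration rule needs access to the declarations themselves, for example through parser-level namespace events.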

@karinbredenberg karinbredenberg removed this from the CSIP v2.0.4 milestone Jun 15, 2020
@carlwilson
Collaborator

This needs moving to the validator issue list as it's now about testing rather than the specification.

@carlwilson carlwilson added this to the CSIP Version 2.1 milestone Oct 5, 2021
4 participants