Purpose of CSIP structMap #426
Comments
After analysing it for a few days, the issue of A good example is referencing. mets.xsd prescribes referencing in only one direction and only at the individual file level: from Alternatives for Each has its strengths and weaknesses. Also, A and B would greatly benefit from structuring While making the choice we also need to consider:
The version 2 schedule leaves no time to properly consider all these aspects. So I propose we fix only the obvious mistakes and otherwise leave the current solution as it is. Soon after the release of v2.0 we should create a task force to develop a complete solution for structMap and fileSec. This should involve analysis of real-life IPs from different institutions and prototyping complete IPs for different alternative solutions.
What hasn't been handled is moved to the next milestone.
I feel that what hasn't been handled should be pushed to the next milestone, but that it should get serious consideration then. I think a response now might be rushed, as we're likely to need some good test cases to illustrate all of the issues. In general, I'm against repetition (I tend to regard all repetition as unnecessary) as it leads to internal inconsistency, i.e. chaos.
This needs to be pushed to the next major version update. It needs more discussion and investigation to see whether the concerns have already been handled and whether more rewording is needed.
The explanation of the purpose of structMap in mets.xsd and METSPrimer.pdf is clear and METSPrimer has some good examples of its use (see pages 62 and 65). The purpose of CSIP structMap is less clear and this makes it hard to contextualise the 33 requirements (and thus, to create a valid IP).
The intro text of 5.3.6. "Use of the METS structural map (element structMap)" states:
This can be summed up as: "The Purpose of CSIP structMap is to mirror the physical folder structure of the IP and if representations are present then point to the METS.xml files that describe them." This conclusion is mirrored by the examples:
But why is it necessary? The same info can be derived by parsing the fileSec/fileGrp/file and mdRef elements of all METS.xml files in the IP. Processing speed could be the added value here, as the structure can be read quickly from structMap, compared to the effort of reverse engineering the structure from the descriptions of individual files. However, there is duplication here, so there is a risk of conflicting descriptions.
There is also a rather softly posed requirement "Reference the fileGrp which describes all files in all folders /…/" to be used in the case of representations. It seems mandatory when representations are present, so it should be made an explicit SHOULD rule. Also, if it makes sense for representations, it is equally reasonable to have it for non-rep cases, too. There should be clear instructions on where and how to place the fileGrp references.
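To illustrate the point about deriving the structure without structMap, here is a minimal Python sketch. It assumes a heavily trimmed, hypothetical METS document (the file names, IDs and USE values are made up for illustration and are not taken from the CSIP examples), and the helper names `referenced_paths` and `folders` are my own. It collects the `xlink:href` values of `file/FLocat` and `mdRef` elements and rebuilds the implied folder list from them.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, trimmed METS content; not a valid CSIP package description.
SAMPLE_METS = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="dmd1">
    <mdRef LOCTYPE="URL" MDTYPE="DC" xlink:href="metadata/descriptive/dc.xml"/>
  </dmdSec>
  <fileSec>
    <fileGrp USE="Documentation">
      <file ID="f1">
        <FLocat LOCTYPE="URL" xlink:href="documentation/readme.txt"/>
      </file>
    </fileGrp>
    <fileGrp USE="Representations/rep1">
      <file ID="f2">
        <FLocat LOCTYPE="URL" xlink:href="representations/rep1/METS.xml"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>"""


def referenced_paths(mets_xml):
    """Return the sorted hrefs of all file/FLocat and mdRef elements."""
    root = ET.fromstring(mets_xml)
    hrefs = []
    for el in root.iterfind(".//mets:file/mets:FLocat", NS):
        hrefs.append(el.get(XLINK_HREF))
    for el in root.iterfind(".//mets:mdRef", NS):
        hrefs.append(el.get(XLINK_HREF))
    # Skip elements that carry no xlink:href attribute.
    return sorted(h for h in hrefs if h)


def folders(paths):
    """Return the sorted set of folders implied by the relative file paths."""
    found = set()
    for p in paths:
        for parent in PurePosixPath(p).parents:
            if str(parent) != ".":
                found.add(str(parent))
    return sorted(found)


if __name__ == "__main__":
    paths = referenced_paths(SAMPLE_METS)
    print(paths)
    print(folders(paths))
```

A real IP would need this repeated for every representation METS.xml the root file points to, which is the "reverse engineering" effort mentioned above. The same derived folder list could also be compared against the structMap divs to detect conflicting descriptions.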
On another thought, shouldn't the purpose of CSIP structMap be to describe the conceptual, rather than the physical structure of the package? As the folder structure is not mandatory any more, we might see folder structures like this:
In such case, the content files could be randomly spread into the data folders, e.g. to make the data folders fit some size limit. Or grouped by file name such as is often done in uuid-named or sequentially named file and folder structures, e.g. the actual structure of MS Outlook 2011 for Mac (the letters seem to mean Trillion, Billion, Million, Kilo, each folder containing up to 1000 items):
A conceptual CSIP structMap would make a lot of sense in such cases.
Anyway, no matter what the purpose of CSIP structMap, it should be stated clearly, supported by the requirements and illustrated with intuitive examples.
I know it sounds like useless theorising, but structure-related requirements are currently not clear (we experienced this when creating minimal valid IPs, see DILCISBoard/eark-ip-test-corpus#211 and DILCISBoard/eark-ip-test-corpus#212), and I've got a hunch that the lack of clarity might stem from unclear purpose statements.