ID conventions for OSCAL document #221

Closed
anweiss opened this issue Jul 18, 2018 · 17 comments
Assignees
Labels
LoE: Medium Scope: Modeling Issues targeted at development of OSCAL formats User Story

Comments

@anweiss
Contributor

anweiss commented Jul 18, 2018

User Story:

As an OSCAL content creator and/or user, I can refer to a set of clear and concise conventions for the use of the id attribute in an OSCAL document.

Goals:

The only constraint on the id attribute in an OSCAL document today is that it conforms to the xsd:ID type, meaning that it must start with a letter or underscore and can contain only letters, digits, underscores, hyphens, and periods (http://www.datypic.com/sc/xsd/t-xsd_ID.html). Beyond this, we should also publish best practices and conventions for the use of the id attribute. Questions one might ask:

  • How does one enforce ID uniqueness between OSCAL documents?
  • Are there any reserved prefixes that one should take into account? ... e.g. <catalog id="usnistgov_[catalog_name]">
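To make the two questions above concrete, here is a minimal sketch, not an OSCAL tool. It assumes an ASCII-only approximation of the xsd:ID lexical rule described above (the real type also admits many non-ASCII letters), and the function names `is_valid_id` and `shared_ids` are hypothetical.

```python
import re

# ASCII-only approximation of the xsd:ID lexical rule: a letter or underscore,
# followed by letters, digits, underscores, hyphens, or periods.
# Assumption: non-ASCII name characters are deliberately ignored here.
ASCII_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_id(value: str) -> bool:
    """Return True if value matches the ASCII subset of the xsd:ID syntax."""
    return ASCII_ID.fullmatch(value) is not None


def shared_ids(ids_a, ids_b):
    """Return, sorted, the IDs that occur in both collections."""
    return sorted(set(ids_a) & set(ids_b))


if __name__ == "__main__":
    print(is_valid_id("usnistgov_catalog"))  # starts with a letter
    print(is_valid_id("1abc"))               # starts with a digit
    print(shared_ids(["ac-1", "ac-2"], ["ac-2", "au-1"]))
```

A check like `shared_ids` only detects collisions between two documents after the fact; preventing them would still need a convention such as the reserved prefixes asked about above.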

Dependencies:

This is somewhat related to #39, although that issue pertains to the use of the id attribute in controls rather than the use of the id attribute in a complete OSCAL document. The issue also depends on #218.

Acceptance Criteria

The use of the id attribute in an OSCAL document has well-defined conventions associated with it.

@wendellpiez
Contributor

I will produce a stylesheet to help us scan and validate ID (values and format) against their components (document position, label, and type - which may or may not always align). This will give us some leverage for designing a robust ID assignment protocol, which will reflect all the requirements.

Final resolution of this issue may require some emendation/"correction" of the source data; see #218.

Any rule we make must be able to produce consistent IDs for components that need them but do not now have them.

@wendellpiez
Contributor

NB: @anweiss also asks important questions about whether these IDs can be trusted to have universal scope, not only document scope. To support this, we need at least to be able to enforce robust identifiers at the top level as well.

@iMichaela
Contributor

Status Meeting: 8/23/2018

@wendellpiez , @joshualubell and @iMichaela will meet on Tuesday to discuss this issue.

@wendellpiez
Contributor

Sprint 13 Progress Aug 30 2018

Meeting with @iMichaela and @joshualubell Tue Aug 28, we confirmed a strategy:

  • Source data (SP800-53A NVD XML source representing Appendix A "Objectives") would be enhanced with a pipeline step providing missing structure (cf Resolving possible anomalies in SP800-53 catalog #218)
  • Labels (names) of objectives will be refactored (rewritten) to reflect implicit structure and relations to other elements (statement items)
  • IDs will then be produced from labels and validated against analogous implicit (structural) IDs w/in doc scope
  • (Side effect: confirming structural integrity of labels on low level items as given in SP800-53rev4.)
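The "IDs produced from labels" step can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline actually used: the label shape (`AU-10`, `AU-10(2)`), the `_smt` infix for statement items, and the function names `control_id` and `statement_id` are all hypothetical, chosen so that the result has the same shape as ID values quoted later in this thread.

```python
import re

# Assumed label shape: two-letter family, number, optional enhancement in
# parentheses, e.g. "AU-10" or "AU-10(2)". Real labels may differ.
LABEL = re.compile(r"([A-Za-z]{2})-(\d+)(?:\s*\((\d+)\))?")


def control_id(label: str) -> str:
    """Derive a lowercase control ID from a label such as 'AU-10(2)'."""
    match = LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized control label: {label!r}")
    family, number, enhancement = match.groups()
    base = f"{family.lower()}-{number}"
    return f"{base}.{enhancement}" if enhancement else base


def statement_id(label: str, item_path=()) -> str:
    """Derive a statement-item ID from a control label and item labels.

    Item labels such as 'a.' or '(1)' are stripped of punctuation and
    appended as dot-separated segments.
    """
    parts = [control_id(label) + "_smt"]
    parts.extend(item.strip("().").lower() for item in item_path)
    return ".".join(parts)


if __name__ == "__main__":
    print(statement_id("AU-10(2)", ["a."]))
    print(statement_id("AC-1", ["a.", "1."]))
```

Because the ID is computed only from the label, any inconsistency in the source labels carries straight through, which is why validating the result against position-derived IDs is a separate step.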

Following the meeting I made progress yesterday on implementation. It is 50% there.

Progress towards goal (putting this issue to bed for this data set): 70% (including work of analysis/spec)

@wendellpiez
Contributor

wendellpiez commented Aug 31, 2018

In my repo there is now a file implementing the ID protocol.

It also has links from Objectives to corresponding items in the control statement.

https://github.com/wendellpiez/OSCAL/blob/WIP-sp800-53-improvement/content/nist.gov/SP800-53/rev4/NIST_SP-800-53_rev4_catalog.xml

FWIW these IDs are extracted by mapping from given labels, so GIGO, meaning if the link integrity holds up that is all credit to the file originators. (Comparing these identifiers with systematically generated IDs is another thing.)

Unfortunately the file is largish, which is beginning to be a concern even to me (even if it compresses fairly well).

Next up: find and correct remaining glitches with more comprehensive validation of the values expected and (re)presented; survey to see where IDs are still missing.

@wendellpiez
Contributor

The file at the link above has just been committed again. I am continuing to review and to develop tooling to support review.

Note in particular in objectives there are now link elements with pointers to corresponding parts of statements.
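Link integrity of this kind can be spot-checked with a few lines of code. The sketch below is an assumption-laden illustration, not the tooling mentioned above: it assumes pointers are `link` elements whose `href` attribute holds a `#`-prefixed ID, and the function name `dangling_links` and the `part` elements in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def dangling_links(xml_text: str):
    """Return the href values of link elements that point at no known id."""
    root = ET.fromstring(xml_text)
    # Collect every id attribute value present in the document.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing = []
    for el in root.iter():
        # Compare local names so the check works with or without a namespace.
        if el.tag.rsplit("}", 1)[-1] != "link":
            continue
        href = el.get("href", "")
        # Only same-document pointers ("#some-id") are checked here.
        if href.startswith("#") and href[1:] not in ids:
            missing.append(href)
    return missing


if __name__ == "__main__":
    sample = (
        '<catalog>'
        '<part id="au-10.2_smt.a"/>'
        '<part id="au-10.2_obj">'
        '<link href="#au-10.2_smt.a"/>'
        '<link href="#au-10.2_smt.z"/>'
        '</part>'
        '</catalog>'
    )
    print(dangling_links(sample))
```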

@wendellpiez
Contributor

Oops, there is still at least one glitch: see e.g. the id value au-10_smt-.2.a. Please post any other glitches or issues you see, @iMichaela @brianrufgsa, while I repair this (and continue testing).

@wendellpiez
Contributor

wendellpiez commented Sep 4, 2018

At least that has now been corrected to au-10.2_smt.a. More review is probably in order (and I will test some more as well).
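It is worth noting that a value like au-10_smt-.2.a still satisfies the lexical rules of xsd:ID, so schema validation alone would not flag it; a convention-specific check is needed. A minimal sketch of such a check follows. The pattern is an assumption inferred only from the two values quoted in this thread, and the name `looks_like_statement_id` is hypothetical.

```python
import re

# Assumed shape: <family>-<number>[.<enhancement>]_smt[.<item>...]
# Inferred from "au-10.2_smt.a"; the real convention may allow more forms.
STATEMENT_ID = re.compile(r"[a-z]{2}-\d+(?:\.\d+)?_smt(?:\.[a-z0-9]+)*")


def looks_like_statement_id(value: str) -> bool:
    """Return True if value follows the assumed statement-ID shape."""
    return STATEMENT_ID.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("au-10.2_smt.a", "au-10_smt-.2.a"):
        print(candidate, looks_like_statement_id(candidate))
```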

@wendellpiez
Contributor

More review has been performed and more issues found. Please don't merge just yet. :-)

@wendellpiez
Contributor

Okay to consider the latest commit (again) of the file cited above. The data set is passing more and more stringent Schematron tests. (Next will be to write up the specification of the rules being enforced.)

@wendellpiez
Contributor

wendellpiez commented Sep 5, 2018

Sprint 13 Progress Report Sep 5 2018

Much progress including not only IDs and labels but a few other improvements found along the way. Additionally, a Schematron is now given which confirms the structured numbering.

I am hoping this is 90% done now. It requires review. Work on the readme is also proceeding.

@david-waltermire
Contributor

9/6/2018 Status Meeting

This is addressed in PR #229.

@wendellpiez
Contributor

Sprint 13 Progress September 27 2018

It becomes clear that there are really two issues here:

  1. Validation of ID design and usage in SP800-53 w/ perhaps documentation
  2. Documenting ID usage and semantics in OSCAL irrespective of SP800-53

Since #229 has been merged, the first of these can and should proceed, assuming no outstanding PRs touch the SP800-53 catalog and profiles. (The user community should also be helpful here.)

The second of these depends on Issue #196.

@wendellpiez
Contributor

A file documenting ID conventions observed in the SP800-53 catalog is now behind PR #237.

https://github.com/wendellpiez/OSCAL/blob/feature-metaschema/content/nist.gov/SP800-53/rev4/conversion-notes.md

Please take a look @anweiss @iMichaela thanks!

@wendellpiez
Contributor

A file describing the catalog including notes on conventions followed in ID assignment is now located here:

https://github.com/usnistgov/OSCAL/blob/master/content/nist.gov/SP800-53/rev4/conversion-notes.md

@wendellpiez
Contributor

Since PR #247, this file is now in master. As @anweiss has noted, documentation in general is in need of refactoring/reorganization -- during which docs now scattered in the file system (like this one) could be brought together. (User Story needed.)

@bsilberberg

Sorry, I didn't see this thread before I posted my ticket, #248.


6 participants