Dataset Versioning in Catalogs #961
We discussed this today in the model meeting and a few pieces of feedback or other impressions on this work came up.
@ohsh6o Do you have an update to this proposal based on your notes above?
I believe the consensus from the last model meeting was to keep as close to the current v1.0.0 models as possible (with a …
@ohsh6o Can you create a concrete change proposal listing each property to add, with a corresponding definition for each? We talked about organization name (or a party reference). How do we point to the …? How do we handle multiple source datasets? Perhaps by multiple back-matter references? @ohsh6o will draft an updated proposal that addresses this; @david-waltermire-nist will assist.
So following up on this comment in anticipation of tomorrow's meeting, I am going to recommend:
This recommendation would allow for 0 to ∞ `dataset` props on a back-matter resource, for example:

```xml
<resource uuid="example-uuid">
  <prop name="dataset" class="collection" value="Special Publication"/>
  <prop name="dataset" class="name" value="800-53"/>
  <prop name="dataset" class="version" value="5"/>
  <prop name="dataset" class="organization" value="gov.nist.csrc"/>
</resource>
```

I have prepared some example code and a presentation for tomorrow.
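To illustrate how tooling might consume these props, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names follow the example above; note this is a hypothetical, non-namespaced fragment, whereas real OSCAL XML lives in the `http://csrc.nist.gov/ns/oscal/1.0` namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource fragment mirroring the proposal above
# (namespace omitted to keep the sketch short).
SNIPPET = """
<resource uuid="example-uuid">
  <prop name="dataset" class="collection" value="Special Publication"/>
  <prop name="dataset" class="name" value="800-53"/>
  <prop name="dataset" class="version" value="5"/>
  <prop name="dataset" class="organization" value="gov.nist.csrc"/>
</resource>
"""

def dataset_props(resource_xml: str) -> dict:
    """Collect name="dataset" props into a {class: value} mapping."""
    root = ET.fromstring(resource_xml)
    return {
        prop.get("class"): prop.get("value")
        for prop in root.findall("prop")
        if prop.get("name") == "dataset"
    }

print(dataset_props(SNIPPET))
```

A pipeline could then match catalogs on the `(organization, name, version)` triple instead of guessing from file names.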
Very interested in how this will figure into the profile resolution updates in #954. It appears this pertains to the Second-Order Question and Goal in this issue. @david-waltermire-nist and @wendellpiez, can we make time in the coming week to discuss feedback on this proposal? I went on leave around the time Dave got back, and I was not sure when we could pick up the technical feedback from your end and work towards realizing this proposal in an implementation.
@ohsh6o let's not confuse metadata describing an entity "in the world" (such as a person, place, thing, or document) such as "Rev 5 of SP 800-53 as published by NIST", with requirements for traceability in the stricter sense: that when an OSCAL catalog is inspected, it can be seen (in applicable cases) to reference a 'document' (or 'serialized instance') somewhere else that "turns out" to be a profile that produces that catalog. Over and above this, whatever metadata you choose to put into either your catalog(s) or your profile(s), as OSCAL instances, is perfectly fine. But such metadata addresses a different set of requirements (even if still a requirement for 'traceability' in a broader sense). So a FedRAMP profile might well have to say "I am based on Pub X" (with a link), and also, you might want a catalog produced by that profile to be able to point back to the profile, just for traceability/validability.
Hence I called it a second order question. :-)
More generally, why do I keep pushing for this? I would like the second-order question addressed so we can have graphs of how people derive catalogs and profiles from one another, just like GitHub. And then once I have that, I want to be able to filter and collect all those that are notionally based on the same dataset: give me all the graphs of separate, unrelated catalogs that believe they are notionally derived from 800-53, OSCAL or not. I think advancing this requires both the dataset props and the provenance linkage so people can build tools this way, even if the catalogs are not distinctly related to each other. I envision tools for analyzing public catalogs this way; internal tooling that is more scoped will have the same need as well. But to be clear, I added a comment to ask what deficiencies there are in the proposal of the dataset properties and how to move that forward, not to discuss the implications of the relationship between #961 and #954.
@ohsh6o given what you are saying you would like to accomplish or enable, I think the requirements here are actually pretty open-ended. Especially since I also think there are other approaches to assessing and understanding provenance (actual, purported, assumed, or inferred) than those that accept claims made in the metadata at face value (however useful that info might be). Given this, as usual I am inclined toward a minimalistic approach. So the question 'what have I left out' may not be all that useful. The question should be 'do I have what I need for now'.
Moving to Sprint 61.
FedRAMP PMO expressed interest in a feature similar to this. Since this is in the sprint, I will discuss expectations and timelines with them at the next sync meeting, as this is a smallish change.
One shortcoming noticed in FedRAMP OSCAL usage is that neither … So some other manner of association will be necessary.
Gary, this makes sense. The open-endedness of the requirement is not (really) a reason not to do it. What would be best, or possibly some combination? Let's limit it to the SP 800-53 set of catalogs. There are various ways this could be done: tagging to GitHub; tagging to the UUID of the referenced source catalog (too brittle?); tagging to certain metadata found in the source catalog; tagging to a nominal version (a "best available rev 5" kind of thing). Just to name a few.

Also, it occurs to me these data points are useful in at least two different ways: one, for nominal traceability; two, as a statement of intention (as to how a profile should be used). Are these the same and could they be collapsed, or do we need both?

Finally - should this be an OSCAL thing, or is it really a FedRAMP problem to solve using …? In a consuming organization, this could be done either at the boundary or internally. Again, it depends on what need is being met. There might be features available on both sides of the fence.
@aj-stein-nist since you have this one on the priority list, I might have a little additional information to share that relates (loosely), but might benefit from a common approach across models. We should chat sometime.
Can you set aside 15-30 minutes for us to meet and discuss next week during the sprint? Thanks!
At the 11/30 Triage Meeting: the team decided that this ticket is closable. We will leave the ticket open for 1 week (until 12/7/23) to hear any objections or comments. |
User Story:
As an OSCAL developer, I want to explicitly state which dataset (e.g. NIST 800-53 or ISO-27001) and which version of that dataset (e.g. 800-53 Revision 4.0 versus 5.1; ISO/IEC 27001:2013 versus ISO/IEC 27001:2018) is the source of a catalog or resolved profile catalog, without relying on human interpretation of semantic context or on externalized file and directory naming.
Goals:
When processing catalogs in software, especially beyond the pre-existing oscal-content resources and the SP 800-53 baselines, ETL pipelines cannot rely on explicit file and directory names. When introspecting the content of an OSCAL catalog or resolved profile catalog, the closest one can get to determining that the catalog's content is "800-53 Revision 5, version 5.1, from NIST specifically" is to read file names or free-form `title` text values. This is achievable, but runs counter to the objectives of OSCAL as structured, machine-readable content.

First-Order Question and Goal
First-order problem: within a given document, how do we determine the origin (provenance) of a referenced source document and that source document's version?
`//metadata/version`
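To make the first-order gap concrete, here is a short sketch reading `//metadata/version` from a stripped-down, hypothetical catalog header (real OSCAL content is namespaced under `http://csrc.nist.gov/ns/oscal/1.0`; the title text below is illustrative, not an actual oscal-content value):

```python
import xml.etree.ElementTree as ET

# Hypothetical, non-namespaced stand-in for an OSCAL catalog header.
CATALOG = """
<catalog uuid="11111111-2222-3333-4444-555555555555">
  <metadata>
    <title>NIST SP 800-53 Rev 5 Controls</title>
    <version>5.1</version>
  </metadata>
</catalog>
"""

root = ET.fromstring(CATALOG)
version = root.findtext("./metadata/version")
title = root.findtext("./metadata/title")

# //metadata/version identifies only this document's own revision;
# recovering "this is derived from SP 800-53" still requires parsing
# the free-form title text, which is the gap the dataset props close.
print(version, "-", title)
```

The point of the sketch: `version` alone says nothing about which dataset the document derives from, so a machine-readable dataset identifier has to live somewhere else.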
Second-Order Question and Goal
Second-order question: in a resolved profile catalog, how do I know the provenance of the profile from which it was resolved?
Dependencies:
Acceptance Criteria
{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}