`InvestigativeAction`s should be required to produce at least one `ProvenanceRecord` #146

ajnelson-nist · 2024-01-23T20:26:44Z

Background

Discussion on CASE Issue 136 suggests that an InvestigativeAction should always result in the creation of at least one ProvenanceRecord.

Requirements

Requirement 1

CASE should enforce that an InvestigativeAction results in at least one ProvenanceRecord.

As an implementation note, this would be done with a qualified SHACL constraint.

Edited 2024-02-15: "Must" relaxed to "should".

Requirement 2

CASE should describe in a mechanically discoverable way that an InvestigativeAction is expected to always result in at least one ProvenanceRecord.

As an implementation note, this would be done with a qualified minimum cardinality in an OWL Restriction.

Risk / Benefit analysis

Benefits

1. Requiring a ProvenanceRecord always be generated induces a chain of custody tie in forensic processing for resultant objects of InvestigativeActions.
2. Reintroduction of OWL constructs will assist with OWL-specific review mechanisms that do not appear to be possible in SHACL, such as set-satisfiability (e.g. determining through set-theoretic analysis whether a class or restriction has accidentally ended up equating to the empty set, rendering usage conformant with the specification impossible).
    1. This is acknowledged to be a broader issue than this one proposal. However, a minimum cardinality restriction appears to the submitter to be a "safe" reintroduction in terms of complexity.

Risks

1. Existing SHACL shapes require a ProvenanceRecord always have one member UcoObject. Thus, this proposal would induce a significant requirement on InvestigativeActions: They must always result in something aside from the ProvenanceRecord.
    1. Note that an object being a result of an action does not necessarily imply that the object was created by the action. This stemmed from discussion on UCO Issue 558.
    2. It is possible the definition of ProvenanceRecord is too stringent. It is somewhat a separate concern that there might exist a class of InvestigativeActions that truly have no results. Perhaps: "This action found all files within this directory. There were none."
    3. NOTE: Risk 1 mitigated with resolution of UCO Issue 599. ProvenanceRecords may now be empty.
2. Some Actions might be desired to be defined in a manner that attempts to restrict the results to a specific class, e.g., IP addresses. If such an action-class were introduced, it could never be an InvestigativeAction, because an InvestigativeAction would be required to include a ProvenanceRecord among its results. Hence, this proposal would end up inducing an upstream design constraint on UCO: action:result can never be constrained with owl:allValuesFrom, because UCO doesn't "know" about case-investigation:ProvenanceRecord.
3. This proposal does not specify whether there must only be one ProvenanceRecord among the results. This is an inconclusive point from the discussion on CASE Issue 136, and could be affected depending on whether the committee decides a subaction's ProvenanceRecord should also be recorded in the parent action's results.
4. This proposal suggests restoring OWL practices, starting with a description of at least one of the outputs for any InvestigativeAction. CASE and UCO previously abandoned OWL in UCO 0.7.0 / CASE 0.5.0. This proposal starts a disciplined reintroduction of OWL constructs, testing with the UCO-OWL syntax review mechanisms.
    1. UCO Change Proposal 23 housed discussion, though it appears that document was not exported from the access-controlled UCO Confluence space. (I don't think there is a reason it wasn't, aside from document exports only becoming a mandated part of the proposal process in later releases.)
    2. A test focused on the syntax used will be added in a separate proposal to UCO.
5. Due to needing SHACL qualified shapes, the CASE testing infrastructure also needs to require pySHACL >= 0.24.0, which incorporates a resolution to pySHACL Issue 213.
6. (Added 2024-02-15.) In information sharing situations, some data might be restricted from being shared or alluded to, e.g., from legally imposed redactions. If Org1 shares part of a graph with Org2, and includes some InvestigativeAction for, say, its timing and tool-use relevance, but doesn't share the identifier for the generated ProvenanceRecord, the shared data should by itself still be conformant to UCO, and should not impose UCO validation errors when folded into the receiving organization's knowledge base.
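
The upstream design constraint described in the risk about `owl:allValuesFrom` can be sketched with a hypothetical action class. The `ex:` namespace and the class name `ex:IPAddressDiscoveryAction` are assumptions made for illustration only; neither is proposed for UCO or CASE.

```turtle
@prefix ex: <http://example.org/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .

# Hypothetical class: every result of this action must be an IP address.
ex:IPAddressDiscoveryAction
	a owl:Class ;
	rdfs:subClassOf
		uco-action:Action ,
		[
			a owl:Restriction ;
			owl:onProperty uco-action:result ;
			owl:allValuesFrom uco-observable:IPAddress ;
		]
		;
	.
```

An individual typed as both this class and InvestigativeAction would need every result, including the required ProvenanceRecord, to be an IP address. That is the conflict the risk describes.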

Competencies demonstrated

Competencies are omitted from this proposal, as the effects are new restrictions on data, and hence do not enable new expressive abilities.

Solution suggestion

For CASE 1.x.0, add the following to investigation.ttl:

investigation:InvestigativeAction
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty uco-action:result ;
		owl:onClass investigation:ProvenanceRecord ;
		owl:minQualifiedCardinality "1"^^xsd:nonNegativeInteger ;
	] ;
	sh:property [
		sh:message "An InvestigativeAction should have a ProvenanceRecord among its results.  This will be a requirement in CASE 2.0.0."@en ;
		sh:path uco-action:result ;
		sh:qualifiedMinCount "1"^^xsd:integer ;
		sh:qualifiedValueShape [
			a sh:NodeShape ;
			sh:class investigation:ProvenanceRecord ;
		] ;
		sh:severity sh:Warning ;
	] ;
	.

For CASE 2.0.0, remove the sh:message and sh:severity triples from the added sh:PropertyShape.
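
As an illustration of what the property shape checks, a minimal instance-data sketch follows. The `kb:` namespace and all node names are hypothetical, and the prefixes are assumed to be bound to the published UCO and CASE namespaces (the snippet above abbreviates the CASE one as `investigation:`).

```turtle
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix kb: <http://example.org/kb/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .

# Satisfies the qualified minimum count: one result is typed as a ProvenanceRecord.
kb:investigative-action-1
	a case-investigation:InvestigativeAction ;
	uco-action:result
		kb:file-1 ,
		kb:provenance-record-1
		;
	.

kb:provenance-record-1
	a case-investigation:ProvenanceRecord ;
	uco-core:object kb:file-1 ;
	.

kb:file-1
	a uco-observable:File ;
	.

# Expected to draw the warning: no result is typed as a ProvenanceRecord.
kb:investigative-action-2
	a case-investigation:InvestigativeAction ;
	uco-action:result kb:file-2 ;
	.

kb:file-2
	a uco-observable:File ;
	.
```

Because the qualified value shape uses `sh:class`, the ProvenanceRecord's type triple has to be present in the validated graph for the result to count.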

Coordination

Tracking in Jira ticket ONT-493

Administrative review completed, proposal announced to Ontology Committees (OCs) on Jan. 26, 2024
Requirements to be discussed in OC meeting, date Feb. 15, 2024
Risk 1 addressed - InvestigativeActions that have no non-ProvenanceRecord results confirmed supportable.
Requirements to be discussed in OC meeting, date TBD.

Requirements Review vote has not occurred

Requirements development phase completed.
Solution announced to OCs on TODO-date
Solutions Approval to be discussed in OC meeting, date TBD
Solutions Approval vote has not occurred

Solutions development phase completed.
Backwards-compatible implementation merged into develop for the next release
develop state with backwards-compatible implementation merged into develop-2.0.0
Backwards-incompatible implementation merged into develop-2.0.0 (or N/A)
Milestone linked
Documentation logged in pending release page
Prerelease publication: CASE develop branch updated to track UCO's updated develop branch
Prerelease publication: CASE develop-2.0.0 branch updated to track UCO's updated develop-2.0.0 branch


This new shape stemmed from discussion on CASE Issue 136.

As a matter of preserving backwards compatibility, this patch introduces the shape requiring `ProvenanceRecord`s with a `sh:Warning`-level severity. In CASE 2.0.0, this requirement will be strengthened into a `sh:Violation`.

A separate proposal will be filed with UCO to test the minimum qualified cardinality OWL structure. A draft of that syntax review system was used to test this patch.

This patch adds a version floor for pySHACL to ensure an update in qualified value shape handling is included, which is necessary for the new property shape to function when using pySHACL.

Disclaimer:

References:
* RDFLib/pySHACL#213
* #136
* #146

Signed-off-by: Alex Nelson <[email protected]>

sbarnum · 2024-02-15T07:26:14Z

While I agree with this proposal in intended spirit I do not feel it is viable due to Risk 1 and Risk 2 above.

I do not believe either of these risks can be ignored in favor of the intent of this proposal.

I believe that Risk 2 is real and could have a significant impact if ignored.
I believe that Risk 1 is very real and WILL have a critical impact if ignored. There are certainly investigative actions that could have no result.

We can say that an InvestigativeAction SHOULD have at least one ProvenanceRecord but we cannot say MUST.

ajnelson-nist · 2024-02-15T14:39:19Z

> While I agree with this proposal in intended spirit I do not feel it is viable due to Risk 1 and Risk 2 above.
>
> I do not believe either of these risks can be ignored in favor of the intent of this proposal.
>
> I believe that Risk 2 is real and could have a significant impact if ignored. I believe that Risk 1 is very real and WILL have a critical impact if ignored. There are certainly investigative actions that could have no result.
>
> We can say that an InvestigativeAction SHOULD have at least one ProvenanceRecord but we cannot say MUST.

More on Risk 1:

I'm more inclined to review and revise that minimum-count 1 SHACL rule on ContextualCompilation. This is not the first place that has caused an issue: the experimental extension ontology in CASE-Corpora is trying an alignment between DCAT-US (in short, a model for datasets) and CASE+UCO. Some things under DCAT-US looked like philosophic kindreds to ContextualCompilation, but would at times be appropriately empty (e.g., datasets with distribution files, but not publicly available distribution files). The sh:minCount 1 rule inherited from ContextualCompilation calls that a data error. So there is some subclassing in that repository that feels ...contortive.

I'm glad you think it is appropriate to represent investigative actions that have no non-provenance-record results. I think it's a little strange-feeling to have a provenance record with no members as the sole result of an investigative action, but it isn't necessarily wrong. For instance, it could be a further sanity check downstream in CASE analysis if that "empty" provenance record were used by a later investigative action and nothing in the (empty) provenance record was also an input to that same investigative action. (This is inching out of scope of this proposal, but my gut's saying that's a sanity check I would be grateful to have; it sounds like it would catch copy-paste errors stemming from copying the wrong thing.)
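
A minimal sketch of that arrangement, with a member-less provenance record as the sole result and a later action taking it as input. All `kb:` names are hypothetical, and whether the empty record validates depends on the minimum-count rule on ContextualCompilation in the UCO version in use.

```turtle
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix kb: <http://example.org/kb/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-core: <https://ontology.unifiedcyberontology.org/uco/core/> .

# A search that found nothing: the only result is the provenance record.
kb:investigative-action-3
	a case-investigation:InvestigativeAction ;
	uco-core:description "Found all files within this directory. There were none." ;
	uco-action:result kb:provenance-record-3 ;
	.

# No uco-core:object triples, so the record has no members.
kb:provenance-record-3
	a case-investigation:ProvenanceRecord ;
	.

# A later action takes the empty record as input and nothing else.
# This is the pattern the suggested sanity check would flag for review.
kb:investigative-action-4
	a case-investigation:InvestigativeAction ;
	uco-action:object kb:provenance-record-3 ;
	uco-action:result kb:provenance-record-4 ;
	.

kb:provenance-record-4
	a case-investigation:ProvenanceRecord ;
	.
```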

I think Risk 1 is solely from ContextualCompilation having used SHACL for its minimum member count description instead of OWL. A SHACL minimum-1 count, anywhere, induces validation failures for incomplete information, so it is a construct that must be used sparingly. Should a UCO graph fail validation because it named a set (ContextualCompilation) but said nothing of its members? This is a bigger question for data sharing, which I'm noting here because this might be another risk specific to this proposal. Here's an example:

If Org1 shares part of a graph with Org2, and includes some InvestigativeAction for, say, its timing and tool-use relevance, but doesn't share the identifier for the generated ProvenanceRecord, should that shared data fail validation?
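
The shared excerpt in that scenario might look like the following sketch. The `kb:` names and the timestamps are invented for illustration. The excerpt carries no `uco-action:result` triple at all, so the qualified minimum count from this proposal would be reported against it.

```turtle
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix kb: <http://example.org/kb/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-tool: <https://ontology.unifiedcyberontology.org/uco/tool/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# What Org1 shares with Org2: timing and tool use only.
kb:investigative-action-5
	a case-investigation:InvestigativeAction ;
	uco-action:startTime "2024-01-23T20:00:00Z"^^xsd:dateTime ;
	uco-action:endTime "2024-01-23T20:30:00Z"^^xsd:dateTime ;
	uco-action:instrument kb:tool-1 ;
	.

kb:tool-1
	a uco-tool:Tool ;
	.

# Withheld by Org1: the uco-action:result link to the ProvenanceRecord.
```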

After discussion on this morning's call, it is likely that that spelling change for ContextualCompilation will be proposed.

ajnelson-nist · 2024-02-15T20:40:19Z

From discussion on this morning's call, we felt the risks (including the one realized just prior to the call on information sharing) left us uncertain the requirements are sufficiently captured. We will return to this after proposing at least one upstream matter on UCO to address Risk 1.

ajnelson-nist · 2024-02-15T20:46:06Z

The proposal has received some revisions (accompanied by string "2024-02-15"), and an extra step in its coordination checklist.

ajnelson-nist · 2024-05-03T20:43:47Z

Risk 1 has been addressed with the resolution of UCO Issue 599.

ajnelson-nist linked a pull request Jan 23, 2024 that will close this issue

Confirm an InvestigativeAction results in at least one ProvenanceRecord #147

Draft

12 tasks

ajnelson-nist added this to the CASE 1.x.0 milestone Jan 23, 2024

ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Jan 26, 2024

Bump unstable pointers

a7a0cc2

A follow-on patch will regenerate Make-managed files. References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Jan 26, 2024

Regenerate Make-managed files

a937433

References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Jan 26, 2024

Bump unstable pointers

eebde7f

A follow-on patch will regenerate Make-managed files. References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Jan 26, 2024

Regenerate Make-managed files

a381d76

References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Jan 26, 2024

Add missed ProvenanceRecord

0907ba1

A follow-on patch will regenerate Make-managed files. References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Jan 26, 2024

Regenerate Make-managed files

32952ff

References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Jan 26, 2024

Bump unstable pointers

eead15c

A follow-on patch will regenerate Make-managed files. References: * casework/CASE#146 Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist mentioned this issue Feb 9, 2024

UCO's OWL syntax review should test cardinality restrictions ucoProject/UCO#591

Open

15 tasks
