Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add runtime trace predicate type #111

Merged
merged 1 commit into from
Mar 17, 2023

Conversation

adityasaky
Copy link
Member

This is an early draft of a runtime trace predicate type. We've noted some subsections / fields that need more input.

cc @pxp928 @nadgowdas @jedsalazar

},
"monitorLog": {
"process": [
{ /* object */ }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define "object" better here? Normalizing output from various trace tools will be helpful in discoverability.

I have some rough ideas of that could look like in this branch. https://github.com/testifysec/go-witness/blob/tetragon/attestation/commandrun/commandrun.go

Copy link
Member

@pxp928 pxp928 Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Cole,

yes we have defined it in the demo example as the following. Would be great to get feedback :

          "eventType": "kprobe",
          "functionName": “__x64_sys_write/fd_install”,
          “processBinary” : “/bin/cat”
          “exitCode”: int
          "arguments": [
               "-wait_file-post_file ",
               "/tekton/run/2/out",
               "/tekton/run/3/out",
               "-termination_path",
               "/tekton/termination",
               "-step_metadata_dir",
               "/tekton/run/3/status",
               "-entrypoint",
               "/tekton/scripts/script-3-g5chp",
               "--",
          ],
          "privileged": [
                 “CAP_SYS_ADMIN”,
                 ],
          },
          {
          "eventType": "kprobe",
          "functionName": “tcp_connect”,
          “processBinary” : “”
          "arguments": [
               "tcp",
               "10.244.0.6:34965",
               "104.198.14.52:8",
          ],
          "privileged": [],
    },

we do want to add hash of files and such as that might be useful. Looking at your branch we are very much on the same page and would love to work together on it. I think there is information that you include that would be valuable to add into the predicate.

Copy link
Member

@colek42 colek42 Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will probably adopt whatever schema is defined here. The only issue we have is that the data needs to be normalized. For our use case, we would like the user to be able to define the same policy no matter what trace tool was used, i.e. tetragon, strace, etc

It looks like the schema here is from Tetragon. I think this needs to be mutated into a common format, and we should add another field(s) to define the trace provider.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree. We do want to make this generic that other tools can be used. Adding in fields that are missing. As for trace provider we do have:

  "predicate": {
    "monitor": {
      "id": "<URI>"
    },

which would define which tool is being used.

"invocation": {
"configSource": {},
"tracePolicy": {
"policies": [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trace policies would also need to define how they are configured. Below is a snippet from an output from tetragon. This still needs to be re-worked and formatted to be much more generalized. Not a representation of the final format, but wanted to put an example here for discussion.

"policies": [
          {
            "Name": "connect",
            "Config": "Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_connect, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-0-tcp_connect_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:0};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_close, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-1-tcp_close_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:1};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_sendmsg, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-2-tcp_sendmsg_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:2}"
          },
          {
            "Name": "sys-read-follow-prefix",
            "Config": "Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: fd_install, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-3-fd_install_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:3};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_close, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-4-__x64_sys_close_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:4};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_read, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-5-__x64_sys_read_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:5};Name: /var/lib/tetragon/bpf_generic_retkprobe_v53.o, Attach: __x64_sys_read, Label: kprobe/generic_retkprobe, PinPath: gkp-sensor-4-gkp-5-__x64_sys_read_ret_prog, RetProbe: true, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:5};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_write, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-6-__x64_sys_write_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:6}"
          }
        ]
      }

@colek42
Copy link
Member

colek42 commented Nov 19, 2022

I uploaded a sample of our schema here: https://gist.github.com/colek42/a9c82d3314f2cb109bfbb0b67c5e0f99

This is the data that was generated during the SecurityCon demo.

@adityasaky adityasaky marked this pull request as ready for review January 27, 2023 20:00
@adityasaky adityasaky changed the title Add draft of runtime trace predicate Add runtime trace predicate type Jan 30, 2023
@adityasaky adityasaky requested a review from a team January 30, 2023 11:48
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 0c5e744 to fea4a2c Compare February 1, 2023 00:12
@itaysk
Copy link

itaysk commented Feb 1, 2023

this is very interesting, we're working in this space with tracee-action and have some comments, but will have to put my thoughts together after CNSC this week. BTW - is of the contributors of this PR in CNSC this week? happy to chat

@adityasaky
Copy link
Member Author

@itaysk I'm attending the conference! I'll message you on Slack, let's find some time to chat.

Copy link

@itaysk itaysk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff! added some comments.
BTW when I say "us","we","our projects" I mean https://github.com/aquasecurity/tracee and https://github.com/aquasecurity/tracee-action

spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved

`build` _object_, _required_

Feedback required: Should this predicate be scoped to runtime traces of builds only or generalize to runtime traces of any activity?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to either rename the attestation to mention build, or try to generalize the all the build-specific fields (most obvious example is the field called build). currently it doesn't make sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this to use process instead of build. WDYT?

cc @pxp928 @nadgowdas

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated again to monitoredProcess to avoid collision with process in monitorLog.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that looks good. Thanks @adityasaky


`monitorLog.network` _list_, _optional_

`monitorLog.fileAccess` _list_, _optional_
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, that for us including reads was way too verbose. instead we capture only writes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to capture reads to generate a BOM.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also do other cool things with reads, question is this needs to be included in the runtime trace attestation as well (in other words if it's related to provenance, I suppose)


`monitorLog.fileAccess` _list_, _optional_

Record of files accessed during the build process. A complete list of _materials_ can be derived from this information. Each entry in this list is expected to record the path of the file and one or more digests of the file. This field is a list rather than a key value map because a single file may be used multiple times during the build process. Further, some files that are accessed may _change_ during the build process, and so, different entries may have different digests.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every other monitorLog (process, network) was abstract, this is the only one that specifies the contents of the trace. I think at this stage we should keep all of them abstract.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to create policy over the attestation we need to have a defined schema for the monitor log.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see. policy use case was not mentioned in this spec so I presumed the use case was to enrich provenance data.
I have a feeling that discussing the detailed logs schema will be much longer then the outline schema. If that's the intention, let's see the schemas and then we can discuss.


While this predicate can be used to log file accesses, the actual technique used to capture the file access event has some implications. If a synchronous monitor, for example one that uses `ptrace` to trace the file access system calls, is used, then the build process can be paused while the file's digest is calculated and stored. However, asynchronous monitors such as those using eBPF cannot pause the build process before the file is actually used. Therefore, they cannot make as strong guarantees about the digests of the files accessed. Verifiers using runtime trace attestations for file accesses must careful about what guarantees they are actually getting based on how the build process was monitored.

Feedback required: There were discussions about including a field for "materials" derived from file accesses. Should it be within monitorLog?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see how materials is related to "runtime trace", so I think no, but I might have missed something


Effectively a pointer to the monitor's configuration.

`invocation.tracePolicy` _object_, _optional_
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO tracePolicy should move under monitor. Different tracers have different features that might not be available across the board. also the syntax will definitely be different between tracers and it will make most sense to keep the trace policy in it's original form. so if the tracePolicy is monitor-specific it's clearer to move it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thats a good suggestion!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move all of invocation's fields into the monitor object. WDYT @pxp928?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. Thanks!


`monitorLog` _object_, _required_

Feedback required: Should different types of monitors (process vs network for example) emit different attestation subtypes? If they are all combined in one, should there still be some differentiation within `monitorLog`?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please clarify this question?


Effectively a pointer to the monitor's configuration.

`invocation.tracePolicy` _object_, _optional_
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we define what is the scope of the trace or it is up to the implementation to decide? For example, in GH Actions, each run gets a runner vm which executes the workflow. a tracer will likely be installed at the runner (vm) level, so it could trace the entire runner or be scoped to events just from the workflow.
I'm putting this note here because it could also be just part of the trace policy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's up to the implementation to decide. This spec should be about how the final trace is communicated and verified. The verification policy could check, I think, the scope and if the trace is usable for its context.


Feedback required: Should different types of monitors (process vs network for example) emit different attestation subtypes? If they are all combined in one, should there still be some differentiation within `monitorLog`?

`monitorLog.process` _list_, _optional_
Copy link

@itaysk itaysk Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process args/envs might include secrets which the user would not want to logged. since we're not specifying the schema of the process log, I guess it's an issue for the implementor and not for this attestation. I thought I'd just make the note here anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the spec require the implementer to specify black-listed env vars? We could also hash the variables.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depending on whether this statement of mine is still true of not "since we're not specifying the schema of the process log" after #111 (comment)


Feedback required: Should tracingPolicy point outwards to a separate attestation with the same subject that wraps around the actual policy? The policy can be more securely defined and distributed (signed for by developer keys rather than online keys for example).

`monitorLog` _object_, _required_
Copy link

@itaysk itaysk Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably very subjective but trace might be more descriptive in this case than monitor. the attestation is also called runtime trace. and it's definition is a trace policy. so calling this monitor here is standing out for me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair! I think we've got a couple of different field rename suggestions to consider and implement here.

Copy link
Contributor

@marcelamelara marcelamelara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think exposing runtime trace events for builders will be really useful. That said, I have two concerns with this predicate as it's proposed right now.

(1) It's decently closely tied to the SLSA Provenance format; Will this predicate have to be updated in lockstep with every SLSA Provenance update? Maybe this won't be too bad if this predicate only needs updates with every major SLSA update or something like that, but it's still a consideration. So my recommendation would be to ensure this predicate is sufficiently decoupled from SLSA to minimize this.

(2) Given the example, I expect full runtime traces to be quite large in practice, generating a lot of data and large attestations. Especially if the intent is to generate a runtime trace attestation at every build. Is there a way this predicate could be more of a summary of a runtime trace that points to the full event log instead of including the full event log in the attestation? I'm not too familiar with different tracing tools, but I imagine that might make the predicate more interoperable between different tracing tools, too. I'm also thinking making heavier use of Reference- and/or SCAI-type fields might be helpful for this.

spec/predicates/runtime-trace.md Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
Copy link
Contributor

@marcelamelara marcelamelara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! Left a few comments.

spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
Comment on lines 132 to 135
Feedback required: Should tracingPolicy point outwards to a separate attestation
with the same subject that wraps around the actual policy? The policy can be
more securely defined and distributed (signed for by developer keys rather than
online keys for example).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It strikes me that the tracePolicy wouldn't actually change as often as the runtime traces are generated (correct me if I'm wrong), so I think it would definitely make sense to include a reference to a separate policy attestation that can be reused for any trace it applies to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was in favour of having it be a separate document but I think either @pxp928 or @nadgowdas noted that we need to unambiguously map the trace policy to the trace so that we can reason about it when writing policies for monitorLog. For a separate document, I guess the monitor would have to understand that a configured trace policy matches the corresponding attestation? Perhaps a TODO past the v0.1 spec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a separate document for the policy is a better solution in retrospect. I agree that it would not change that often but it would have to be published alongside the runtime attestation each time to ensure which policy was set when the trace was captured. If the policy for example is set to not capture any network logs, we cannot attest that during the build time, the pipeline did not reach out to the internet.

Copy link
Member Author

@adityasaky adityasaky Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a number of points for v0.2. I'll try to capture them into one place we can use to iterate.

@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 3895b85 to 17233ac Compare March 10, 2023 16:50
@adityasaky
Copy link
Member Author

Note to reviewers: I'll clean up git history once you reckon this is good to go for v0.1. 😄

@adityasaky adityasaky requested a review from a team March 10, 2023 16:52
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch 2 times, most recently from 102253c to 9dee7e8 Compare March 10, 2023 18:28
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 9dee7e8 to 9f0bafe Compare March 10, 2023 21:33
Copy link
Contributor

@marcelamelara marcelamelara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates! I think this is ready for v0.1.

@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 9f0bafe to 854d8d5 Compare March 13, 2023 20:03
@adityasaky
Copy link
Member Author

Updated after @marcelamelara's approval to update predicate type to https://in-toto.io/attestation/runtime-trace/v0.1 from https://in-toto.io/attestation/runtime-trace/v0.1.0.

spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Outdated Show resolved Hide resolved
spec/predicates/runtime-trace.md Show resolved Hide resolved
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 854d8d5 to a0719ab Compare March 14, 2023 20:56
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch 3 times, most recently from d5a6389 to 78edb2b Compare March 15, 2023 16:23
@TomHennen
Copy link
Contributor

LGTM for v0.1 (just a couple minor comments outstanding).

Thanks!

Signed-off-by: Aditya Sirish <[email protected]>
Co-authored-by: Parth Patel <[email protected]>
Co-authored-by: Shripad Nadgowda <[email protected]>
@adityasaky adityasaky force-pushed the add-runtime-trace-attestation branch from 78edb2b to c8487cb Compare March 15, 2023 20:03
@pxp928
Copy link
Member

pxp928 commented Mar 16, 2023

I had some outstanding comment responses that I never submitted. Thanks, @adityasaky for updating this :)

@TomHennen TomHennen merged commit bee6ee7 into in-toto:main Mar 17, 2023
@adityasaky adityasaky deleted the add-runtime-trace-attestation branch March 17, 2023 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants