-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add runtime trace predicate type #111
Add runtime trace predicate type #111
Conversation
spec/predicates/runtime-trace.md
Outdated
}, | ||
"monitorLog": { | ||
"process": [ | ||
{ /* object */ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we define "object" better here? Normalizing output from various trace tools will be helpful in discoverability.
I have some rough ideas of that could look like in this branch. https://github.com/testifysec/go-witness/blob/tetragon/attestation/commandrun/commandrun.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Cole,
yes we have defined it in the demo example as the following. Would be great to get feedback :
"eventType": "kprobe",
"functionName": “__x64_sys_write/fd_install”,
“processBinary” : “/bin/cat”
“exitCode”: int
"arguments": [
"-wait_file-post_file ",
"/tekton/run/2/out",
"/tekton/run/3/out",
"-termination_path",
"/tekton/termination",
"-step_metadata_dir",
"/tekton/run/3/status",
"-entrypoint",
"/tekton/scripts/script-3-g5chp",
"--",
],
"privileged": [
“CAP_SYS_ADMIN”,
],
},
{
"eventType": "kprobe",
"functionName": “tcp_connect”,
“processBinary” : “”
"arguments": [
"tcp",
"10.244.0.6:34965",
"104.198.14.52:8",
],
"privileged": [],
},
we do want to add hash of files and such as that might be useful. Looking at your branch we are very much on the same page and would love to work together on it. I think there is information that you include that would be valuable to add into the predicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will probably adopt whatever schema is defined here. The only issue we have is that the data needs to be normalized. For our use case, we would like the user to be able to define the same policy no matter what trace tool was used, i.e. tetragon, strace, etc
It looks like the schema here is from Tetragon. I think this needs to be mutated into a common format, and we should add another field(s) to define the trace provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree. We do want to make this generic that other tools can be used. Adding in fields that are missing. As for trace provider we do have:
"predicate": {
"monitor": {
"id": "<URI>"
},
which would define which tool is being used.
"invocation": { | ||
"configSource": {}, | ||
"tracePolicy": { | ||
"policies": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trace policies would also need to define how they are configured. Below is a snippet from an output from tetragon. This still needs to be re-worked and formatted to be much more generalized. Not a representation of the final format, but wanted to put an example here for discussion.
"policies": [
{
"Name": "connect",
"Config": "Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_connect, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-0-tcp_connect_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:0};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_close, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-1-tcp_close_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:1};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: tcp_sendmsg, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-1-gkp-2-tcp_sendmsg_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:2}"
},
{
"Name": "sys-read-follow-prefix",
"Config": "Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: fd_install, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-3-fd_install_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:3};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_close, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-4-__x64_sys_close_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:4};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_read, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-5-__x64_sys_read_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:5};Name: /var/lib/tetragon/bpf_generic_retkprobe_v53.o, Attach: __x64_sys_read, Label: kprobe/generic_retkprobe, PinPath: gkp-sensor-4-gkp-5-__x64_sys_read_ret_prog, RetProbe: true, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:5};Name: /var/lib/tetragon/bpf_generic_kprobe_v53.o, Attach: __x64_sys_write, Label: kprobe/generic_kprobe, PinPath: gkp-sensor-4-gkp-6-__x64_sys_write_prog, RetProbe: false, ErrorFatal: true, Override: false, Type: generic_kprobe, LoaderDate: {ID:6}"
}
]
}
I uploaded a sample of our schema here: https://gist.github.com/colek42/a9c82d3314f2cb109bfbb0b67c5e0f99 This is the data that was generated during the SecurityCon demo. |
0c5e744
to
fea4a2c
Compare
this is very interesting, we're working in this space with tracee-action and have some comments, but will have to put my thoughts together after CNSC this week. BTW - is of the contributors of this PR in CNSC this week? happy to chat |
@itaysk I'm attending the conference! I'll message you on Slack, let's find some time to chat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff! added some comments.
BTW when I say "us","we","our projects" I mean https://github.com/aquasecurity/tracee and https://github.com/aquasecurity/tracee-action
spec/predicates/runtime-trace.md
Outdated
|
||
`build` _object_, _required_ | ||
|
||
Feedback required: Should this predicate be scoped to runtime traces of builds only or generalize to runtime traces of any activity? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to either rename the attestation to mention build, or try to generalize the all the build-specific fields (most obvious example is the field called build
). currently it doesn't make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this to use process
instead of build
. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated again to monitoredProcess
to avoid collision with process
in monitorLog
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that looks good. Thanks @adityasaky
spec/predicates/runtime-trace.md
Outdated
|
||
`monitorLog.network` _list_, _optional_ | ||
|
||
`monitorLog.fileAccess` _list_, _optional_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, that for us including reads was way too verbose. instead we capture only writes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to capture reads to generate a BOM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also do other cool things with reads, question is this needs to be included in the runtime trace attestation as well (in other words if it's related to provenance, I suppose)
spec/predicates/runtime-trace.md
Outdated
|
||
`monitorLog.fileAccess` _list_, _optional_ | ||
|
||
Record of files accessed during the build process. A complete list of _materials_ can be derived from this information. Each entry in this list is expected to record the path of the file and one or more digests of the file. This field is a list rather than a key value map because a single file may be used multiple times during the build process. Further, some files that are accessed may _change_ during the build process, and so, different entries may have different digests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every other monitorLog (process, network) was abstract, this is the only one that specifies the contents of the trace. I think at this stage we should keep all of them abstract.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to create policy over the attestation we need to have a defined schema for the monitor log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see. policy use case was not mentioned in this spec so I presumed the use case was to enrich provenance data.
I have a feeling that discussing the detailed logs schema will be much longer then the outline schema. If that's the intention, let's see the schemas and then we can discuss.
spec/predicates/runtime-trace.md
Outdated
|
||
While this predicate can be used to log file accesses, the actual technique used to capture the file access event has some implications. If a synchronous monitor, for example one that uses `ptrace` to trace the file access system calls, is used, then the build process can be paused while the file's digest is calculated and stored. However, asynchronous monitors such as those using eBPF cannot pause the build process before the file is actually used. Therefore, they cannot make as strong guarantees about the digests of the files accessed. Verifiers using runtime trace attestations for file accesses must careful about what guarantees they are actually getting based on how the build process was monitored. | ||
|
||
Feedback required: There were discussions about including a field for "materials" derived from file accesses. Should it be within monitorLog? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see how materials is related to "runtime trace", so I think no, but I might have missed something
spec/predicates/runtime-trace.md
Outdated
|
||
Effectively a pointer to the monitor's configuration. | ||
|
||
`invocation.tracePolicy` _object_, _optional_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO tracePolicy
should move under monitor
. Different tracers have different features that might not be available across the board. also the syntax will definitely be different between tracers and it will make most sense to keep the trace policy in it's original form. so if the tracePolicy is monitor-specific it's clearer to move it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes thats a good suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can move all of invocation's fields into the monitor object. WDYT @pxp928?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good. Thanks!
spec/predicates/runtime-trace.md
Outdated
|
||
`monitorLog` _object_, _required_ | ||
|
||
Feedback required: Should different types of monitors (process vs network for example) emit different attestation subtypes? If they are all combined in one, should there still be some differentiation within `monitorLog`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please clarify this question?
spec/predicates/runtime-trace.md
Outdated
|
||
Effectively a pointer to the monitor's configuration. | ||
|
||
`invocation.tracePolicy` _object_, _optional_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we define what is the scope of the trace or it is up to the implementation to decide? For example, in GH Actions, each run gets a runner vm which executes the workflow. a tracer will likely be installed at the runner (vm) level, so it could trace the entire runner or be scoped to events just from the workflow.
I'm putting this note here because it could also be just part of the trace policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's up to the implementation to decide. This spec should be about how the final trace is communicated and verified. The verification policy could check, I think, the scope and if the trace is usable for its context.
spec/predicates/runtime-trace.md
Outdated
|
||
Feedback required: Should different types of monitors (process vs network for example) emit different attestation subtypes? If they are all combined in one, should there still be some differentiation within `monitorLog`? | ||
|
||
`monitorLog.process` _list_, _optional_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
process args/envs might include secrets which the user would not want to logged. since we're not specifying the schema of the process log, I guess it's an issue for the implementor and not for this attestation. I thought I'd just make the note here anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the spec require the implementer to specify black-listed env vars? We could also hash the variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depending on whether this statement of mine is still true of not "since we're not specifying the schema of the process log" after #111 (comment)
|
||
Feedback required: Should tracingPolicy point outwards to a separate attestation with the same subject that wraps around the actual policy? The policy can be more securely defined and distributed (signed for by developer keys rather than online keys for example). | ||
|
||
`monitorLog` _object_, _required_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably very subjective but trace might be more descriptive in this case than monitor. the attestation is also called runtime trace. and it's definition is a trace policy. so calling this monitor here is standing out for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair! I think we've got a couple of different field rename suggestions to consider and implement here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think exposing runtime trace events for builders will be really useful. That said, I have two concerns with this predicate as it's proposed right now.
(1) It's decently closely tied to the SLSA Provenance format; Will this predicate have to be updated in lockstep with every SLSA Provenance update? Maybe this won't be too bad if this predicate only needs updates with every major SLSA update or something like that, but it's still a consideration. So my recommendation would be to ensure this predicate is sufficiently decoupled from SLSA to minimize this.
(2) Given the example, I expect full runtime traces to be quite large in practice, generating a lot of data and large attestations. Especially if the intent is to generate a runtime trace attestation at every build. Is there a way this predicate could be more of a summary of a runtime trace that points to the full event log instead of including the full event log in the attestation? I'm not too familiar with different tracing tools, but I imagine that might make the predicate more interoperable between different tracing tools, too. I'm also thinking making heavier use of Reference- and/or SCAI-type fields might be helpful for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates! Left a few comments.
spec/predicates/runtime-trace.md
Outdated
Feedback required: Should tracingPolicy point outwards to a separate attestation | ||
with the same subject that wraps around the actual policy? The policy can be | ||
more securely defined and distributed (signed for by developer keys rather than | ||
online keys for example). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It strikes me that the tracePolicy
wouldn't actually change as often as the runtime traces are generated (correct me if I'm wrong), so I think it would definitely make sense to include a reference to a separate policy attestation that can be reused for any trace it applies to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was in favour of having it be a separate document but I think either @pxp928 or @nadgowdas noted that we need to unambiguously map the trace policy to the trace so that we can reason about it when writing policies for monitorLog
. For a separate document, I guess the monitor would have to understand that a configured trace policy matches the corresponding attestation? Perhaps a TODO past the v0.1 spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a separate document for the policy is a better solution in retrospect. I agree that it would not change that often but it would have to be published alongside the runtime attestation each time to ensure which policy was set when the trace was captured. If the policy for example is set to not capture any network logs, we cannot attest that during the build time, the pipeline did not reach out to the internet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a number of points for v0.2. I'll try to capture them into one place we can use to iterate.
3895b85
to
17233ac
Compare
Note to reviewers: I'll clean up git history once you reckon this is good to go for v0.1. 😄 |
102253c
to
9dee7e8
Compare
9dee7e8
to
9f0bafe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the updates! I think this is ready for v0.1.
9f0bafe
to
854d8d5
Compare
Updated after @marcelamelara's approval to update predicate type to |
854d8d5
to
a0719ab
Compare
d5a6389
to
78edb2b
Compare
LGTM for v0.1 (just a couple minor comments outstanding). Thanks! |
Signed-off-by: Aditya Sirish <[email protected]> Co-authored-by: Parth Patel <[email protected]> Co-authored-by: Shripad Nadgowda <[email protected]>
78edb2b
to
c8487cb
Compare
I had some outstanding comment responses that I never submitted. Thanks, @adityasaky for updating this :) |
This is an early draft of a runtime trace predicate type. We've noted some subsections / fields that need more input.
cc @pxp928 @nadgowdas @jedsalazar