Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(event): export thread, process and parent entity id #3503

Merged

Conversation

AlonZivony
Copy link
Contributor

@AlonZivony AlonZivony commented Sep 26, 2023

1. Explain what the PR does

4febd87 feat(api): add thread entity id to protobuf
eb156ab feat(event): export thread, process and parent entity ids

2. Explain how to test it

Manually checked that hashes matches the values from the process tree.

3. Other comments

types/trace/trace.go Outdated Show resolved Hide resolved
@rafaeldtinoco
Copy link
Contributor

rafaeldtinoco commented Sep 26, 2023

Lets have the 3 IDs available ? This way we can always find ourselves in the proctree, and we can also find parent and leader for artifacts used and anything related.

If you add it in this PR we can merge it before any other, and then I'll rebase #3498 and you will likely fix yours (datasource) as well.

@AlonZivony AlonZivony force-pushed the feature/tasks-hashes-in-event branch 2 times, most recently from 4febd87 to 88f3698 Compare September 26, 2023 14:56
@AlonZivony AlonZivony changed the title feat(event): export thread and process entity id feat(event): export thread, process and parent entity id Sep 26, 2023
types/trace/trace.go Outdated Show resolved Hide resolved
Copy link
Contributor

@rafaeldtinoco rafaeldtinoco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlonZivony I believe there are some minor issues/things to be addressed (a padding, a .proto file change + make proto run). Everything else seems ok to me. Feel free to place the type change PR so we can merge that before this one.

@AlonZivony
Copy link
Contributor Author

Will push the changes after #3505 will be merged

@rafaeldtinoco
Copy link
Contributor

I have merged the type change, will upgrade the type version and adjust the grpc interface in another PR then you will be fine to rebase this one with the nit fixes.

@rafaeldtinoco
Copy link
Contributor

rafaeldtinoco commented Sep 26, 2023

@AlonZivony

go get -u github.com/aquasecurity/tracee/types@9c94c39ad
git commit -a -m 'chore(types): upgrade to latest types including entity ids'

as the first commit of this PR, then the rest of the PR will take care of the type changes, likely. Don't forget the grpc code:

$ git diff -p
diff --git a/pkg/server/grpc/tracee.go b/pkg/server/grpc/tracee.go
index 6f75cc929..958e92247 100644
--- a/pkg/server/grpc/tracee.go
+++ b/pkg/server/grpc/tracee.go
@@ -170,13 +170,14 @@ func getProcess(e trace.Event) *pb.Process {
 
        return &pb.Process{
                // Executable
-               EntityId:      wrapperspb.UInt32(e.EntityID),
+               EntityId:      wrapperspb.UInt32(e.ProcessEntityId),
                Pid:           wrapperspb.UInt32(uint32(e.HostProcessID)),
                NamespacedPid: wrapperspb.UInt32(uint32(e.ProcessID)),
                RealUser: &pb.User{
                        Id: wrapperspb.UInt32(uint32(e.UserID)),
                },
                Thread: &pb.Thread{
+                       EntityId:       wrapperspb.UInt32(e.ThreadEntityId),
                        Start:          threadStartTime,
                        Name:           e.ProcessName,
                        Tid:            wrapperspb.UInt32(uint32(e.HostThreadID)),
@@ -186,6 +187,7 @@ func getProcess(e trace.Event) *pb.Process {
                        UserStackTrace: userStackTrace,
                },
                Parent: &pb.Parent{
+                       EntityId:      wrapperspb.UInt32(e.ParentEntityId),
                        Pid:           wrapperspb.UInt32(uint32(e.HostParentProcessID)),
                        NamespacedPid: wrapperspb.UInt32(uint32(e.ParentProcessID)),
                },

Copy link
Collaborator

@yanivagman yanivagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it too much to send all the hashes on every event?

@yanivagman
Copy link
Collaborator

Isn't it too much to send all the hashes on every event?

I mean, from bpf to userspace.
All of these can be extracted from the process tree without sending them from bpf code

@AlonZivony
Copy link
Contributor Author

AlonZivony commented Sep 26, 2023

Isn't it too much to send all the hashes on every event?

I mean, from bpf to userspace. All of these can be extracted from the process tree without sending them from bpf code

You are right that you can get them from user-mode, but it is a pretty expensive procedure.
We might want to calculate them only before sending to user space, instead of for each hook triggered, but it will complicate things.
Changing where they are calculated can be done in the future without breaking anything as we don't use the hash in the kernel currently, so I don't think its too much to merge now and maybe change in the future.

@AlonZivony AlonZivony force-pushed the feature/tasks-hashes-in-event branch 2 times, most recently from d973f30 to 5e66116 Compare September 26, 2023 21:33
@rafaeldtinoco
Copy link
Contributor

rafaeldtinoco commented Sep 26, 2023

Isn't it too much to send all the hashes on every event?

I mean, from bpf to userspace. All of these can be extracted from the process tree without sending them from bpf code

You are right that you can get them from user-mode, but it is a pretty expensive procedure. We might want to calculate them only before sending to user space, instead of for each hook triggered, but it will complicate things. Changing where they are calculated can be done in the future without breaking anything as we don't use the hash in the kernel currently, so I don't think its too much to merge now and maybe change in the future.

Actually, if we missed the SignalSchedProcEvent, for example, then we would have no information about parent, and leader, and we wouldn't be able to find the relation (or to create their nodes). Reading procfs wouldn't be always possible, for mitigating this, due to fast spawning processes.

Imagining the situation we do have the nodes in the process tree (for parent and leader) but not the leader and parent hashes... by the time the event is decoded, we would have to check something like "recent pids" cache (since we don't have leader and parent start time information), so it would involve keeping non unique data (recent PIDs and/or start_time and/or their hashes) for something that is meant to be unique (like the hash IDs, which is unique no matter what, as long as there are no hash collisions).

With that said, considering we're talking about adding a u64 to each event (2 u32 hashes), I think we should stick with the hashes for parent, leader and task and consider this as "unique task identifiers" (considering PIDs and TIDs aren't necessarily unique during tracee lifetime). The benefits look way bigger than the disadvantages.

Copy link
Contributor

@rafaeldtinoco rafaeldtinoco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. There is a tiny test change need (and the kernel tests run) before this can be merged, IMO.

@yanivagman
Copy link
Collaborator

With that said, considering we're talking about adding a u64 to each event (2 u32 hashes), I think we should stick with the hashes for parent, leader and task and consider this as "unique task identifiers" (considering PIDs and TIDs aren't necessarily unique during tracee lifetime). The benefits look way bigger than the disadvantages.

Actually there are now 144 bytes and before there were 128, adding 16 bytes in total. Another point to be made is that now we need to calculate the hash in bpf code for every event, which might degrade performance.

Copy link
Collaborator

@yanivagman yanivagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should check the performance implications of this change before merging it:

  1. Event header size is now 12.5% bigger
  2. For every event (not only those sent to userspace) we calculate 3 hashes in bpf code

This needs to be checked against a high poad of events. For example, using magic_write event and sched_switch

pkg/bufferdecoder/decoder.go Show resolved Hide resolved
hash_task_id(get_task_host_tgid(leader), get_task_start_time(leader));
struct task_struct *parent = get_leader_task(get_parent_task(leader));
context->task.parent_hash_id =
hash_task_id(get_task_host_tgid(parent), get_task_start_time(parent));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hashing happens on init_context, which means not only we do the calculation for every event sent to userspace, but for every event in general, which may have performance implications

Change the current event entity id to be the proper process event ID
(was the task entity ID before), and add the thread and parent entity
IDs.
Now the values are given from the kernel.
Move the entity ID calculation to the user space.
This way the calculation per event is reduced, and we have no
exceptions between different events.
@rafaeldtinoco
Copy link
Contributor

Actually there are now 144 bytes and before there were 128, adding 16 bytes in total. Another point to be made is that now we need to calculate the hash in bpf code for every event, which might degrade performance.

Fair enough. I see Alon changed from hash to start_time for both, leader and parent. That was another idea we discussed recently, LGTM as well.

Copy link
Collaborator

@yanivagman yanivagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
@rafaeldtinoco I'm leaving the merge to you after your review

@rafaeldtinoco rafaeldtinoco merged commit f7887e8 into aquasecurity:main Sep 27, 2023
25 checks passed
@AlonZivony AlonZivony deleted the feature/tasks-hashes-in-event branch September 27, 2023 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants