Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] libs do not update proc.name for renamed threads #1011

Open
therealbobo opened this issue Mar 28, 2023 · 30 comments
Open

[BUG] libs do not update proc.name for renamed threads #1011

therealbobo opened this issue Mar 28, 2023 · 30 comments
Labels
kind/bug Something isn't working
Milestone

Comments

@therealbobo
Copy link
Contributor

therealbobo commented Mar 28, 2023

Describe the bug

The Falco libs use the /proc filesystem to retrieve information about processes when they start up. However, if a thread, during its lifecycle, changes its name using a prctl system call, the proc.name value is not updated. This means that Falco may not be able to accurately identify the process and could lead to errors or security issues.

In the following situation Falco doesn't take into account the renaming because it's only aware of the information of the process at its creation.

  Falco starts
        |
        |
        |----malicious process----------------   -\                             
        |            |                            |                             
        |            |                            |                         
        |            |                            |                             
        |            |   thread renaming          |                             
        |            |          |                 |                             
        |            |          |                 |- malicious process lifecycle
        |            |          |                 |                             
        |            |          |                 |                             
        |            |          |                 |                             
        |            |          |                 |                             
        |            |          |                 |                             
        |            v          v                 |                             
        |-------------------------------------   -/                             
        |                                        
        |
        v

On the contrary, in the following situation, given that the renaming has already taken place before Falco has started, Falco will display correctly the proc.name.

          ----malicious process----------------   -\   
                      |                            | 
                      |                            | 
                      |                            | 
                      |   thread renaming          | 
                      |          |                 | 
   Falco starts       |          |                 | 
         |            |          |                 |- malicious process lifecycle
         |            |          |                 | 
         |            |          |                 | 
         |            |          |                 | 
         |            |          |                 | 
         |            |          |                 | 
         |            v          v                 | 
         |-------------------------------------   -/  
         |                                          
         |                                                                         
         |                                          
         v                                          

How to reproduce it

If we consider the first situation shown above:

  1. Compile the following simple c program (e.g.: gcc worker.c -o worker):
#include <stdio.h>
#include <pthread.h>

void *worker(void *arg) {
    if (pthread_equal(pthread_self(), *((pthread_t*)arg))) {
        FILE *fp;
        char buffer[1024];
        int lines = 0;
	pthread_setname_np(pthread_self(), "reader");

        sleep(5);
        fp = fopen("/etc/passwd", "r");
        if (fp == NULL) {
            perror("Error opening /etc/passwd");
            pthread_exit(NULL);
        }

        while (fgets(buffer, sizeof(buffer), fp) != NULL)
            lines++;

        printf("Thread-0: /etc/passwd contains %d lines\n", lines);

        fclose(fp);
    } else {
	pthread_setname_np(pthread_self(), "otherThread");
        printf("%ld started\n", pthread_self());
        sleep(10);
        printf("%ld finished\n", pthread_self());
    }

    pthread_exit(NULL);
}

int main(void) {
    pthread_t threads[10];
    int rc;
    pthread_setname_np(pthread_self(), "MainThread");

    for (int i = 0; i < 10; i++) {
        rc = pthread_create(&threads[i], NULL, worker, (void *)&threads[0]);
        if (rc) {
            fprintf(stderr, "Error creating thread %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < 10; I++)
        pthread_join(threads[i], NULL);

    return 0;
}
  1. Start Falco with the following rule that will match the thread called reader:
- rule: Read from /etc/passwd
  desc: A process attempted to read from the /etc/passwd file.
  condition: evt.type=read and fd.name="/etc/passwd"
  output: "Process attempted to read from /etc/passwd (user=%user.name command=%proc.cmdline procname=%proc.name)"
  priority: ERROR
  1. Run the worker program compiled in first step
  2. Falco will output "MainThread" as proc.name

If we consider the second situation shown above:

  1. Compile the following simple c program (e.g.: gcc worker.c -o worker):
#include <stdio.h>
#include <pthread.h>

void *worker(void *arg) {
    if (pthread_equal(pthread_self(), *((pthread_t*)arg))) {
        FILE *fp;
        char buffer[1024];
        int lines = 0;
	pthread_setname_np(pthread_self(), "reader");

        sleep(5);
        fp = fopen("/etc/passwd", "r");
        if (fp == NULL) {
            perror("Error opening /etc/passwd");
            pthread_exit(NULL);
        }

        while (fgets(buffer, sizeof(buffer), fp) != NULL)
            lines++;

        printf("Thread-0: /etc/passwd contains %d lines\n", lines);

        fclose(fp);
    } else {
	pthread_setname_np(pthread_self(), "otherThread");
        printf("%ld started\n", pthread_self());
        sleep(10);
        printf("%ld finished\n", pthread_self());
    }

    pthread_exit(NULL);
}

int main(void) {
    pthread_t threads[10];
    int rc;
    pthread_setname_np(pthread_self(), "MainThread");

    for (int i = 0; i < 10; i++) {
        rc = pthread_create(&threads[i], NULL, worker, (void *)&threads[0]);
        if (rc) {
            fprintf(stderr, "Error creating thread %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < 10; I++)
        pthread_join(threads[i], NULL);

    return 0;
}
  1. Run the worker program compiled in the previous step
  2. Start Falco after the renaming with the following rule that will match the thread called reader:
- rule: Read from /etc/passwd
  desc: A process attempted to read from the /etc/passwd file.
  condition: evt.type=read and fd.name="/etc/passwd"
  output: "Process attempted to read from /etc/passwd (user=%user.name command=%proc.cmdline procname=%proc.name)"
  priority: ERROR
  1. Falco will output "reader" as proc.name

At the end the concept is simple: if Falco starts before the renaming, it will only use the information available at the start of the process; otherwise if Falco starts after the renaming, it will have the complete information and use the effective thread name.

Expected behaviour

Given that proc.name is the name (excluding the path) of the executable generating the event.
When a thread changes its name within a process, Falco should update the proc.name value to reflect the new thread name. This should happen regardless of whether Falco was started before or after the thread was renamed. This will ensure that Falco is correctly identifying the process and can take appropriate actions as needed.

Screenshots
Here's how Falco behaves if it starts before the matching process (first situation).

$ sudo falco -A
Tue Mar 28 10:30:10 2023: Falco version: 0.34.1 (x86_64)
Tue Mar 28 10:30:10 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Tue Mar 28 10:30:10 2023: Loading rules from file /home/ubuntu/src/read_rule.yaml
Tue Mar 28 10:30:10 2023: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Tue Mar 28 10:30:10 2023: gRPC server threadiness equals to 8
Tue Mar 28 10:30:10 2023: Starting health webserver with threadiness 8, listening on port 8765
Tue Mar 28 10:30:10 2023: Starting gRPC server at unix:///run/falco/falco.sock
Tue Mar 28 10:30:10 2023: Enabled event sources: syscall
Tue Mar 28 10:30:10 2023: Opening capture with Kernel module
10:30:40.852145252: Error Process attempted to read from /etc/passwd (user=ubuntu command=MainThread procname=MainThread)
10:30:40.852148051: Error Process attempted to read from /etc/passwd (user=ubuntu command=MainThread procname=MainThread)
10:30:40.852152596: Error Process attempted to read from /etc/passwd (user=ubuntu command=MainThread procname=MainThread)
10:30:40.852153490: Error Process attempted to read from /etc/passwd (user=ubuntu command=MainThread procname=MainThread)

Here's how Falco behaves if it starts after (second situation) the matching process and before the actual read (so when the process has already changed its name).

$ sudo falco -A
Tue Mar 28 10:34:48 2023: Falco version: 0.34.1 (x86_64)
Tue Mar 28 10:34:48 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Tue Mar 28 10:34:48 2023: Loading rules from file /home/ubuntu/src/read_rule.yaml
Tue Mar 28 10:34:48 2023: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Tue Mar 28 10:34:48 2023: gRPC server threadiness equals to 8
Tue Mar 28 10:34:48 2023: Starting health webserver with threadiness 8, listening on port 8765
Tue Mar 28 10:34:48 2023: Enabled event sources: syscall
Tue Mar 28 10:34:48 2023: Starting gRPC server at unix:///run/falco/falco.sock
Tue Mar 28 10:34:48 2023: Opening capture with Kernel module
10:34:51.958366528: Error Process attempted to read from /etc/passwd (user=ubuntu command=reader procname=reader)
10:34:51.958370300: Error Process attempted to read from /etc/passwd (user=ubuntu command=reader procname=reader)
10:34:51.958373326: Error Process attempted to read from /etc/passwd (user=ubuntu command=reader procname=reader)
10:34:51.958373935: Error Process attempted to read from /etc/passwd (user=ubuntu command=reader procname=reader)

Here's the output of the ps command after the renaming:

$ ps -T -o pid,tid,comm p 2155073
    PID     TID COMMAND
2155073 2155073 MainThread
2155073 2155074 reader
2155073 2155075 otherThread
2155073 2155076 otherThread
2155073 2155077 otherThread
2155073 2155078 otherThread
2155073 2155079 otherThread
2155073 2155080 otherThread
2155073 2155081 otherThread
2155073 2155082 otherThread
2155073 2155083 otherThread

Environment

  • Falco version:
Falco version: 0.34.1
Libs version:  0.10.4
Plugin API:    2.0.0
Engine:        16
Driver:
  API version:    3.0.0
  Schema version: 2.0.0
  Default driver: 4.0.0+driver
  • System info:
{
  "machine": "x86_64",
  "nodename": "ip-172-31-37-210",
  "release": "5.15.0-1031-aws",
  "sysname": "Linux",
  "version": "#35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023"
}
  • Cloud provider or hardware configuration: ec2 instance
  • OS:
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
  • Kernel:
Linux ip-172-31-37-210 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method:
    DEB

Additional context

I think that the problem is linked to the fact that a thread is added to the thread table via parse_clone_exit

bool thread_added = m_inspector->add_thread(tinfo);

In this way, when a prctl occurs, the thread comm is never updated. A possible solution could be to hook the prctl and update the thread table accordingly.

@therealbobo therealbobo added the kind/bug Something isn't working label Mar 28, 2023
@leogr
Copy link
Member

leogr commented Apr 4, 2023

Given that proc.name is the name (excluding the path) of the executable generating the event.
When a thread changes its name within a process, Falco should update the proc.name value to reflect the new thread name. This should happen regardless of whether Falco was started before or after the thread was renamed. This will ensure that Falco is correctly identifying the process and can take appropriate actions as needed.

@therealbobo
I'm not convinced about that assumption. I believe the expectation for proc.name is always to report the initial process name (i.e., the executable name without the path)

@falcosecurity/libs-maintainers WDYT? 🤔
cc @loresuso @darryk10

@therealbobo
Copy link
Contributor Author

In my opinion the problem is that proc.name is associated with /proc/<pid>/comm which, in the process lifecycle, could change. This fact could mislead in the use of proc.name. Maybe this issue can be solved just updating the docs making this crystal clear or adding, for example, proc.startname. 😄

@leogr
Copy link
Member

leogr commented Apr 4, 2023

My point is that if it can change during the execution, rules authors can't rely on (otherwise, it can lead to bypasses).
I'd instead find a way to stick with the documented behavior and find a way to fix proc. name if possible. But I'd like to have more opinions on that.

Moreover, we also have another field, proc.exe, that can be a custom name (so perhaps it should reflect the last value of comm ?). Not sure about that, just guessing.

@loresuso
Copy link
Member

loresuso commented Apr 4, 2023

I strongly believe that updating the process name is not what we need here. This is because many rules rely on proc.name, and for this reason, just using prctl could lead to bypass all of them.
I am trying to think about better solution!

@darryk10
Copy link

darryk10 commented Apr 4, 2023

I think the proc.name should remain consistent and shouldn't contains the thread info. It would be helpful having another field containing the thread information without updating the proc.name field which is strongly used in detection and falco rules. This is also causing a lot of noise cause often the field proc.name is used in whitelist.

@gnosek
Copy link
Contributor

gnosek commented Apr 4, 2023

If the only way to change the process comm is via prctl, then IMO we should track the changes.

(the "if" comes from the fact that you can easily overwrite argv[0] from userspace, but IIRC it's entirely separate from comm which lives in the kernel so it should be fine).

AIUI, the confusion comes from all the different things that could be considered the name of a process (and I believe we expose at least most of them one way or another):

  • comm -> proc.name
  • initial argv[0] (or the full argv array) -> proc.args
  • current argv[0] (or full argv) -> we cannot get this from syscalls
  • the executable name -> proc.exe
  • I probably missed some

Since the comments appearing while I'm typing this indicate we don't want proc.name to contain updated comm, how about we introduce proc.comm for this? But in any case, if there isn't a spec for what fields contain what, let's make one and stick to it (it seems important for proc.name in particular since it implies "the" name while there isn't one).

As much as Falco has become the main user of the libs, security isn't the only use case, we can also use the comm (updated by a legit process) to troubleshoot specific processes/threads by name, e.g. in OSS Sysdig

@loresuso
Copy link
Member

loresuso commented Apr 4, 2023

Possible problem/question:

  • comm -> proc.name as @gnosek said
  • updates on comm -> proc.comm reflects the changes. Initially proc.name and proc.comm are the same.

Now consider this:

wdyt?

@therealbobo
Copy link
Contributor Author

I think that proc.comm could be a great idea! 👍

@leogr
Copy link
Member

leogr commented Apr 4, 2023

This should not be ok if we want an immutable proc.name.

Anyway, I believe the root cause depends on the usage of task->comm, that's generally not trustable since there's at least one case where one can set it from the userspace.

Thus, why not use the executable name instead? 🤔

@gnosek
Copy link
Contributor

gnosek commented Apr 4, 2023

It all boils down to the question: what exactly is proc.name? In the kernel, IMO the obvious answer would be comm, but it looks like we're gravitating towards it being the same as proc.exe.

@leogr
Copy link
Member

leogr commented Apr 4, 2023

Since the comments appearing while I'm typing this indicate we don't want proc.name to contain updated comm, how about we introduce proc.comm for this? But in any case, if there isn't a spec for what fields contain what, let's make one and stick to it (it seems important for proc.name in particular since it implies "the" name while there isn't one).

@gnosek

My TL;DR is: I believe rules authors want proc.name to behave like proc.exe (ie. just the executable name, immutable, @darryk10, and @loresuso do you agree?), and I totally agree to introduce proc.comm (that mirrors task->comm) would be nice to reduce confusion among users.

But I believe we have to reach consensus before introducing such changes.

@incertum
Copy link
Contributor

incertum commented Apr 4, 2023

Hey 👋, happy to join the discussion.

proc.name certainly can not and should never be called exe name or executable name (besides it's mostly cropped at 16 characters depending on your kernel settings which can be annoying). Perhaps all this can even be considered a bit of a broader flaw that affects most of the Falco upstream rules 🙃 . Please checkout #938 where we added the ability to traverse exe and exepath parent process lineage, because for instance for java there are many limitations as typically the app has a custom proc.name and therefore you otherwise wouldn't know if a shell was spawned from java. Lastly, we all are painfully aware of issues for symlinked executables ...

Hence we may not need proc.comm and can just keep the contract that proc.name is task->comm and similar to how we update other thread fields during its lifetime this one could be updated as well one day.

Rather we should prioritize getting the true exepath including resolved symlinks (@loresuso) and then introduce proc.exename which simply would be the last portion of the true full path, because proc.exe is also not really this sometimes ... Subsequently upstream rules should be updated, but I understand what a pain this will be, so proc.comm is a valid option, but please be aware that when changing proc.name to not reflect task->comm we are going against common Linux definitions, ps util output etc, so I would not recommend this.

In summary, getting technical clarity and feature completeness would be amazing as it's such an important aspect for threat detection:

  • Prioritize getting true executable path in the kernel
  • Introduce proc.exename
  • Update upstream rules using proc.exename instead

@leogr
Copy link
Member

leogr commented Apr 4, 2023

but please be aware that when changing proc.name to not reflect task->comm we are going against common Linux definitions, ps util output etc, so I would not recommend this.

@incertum

Good point.

I thought those definitions used command name (which in my mind is task->comm) and not process name (which in my mind is a totally arbitrary definition). Generally speaking, is there any precise definition of that in the Linux doc?

After your comment, I've changed my mind, and I'm not sure anymore that introducing proc.comm is a good idea.
At this point, the short-term solution would be

  • to stick with proc.name equals to task->comm (and fix cases when it's not correctly updated, like in the original description of this issue)
  • clearly state in the documentation that proc.name is not always the executable name (we have to set the expectation of rules authors correctly, that's the main point here)

In the long term, we have to fix the symlink problem, but it's another issue.

@incertum
Copy link
Contributor

incertum commented Apr 4, 2023

@leogr fully agree with touching up the docs a bit more and set expectations even better. Re true executable path and symlinks yes we will work on it and hopefully soon have a trustworthy proc.exename ... I'll tag up with @loresuso to make it a reality and @Andreagit97 also already expressed willingness to help.

Lastly, worthwhile mentioning is that proc.name is just more convenient to use in rules since we can do in (list) but for proc.exepath we would need to type out endswith each time. The good news is that we already have work planned for Falco 0.36 to make the UX better, see falcosecurity/falco#2403.

Let's dig through the ps utils source code? I think this could be good ground truth reference for what folks are expecting or used to.

@leogr
Copy link
Member

leogr commented Apr 28, 2023

Over the past few weeks, I have talked to everyone about this problem, and I am now totally convinced that the solution is:

  • to stick with proc.name equals to task->comm (and fix cases when it's not correctly updated, like in the original description of this issue)
  • clearly state in the documentation that proc.name is not always the executable name (we have to set the expectation of rules authors correctly, that's the main point here)

So, IMO the next step would be to review the field documentation.

@therealbobo
Copy link
Contributor Author

If you all agree, I could try to help with te documentation! 😄

@leogr
Copy link
Member

leogr commented Apr 28, 2023

If you all agree, I could try to help with te documentation! smile

Yes please 🙏

@Andreagit97 Andreagit97 added this to the libs-backlog milestone Jun 7, 2023
@incertum
Copy link
Contributor

This is all addressed, not just documentation wise but @therealbobo also added #1015

@Andreagit97
Copy link
Member

Uhm just reading again this issue I've noticed this is not completely fixed, in #1015 we added the kernel code to manage prctl code but we never added the parsing logic in sinsp to update proc.name, and the same for the clone caller parser. We never update the comm of the thread when a new clone happens while indeed the task->comm could be changed between the first clone and the next one...
I would reopen the issue here to keep track of this, wdyt?

@Andreagit97 Andreagit97 reopened this Oct 17, 2023
@incertum
Copy link
Contributor

oh yes we need to fix this, thanks for checking again @Andreagit97!

@poiana
Copy link
Contributor

poiana commented Jan 15, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Copy link
Member

leogr commented Jan 16, 2024

/remove-lifecycle stale

@incertum
Copy link
Contributor

incertum commented Mar 7, 2024

@therealbobo and @Andreagit97 just checking in where we are at for this issue? What still needs to be done? Could we add a new summary of open items?

@Andreagit97
Copy link
Member

Uhm this comment #1011 (comment) should be still valid, AFAIK nothing has been changed 🤔

@poiana
Copy link
Contributor

poiana commented Jun 5, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@Andreagit97
Copy link
Member

/remove-lifecycle stale

@poiana
Copy link
Contributor

poiana commented Sep 4, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@Andreagit97
Copy link
Member

/remove-lifecycle stale

@poiana
Copy link
Contributor

poiana commented Dec 3, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Copy link
Member

leogr commented Dec 11, 2024

/remove-lifecycle stale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

8 participants