usdt: Make namespace aware #1231
Nice and simple. [buildbot, ok to test] |
Ooops! This fix is not quite correct. It fixes the immediate issue, but accessing the probes through The proper fix implies making the USDT code aware of |
Cleanup the `strncmp` code and add a few more ignored map names
Wouldn't you want to access the associated FD rather than the path? |
So, I've spent the afternoon messing around with this... I've added the proper And by working I mean that they execute without problems, but no events are being reported. I've verified this manually with the simplest examples (i.e. by printing to
That's all I have for today. @drzaeus77 @brendangregg if you have any further insights or leads I could look into, all help is appreciated. Thank you! |
Well, the fix wasn't quite right to begin with, but I don't understand what you mean by associated FD? Most of the code internally works by reading paths (particularly the Elf code), so either we have a Does that make sense? |
Alright, so I've managed to manually verify that processes inside Docker containers cannot be traced using the
Reproduction steps: A binary
#include <unistd.h>
#include <stdio.h>
#include <folly/tracing/StaticTracepoint.h>
int main() {
    char s[100];
    int i, a = 20, b = 40;
    for (i = 0; i < 100; i++) s[i] = (i & 7) + (i & 6);
    fprintf(stderr, "Running: %d\n", (int)getpid());
    while (1) {
        FOLLY_SDT(test, probe_point_1, s[7], b);
        FOLLY_SDT(test, probe_point_3, a, b);
        sleep(3);
        a++; b++;
        FOLLY_SDT(test, probe_point_1, s[4], a);
        FOLLY_SDT(test, probe_point_2, 5, s[10]);
        FOLLY_SDT(test, probe_point_3, s[4], s[7]);
    }
    return 1;
}
It has several tracepoints, all statically defined, no semaphores:
We now run this binary:
And we attempt to manually trace it using the interfaces at
Note we're using We now attempt to run the
We run the image as so:
Note that the process spawns as PID 1, but we can find it running on the host OS:
We now attempt to perform the same tracing steps as earlier:
Note again we're using Am I going fucking bananas here? This is very puzzling to me. Has anybody been able to reproduce the issue locally? Is there an issue in my local kernel or Docker version? |
Looking at the kernel sources for the trace procfs, I see nothing funky: http://elixir.free-electrons.com/linux/v4.10/source/kernel/trace/trace_uprobe.c#L442
filename = argv[1];
ret = kern_path(filename, LOOKUP_FOLLOW, &path);
if (ret)
    goto fail_address_parse;
inode = igrab(d_inode(path.dentry));
path_put(&path);
if (!inode || !S_ISREG(inode->i_mode)) {
    ret = -EINVAL;
    goto fail_address_parse;
}
If the path to the binary was successfully resolved (which it should have been, as the uprobes are showing up in the procfs and can be activated), it keeps a pointer to the |
I'm playing around with your code, and seeing a different set of problems. In particular, I get an error out of kernel/events/uprobes.c:
What file system are you using? My default docker install chose overlayfs which doesn't have the readpage() handler, hence the error when trying to do perf_event_open. |
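To rule this case out before attaching, one could check which file system backs the target binary. The following is a minimal sketch, not part of the PR: the function names are hypothetical, the overlayfs magic is assumed to match `OVERLAYFS_SUPER_MAGIC` from `<linux/magic.h>`, and the aufs value is assumed from the out-of-tree aufs sources.

```c
#include <sys/vfs.h>   /* statfs(2) */

/* Assumed magic numbers (see the note above the block). */
#define SKETCH_OVERLAYFS_MAGIC 0x794c7630L
#define SKETCH_AUFS_MAGIC      0x61756673L

/* Returns 1 if the file system magic belongs to overlayfs or aufs. */
int magic_is_overlay(long f_type) {
    return f_type == SKETCH_OVERLAYFS_MAGIC || f_type == SKETCH_AUFS_MAGIC;
}

/* Returns 1 if `path` lives on an overlay file system, 0 if it does not,
 * and -1 if statfs(2) fails (for example, the path does not exist). */
int path_is_on_overlay(const char *path) {
    struct statfs sfs;
    if (statfs(path, &sfs) != 0)
        return -1;
    return magic_is_overlay((long)sfs.f_type);
}
```

The check only reflects the caller's own view of the file system, so for a containerized target the path would need to be one the caller can actually resolve.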
I was able to get a working example where I for instance did the following:
Attaching a bcc USDT object to the resulting pid gave coherent results. This showed to me that the trace attach is able to work across mount namespaces, since there is no /usr/bin/usdt in the root mount ns. However, using the |
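Whether a target pid actually lives in a different mount namespace can be checked by comparing the `/proc/<pid>/ns/mnt` links. This is a small sketch with hypothetical helper names, not code from bcc; reading another process's link usually needs sufficient privileges.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads the mount-namespace identifier of `pid` (a string of the form
 * "mnt:[<inode>]") into `buf`. Returns 0 on success, -1 on error. */
int mount_ns_id(pid_t pid, char *buf, size_t len) {
    char link[64];
    ssize_t n;
    if (len == 0)
        return -1;
    snprintf(link, sizeof(link), "/proc/%d/ns/mnt", (int)pid);
    n = readlink(link, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';  /* readlink(2) does not NUL-terminate */
    return 0;
}

/* Returns 1 if `pid` is in a different mount namespace than the caller,
 * 0 if it shares the caller's, -1 if either identifier cannot be read. */
int in_other_mount_ns(pid_t pid) {
    char self[64], target[64];
    if (mount_ns_id(getpid(), self, sizeof(self)) != 0 ||
        mount_ns_id(pid, target, sizeof(target)) != 0)
        return -1;
    return strcmp(self, target) != 0;
}
```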
Ooh, interesting. Thanks for looking into this, @drzaeus77. Yes, I can also verify that mounting with the There's still a pretty big issue with files that are added using
|
Yey, good news! It seems like the tracing issues are specific to the I think this is good to review and merge: the PR fixes the issues the USDT code had when loading probes in mounted namespaces. I'm going to update the title and body of the PR to accurately describe the changes. |
When trying to attach probes to a namespaced process (i.e. one inside a container), we directly access the `/proc/$pid/maps` file in our local procfs. The paths to the mapped files inside the procfs, however, are relative to the chroot of the process instead of the global FS root.
To work around this, this PR makes the USDT code in BCC aware of namespaces by using the `ProcMountNS` helpers that were previously introduced to the library.
We should now be able to attach and trace probes to processes namespaced inside a container.
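The path mismatch can be illustrated without entering the namespace at all. The sketch below does not use `ProcMountNS`; it swaps in a different technique, prefixing the mapped path with `/proc/<pid>/root`, which procfs exposes as the target process's own view of the file system. Both function names are hypothetical, and the parser ignores details such as the " (deleted)" suffix on unlinked files.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Copies the pathname field of one /proc/<pid>/maps line into `out`.
 * The fields before the pathname never contain '/', so the first '/'
 * starts the path. Returns 0 on success, -1 for anonymous or special
 * mappings such as [stack], or if `out` is too small. */
int maps_line_path(const char *line, char *out, size_t len) {
    const char *p = strchr(line, '/');
    size_t n;
    if (p == NULL)
        return -1;
    n = strcspn(p, "\n");        /* drop the trailing newline */
    if (n + 1 > len)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* Builds a host-visible path for `path` as seen by process `pid`.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int path_via_proc_root(pid_t pid, const char *path, char *out, size_t len) {
    int n = snprintf(out, len, "/proc/%d/root%s", (int)pid, path);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a mapping of `/usr/bin/usdt` in pid 1234, the second helper yields `/proc/1234/root/usr/bin/usdt`, which the host can open even though `/usr/bin/usdt` does not exist in its own root.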
Note that tracing using `uprobes` is only supported in Docker deployments when using the `devicemapper` (and most likely `btrfs`) storage drivers. The overlay FSs (`aufs`, `overlayfs`, `overlayfs2`) do not seem to be properly writing the userspace breakpoints on the binaries, so the uprobes cannot trigger.