Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flaky usdt test #2402

Open
xh4n3 opened this issue Oct 28, 2022 · 4 comments
Open

flaky usdt test #2402

xh4n3 opened this issue Oct 28, 2022 · 4 comments
Labels
bug Something isn't working tests Issues with our tests or test framework; missing tests; invalid tests

Comments

@xh4n3
Copy link
Contributor

xh4n3 commented Oct 28, 2022

What reproduces the bug?

Trigger CI on the master branch.
https://github.com/iovisor/bpftrace/actions/runs/3343605763/jobs/5536998903
https://github.com/iovisor/bpftrace/actions/runs/3341593002/jobs/5532960039

2: [ RUN      ] usdt."usdt probes - file based semaphore activation multi process"
2: [  TIMEOUT ] usdt."usdt probes - file based semaphore activation multi process"
2: 	Command: ../src//bpftrace runtime/scripts/usdt_file_activation_multiprocess.bt --usdt-file-activation
2: 	Timeout: 5
2: 	Current output: Attaching 3 probes...
2: __BPFTRACE_NOTIFY_PROBES_ATTACHED
2: found 1 processes
@xh4n3 xh4n3 added the bug Something isn't working label Oct 28, 2022
@BurntBrunch
Copy link

I'm fighting with it in 4eb280b but I can't make it stable yet. Now the pidof invocation to get the child pid after it has started is failing. I may just make that logic only run if the test needs BEFORE_PID.

@viktormalik
Copy link
Contributor

I think that this should be resolved by #2370

@xh4n3 xh4n3 closed this as completed Nov 3, 2022
@BurntBrunch
Copy link

Unfortunately, I don't think it was fully fixed, just made a bit better. I still don't fully understand how this happens (and why it happens on that test specifically) but here's a run after that PR that timed out again - https://github.com/BurntBrunch/bpftrace/actions/runs/3381817256/jobs/5616467787

@BurntBrunch BurntBrunch reopened this Nov 3, 2022
BurntBrunch pushed a commit to BurntBrunch/bpftrace that referenced this issue Nov 3, 2022
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
BurntBrunch pushed a commit to BurntBrunch/bpftrace that referenced this issue Nov 3, 2022
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
BurntBrunch pushed a commit to BurntBrunch/bpftrace that referenced this issue Nov 3, 2022
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
BurntBrunch pushed a commit to BurntBrunch/bpftrace that referenced this issue Nov 4, 2022
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
@BurntBrunch
Copy link

Okay, I give up. #2414 doesn't help either, so this is not about the test runner racing the BEFORE calls. Maybe it's something to do with --usdt-file-activation and whatever kernel the GH runners run?

viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to viktormalik/bpftrace that referenced this issue Apr 20, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
     That requires to use `ps` from the `procps` package on Alpine as
     the default BusyBox one doesn't have the `--ppid` option.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
viktormalik pushed a commit to BurntBrunch/bpftrace that referenced this issue May 24, 2023
As noted in bpftrace#2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
     That requires to use `ps` from the `procps` package on Alpine as
     the default BusyBox one doesn't have the `--ppid` option.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
danobi pushed a commit that referenced this issue May 24, 2023
As noted in #2402, usdt flakiness was made better by
508538a but not fully fixed.

This commit is what I should have done all along: it allows the test
runner to parse and wait for multiple BEFORE clauses and thus ensures
the processes have started before the test runs.

There are two minor changes:

  1. The check for child processes is now `ps --ppid` based to
     eventually allow parallel process runs in the same environment.
     That requires to use `ps` from the `procps` package on Alpine as
     the default BusyBox one doesn't have the `--ppid` option.
  2. Because of the `ps` usage, the name check is now truncated to 15
     chars, which will fail if TASK_COMM_LEN is not 16. That looks like
     a constant in the kernel, so I think we're good.
@jordalgo jordalgo added the tests Issues with our tests or test framework; missing tests; invalid tests label Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working tests Issues with our tests or test framework; missing tests; invalid tests
Projects
None yet
Development

No branches or pull requests

4 participants