"Crash" on TestInetEndPoint in CI on Darwin #10025
Comments
So the abort is because something is adding a timer that is already in the timer list. I did some test runs with the log line above replaced with Mac-specific backtrace dumping, and the stack trace (sorry, no line numbers, but I expect them to be pretty obvious for the most part) looks like this:
the only real line number ambiguity here is which of the
which annoyingly mixes ChipLog bits and
OK, so I tried running the unit tests under TSan and then everything becomes clear:
How could we have a data race under
and from
So the timer handling ends up completely unsynchronized, we have data races, and end up in a bad state sometimes, not surprising at that point. @vivien-apple @kpschoedel @sagar-apple We need to figure out what the actual contract here is going to be when
Maybe the answer is we should not be using
In the use_dispatch setup the Matter stack is expected to not spin up threads manually, but our async DNS implementation does just that via pthreads. This is leading to random failures in TestInetEndpoint on Darwin due to thread data races. Fixes #10025
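The failure mode described in this thread (a timer being added while it is already in the timer list, because two threads touch the list without synchronization) can be illustrated with a minimal sketch. Everything in it is hypothetical: `Timer`, `TimerList`, and `StartTimerFromTwoThreads` are illustration-only names, not the CHIP classes, and the mutex is a stand-in technique for serializing access. The fix referenced above takes a different route, which is to stop spinning up pthreads in the use_dispatch setup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a timer object; not the real CHIP timer class.
struct Timer
{
    int id;
};

// Hypothetical timer list that keeps the invariant "no timer is listed twice".
// All access is serialized through one mutex, so concurrent callers cannot
// both observe "not in the list" and then both insert.
class TimerList
{
public:
    // Returns false (instead of aborting) when the timer is already listed.
    bool Add(Timer * timer)
    {
        assert(timer != nullptr);
        std::lock_guard<std::mutex> lock(mMutex);
        if (std::find(mTimers.begin(), mTimers.end(), timer) != mTimers.end())
        {
            return false;
        }
        mTimers.push_back(timer);
        return true;
    }

    // Returns true when the timer was listed and has been removed.
    bool Remove(Timer * timer)
    {
        std::lock_guard<std::mutex> lock(mMutex);
        auto it = std::find(mTimers.begin(), mTimers.end(), timer);
        if (it == mTimers.end())
        {
            return false;
        }
        mTimers.erase(it);
        return true;
    }

    std::size_t Size()
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mTimers.size();
    }

private:
    std::mutex mMutex;
    std::vector<Timer *> mTimers;
};

// Starts the same timer from two threads and returns how many Add calls
// succeeded. Each thread writes only its own slot, and the slots are read
// after both joins, so the bookkeeping itself is race-free.
int StartTimerFromTwoThreads(TimerList & list, Timer & timer)
{
    int successes[2] = { 0, 0 };
    auto worker      = [&](int slot) { successes[slot] = list.Add(&timer) ? 1 : 0; };
    std::thread first(worker, 0);
    std::thread second(worker, 1);
    first.join();
    second.join();
    return successes[0] + successes[1];
}
```

With the lock in place, exactly one of the two `Add` calls succeeds regardless of scheduling. Without it, both threads could pass the membership check and insert the same timer, which is the kind of state that a die-on-duplicate check would then catch.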
Problem
Sometimes the TestInetEndPoint test "crashes" in CI on Darwin like so:
We added diagnostic log upload, but it turns out this is not actually a crash; it's an abort() call, so it looks like that does not generate a diagnostic log. But the logging we added for VerifyOrDie does catch this in https://github.com/project-chip/connectedhomeip/runs/3733865170?check_suite_focus=true:
Proposed Solution
Sort out why that VerifyOrDie is getting hit and fix it.
@mspang @kpschoedel @andy31415 @woody-apple