SIGSEGV when Python tests use the ctypes module
#442
Comments
Hello @P403n1x87, thanks for your input, but the data you've provided is not enough to discover the root cause of the issue.
says there's no crash dump and there's nothing to trace back. I advise you to add a step with `ls -l /var/crash/` to check the exact path of the core dump. Alternatively, can you please provide the exact Python code that segfaults? I tried to create a simple repro with a typical ctypes use case, but it works as expected. Ideally, you would change my minimal repro so that it produces the same SEGFAULT error. In that case we will be able to resolve the issue as soon as possible. |
@dsame here you can see a run with a backtrace https://github.com/P403n1x87/austin/runs/7045867473?check_suite_focus=true This seems to point to this line of Python test code which performs a call to To run this locally, you could clone the pytest test/cunit This single-line command should do the trick once inside the cloned python3.10 -m venv /tmp/austin-venv && source /tmp/austin-venv/bin/activate && pip install -r test/requirements.txt && pytest test/cunit Note that this requires
I would try calling |
@P403n1x87 thanks a lot for the backtrace, now I see the SEGFAULT is caused by an instruction of from It is a known problem: the toolcache Python is built against libffi.so.6, which does not exist in ubuntu-20.04. While we work on a solution, please try adding a step that installs libffi.so.6 before running the tests as a workaround
|
@dsame thanks for looking into this. I will give the workaround a try and let you know how it goes! |
Hello @P403n1x87, |
@dsame sorry I had moved to other things in the meantime. Just opened a draft PR to test the workaround, but it doesn't seem to solve the issue unfortunately 🙁 https://github.com/P403n1x87/austin/runs/7336236967?check_suite_focus=true |
Hello @P403n1x87, the error messages have now changed. Although I still see the error originating from the same point, can you please double-check the new hint? |
A Segmentation Fault still causes this. Looking at the full traceback from the first failing test you can see the attempt to |
@dsame This run has the backtrace information https://github.com/P403n1x87/austin/runs/7385733844?check_suite_focus=true It looks like this is using |
@dsame I have tried
but the result is still the same 🙁 https://github.com/P403n1x87/austin/runs/7385900441?check_suite_focus=true |
@P403n1x87 Investigating the code, I am not able to find the place where the problematic function I see this test fails - https://github.com/P403n1x87/austin/blob/ci/setup-python-workaround/test/cunit/cache.py I suppose the compiled code is executed after that, causing the segfault during the call of Also, did this test ever pass on Linux? If so, can you show the most recent successful commit? |
@dsame the tests are in the devel branch. The call to You can see that tests on the devel branch are all passing and I've never seen a SIGSEGV with deadsnakes, neither on GH Actions nor locally. The C unit tests are single-threaded, so I think a race condition is unlikely. |
@P403n1x87 The problem has been reproduced on simplified code:
|
Since Python 3.10 and Ubuntu 22.04, the same problem appears even with the native, default Python. I suppose there is a conflict between Python's memory management and Linux's memory management, because the result type of C.malloc is unknown and the result is treated as a Python object under Python's control. To fix it, the arguments should be set explicitly.
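A minimal sketch of what such explicit declarations can look like, assuming a POSIX system where the C library can be located with `ctypes.util.find_library`. The helper name `roundtrip` is hypothetical and this is not the code from the reproducer. The relevant documented behaviour is that ctypes assumes a foreign function returns a C `int` unless `restype` is set, so a 64-bit pointer returned by `malloc` can lose its upper bits.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system; find_library("c") may return None, in which
# case CDLL(None) falls back to the symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare result and argument types explicitly instead of relying on the
# default of "everything is a C int".
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def roundtrip(size: int = 64) -> bytes:
    """Allocate `size` bytes with malloc, fill them with 'A', copy them out, free."""
    ptr = libc.malloc(size)
    if not ptr:
        raise MemoryError("malloc returned NULL")
    try:
        libc.memset(ptr, ord("A"), size)
        # string_at copies `size` bytes starting at the raw address.
        return ctypes.string_at(ptr, size)
    finally:
        libc.free(ptr)


print(roundtrip(8))
```

Whether the undeclared version crashes depends on the addresses the allocator happens to return, which may be one reason the same test code behaves differently across Python builds.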
https://github.com/akv-platform/austin/actions/runs/2723152503 Alternatively you can use python installed from ppa:deadsnakes, like this
I confirmed it works so far, but I believe it is only a temporary workaround. Does this answer help to solve the issue? |
Ah this is weird. Currently I'm relying on deadsnakes for the tests on Linux. This is one of the latest runs with https://github.com/P403n1x87/austin/runs/7425790895?check_suite_focus=true So using deadsnakes as a workaround on 20.04 works for me. The only downside is that it takes slightly longer to install from the PPA repo than with the action, so it would be cool to have this working on Linux too at some point. |
In fact it is not. I have no confirmation, but it looks to me as if, when no argument or result type is set explicitly, some Python object is created that is subject to GC. As a result, at the moment the value is accessed it may either be dead, or we pass a reference to the object that holds the memory block instead of a reference to the block itself. By declaring the types:
we avoid the problem. |
Hello @P403n1x87, I am going to close the issue because workarounds have been provided and the root cause lies in the area of the Python teams, but feel free to reopen this issue or create a new one if you feel you need to. |
@dsame thanks for your investigation into this issue. What I find weird is that the original coding worked for some Python distros but not for others. Your proposed type declarations make total sense. |
@dsame FYI, I just tried declaring the types on https://github.com/P403n1x87/austin/runs/7496599298?check_suite_focus=true This is the change: |
I suppose you have to declare the types for |
Status: For example: There might be 2 reasons:
@P403n1x87 any advice about using |
The code in
This is weird. If |
it was causing CI to crash, see: actions/setup-python#442 With ubuntu-22.04 the action is not needed anymore as the default Python version is 3.10.4
Hello @P403n1x87, I had a chance to investigate the problem further, and I have now confirmed that the problem lies in correctly describing the argument and result types. I.e. the following code
works without problem (with commenting out Now the problem is how to set the quick naive attempts
do not work so far, but maybe you can suggest something better than these? |
This way I am able to fix the QueueItem constructor to accept the correct arguments
It may also make sense to set This is expected to fix the return values, but I am not able to confirm it, nor can I provide working code for destroying QueueItem
|
@dsame thanks for your further investigation and the reproducer showing that the issue, in this case, is with the argument types. The result of Based on your analysis, I suspect that using |
"A cleaner solution would be to enhance the parsing of C to reduce type definitions to basic types, but this involves quite some effort"
Yes, it is expected to end with enhancing the parser, but I wanted proof that the problem is with the passing of the arguments.
"using c_int instead of c_void_p would reproduce the issue?"
Yes, arguments and results are treated as int by default.
"But I do still wonder about what makes the setup-python build behave differently from deadsnake."
Hm, I did not think about it. Let me try to investigate this side as well. It might point to some other idea besides complicating the parser. |
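To illustrate the "treated as int by default" point, here is a small sketch that mimics what happens to a pointer-sized value when it is squeezed through `c_int`. It assumes a 64-bit platform with a 32-bit C `int`; the address is made up for illustration and the helper name `as_default_restype` is hypothetical.

```python
import ctypes


def as_default_restype(address: int) -> int:
    """Mimic the default ctypes conversion by forcing the value through c_int."""
    # ctypes integer types do no overflow checking, so the upper bits are dropped.
    return ctypes.c_int(address).value


# Hypothetical 64-bit heap address, chosen only for illustration.
address = 0x7F1234567890

truncated = as_default_restype(address)
preserved = ctypes.c_void_p(address).value  # pointer-sized, keeps every bit

print(hex(address), hex(truncated), hex(preserved))
```

An address that already fits in 31 bits would survive this conversion unchanged, so the same undeclared call can appear to work in one environment and crash in another.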
@P403n1x87 I confirm the code works with the PPA Python without problems, but I am not able to find out why. Unfortunately, further investigation is beyond the scope of supporting the action, and I have to close the issue with the conclusion "prebuilt pythons keeps the behavior of the official binaries". I advise you to ask the Python teams for support, which might be most effective given the information provided about the required ctypes annotations. Also it might be helpful to compare the |
@dsame many thanks for your investigations and support thus far. I believe the best fix, going forward, would be to enhance the parsing and explicitly assign types to args and return, to be protected against these sorts of build differences. I will try to get to that when I can find the time! |
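A rough sketch of the "explicitly assign types to args and return" idea, under loud assumptions: the table `BASIC_TYPES` and the helpers `to_ctype` and `declare` are hypothetical names invented here, not part of the austin test suite, and the C type spellings are handled naively (any pointer other than `char *` becomes `c_void_p`). It assumes a POSIX system for the libc usage example.

```python
import ctypes
import ctypes.util

# Hypothetical mapping from basic C type spellings to ctypes types.
BASIC_TYPES = {
    "void": None,
    "int": ctypes.c_int,
    "unsigned int": ctypes.c_uint,
    "long": ctypes.c_long,
    "size_t": ctypes.c_size_t,
    "double": ctypes.c_double,
    "char *": ctypes.c_char_p,
}


def to_ctype(c_type: str):
    """Map a C type spelling to a ctypes type; unknown pointers become c_void_p."""
    # Normalise spacing around '*' and drop 'const' qualifiers.
    tokens = [t for t in c_type.replace("*", " * ").split() if t != "const"]
    name = " ".join(tokens)
    if name in BASIC_TYPES:
        return BASIC_TYPES[name]
    if name.endswith("*"):
        return ctypes.c_void_p
    raise KeyError(f"no ctypes mapping for {c_type!r}")


def declare(lib, name: str, restype: str, argtypes: list):
    """Attach explicit restype/argtypes to lib.<name> and return the function."""
    func = getattr(lib, name)
    func.restype = to_ctype(restype)
    func.argtypes = [to_ctype(a) for a in argtypes]
    return func


# Usage example with a well-known libc function.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
strlen = declare(libc, "strlen", "size_t", ["const char *"])
print(strlen(b"austin"))
```

A real parser would also have to resolve typedefs down to basic types, which is the part described above as requiring quite some effort.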
Description:
When using the Python installed with this action I run into SIGSEGV on Linux if my tests use ctypes. This is the offending workflow
If Python is installed from, e.g., deadsnakes/ppa, then the tests pass without SIGSEGV. I can also get the tests to pass locally.
This is an example of a happy workflow that pulls Python from the PPA: https://github.com/P403n1x87/austin/runs/7046194671?check_suite_focus=true
This is a run with the action: https://github.com/P403n1x87/austin/runs/7045119660?check_suite_focus=true
This was discovered with this PR: https://github.com/P403n1x87/austin/pull/120/files
Action version:
v4
Platform:
Runner type:
Tools version:
Repro steps:
See description above
Expected behavior:
No SIGSEGV, like with the Pythons from the PPA.
Actual behavior:
SIGSEGV if ctypes is used