Kodi crashing #90619

Closed
mrpg mannequin opened this issue Jan 21, 2022 · 28 comments
Labels
3.9 only security fixes type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@mrpg
Mannequin

mrpg mannequin commented Jan 21, 2022

BPO 46461
Nosy @ericvsmith, @ajoino, @mrpg

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-01-21.17:40:00.731>
labels = ['3.9', 'type-crash']
title = 'Kodi crashing'
updated_at = <Date 2022-03-06.10:22:14.446>
user = 'https://github.com/mrpg'

bugs.python.org fields:

activity = <Date 2022-03-06.10:22:14.446>
actor = 'ajoino'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2022-01-21.17:40:00.731>
creator = 'mrpg'
dependencies = []
files = []
hgrepos = []
issue_num = 46461
keywords = []
message_count = 7.0
messages = ['411164', '411169', '411170', '411485', '414610', '414611', '414612']
nosy_count = 3.0
nosy_names = ['eric.smith', 'ajoino', 'mrpg']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue46461'
versions = ['Python 3.9']

@mrpg
Mannequin Author

mrpg mannequin commented Jan 21, 2022

Dear Sir/Madam,

I was advised to submit an issue here, as the Kodi team suggests it's a problem with Python.

I get this error in my syslog:

Jan 21 18:19:09 mediapc kernel: [ 14.478095] LanguageInvoker[1228]: segfault at 0 ip 00007fe50f704f45 sp 00007fe48d7f84e8 error 4 in libc-2.33.so[7fe50f5ab000+16b000]

Link to thread on Kodi forum :

https://forum.kodi.tv/showthread.php?tid=363499

Best Regards
Patric
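
For readers decoding the kernel line quoted above, here is a minimal sketch. It assumes the usual x86-64 Linux layout `segfault at ADDR ip IP sp SP error CODE in OBJECT[BASE+SIZE]` with hexadecimal fields; the helper name `decode_segfault` and the dictionary keys are made up for illustration. It shows that the faulting address is 0 (a read through a null pointer, from user mode) and computes where inside libc the faulting instruction sits.

```python
import re

# The kernel log line quoted in the report.
LINE = ("LanguageInvoker[1228]: segfault at 0 ip 00007fe50f704f45 "
        "sp 00007fe48d7f84e8 error 4 in libc-2.33.so[7fe50f5ab000+16b000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction inside the mapped object.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),   # 0 means the page was not mapped
        "write": bool(err & 2),          # 0 means the access was a read
        "user_mode": bool(err & 4),      # 1 means the fault came from user space
    }


info = decode_segfault(LINE)
print(info["object"], hex(info["offset"]), info["fault_addr"])
```

The offset (`ip` minus the mapping base) is what gdb or `addr2line` would need, together with matching debug symbols, to name the libc function involved.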

@mrpg mrpg mannequin added 3.9 only security fixes type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 21, 2022
@mrpg
Mannequin Author

mrpg mannequin commented Jan 21, 2022

EDIT: I just upgraded to Ubuntu 21.10 and now use Python 3.10.0; same issue.

@ericvsmith
Member

Without a way to reproduce this, we won't be able to help you.

Ideally you would provide a python script, which we could run, which shows the problem. It would be best if there were no third party packages involved, but if there are, you should provide instructions on how to set up a virtual environment that includes whatever software you need.

@mrpg
Mannequin Author

mrpg mannequin commented Jan 24, 2022

Hi,
It's reproducible, but not very quickly or easily:

1. Install Ubuntu 21.04 or 21.10.

2. Install Kodi 19 from the PPA: https://launchpad.net/~team-xbmc/+archive/ubuntu/ppa

3. Install Aeon Nox Silvo, and customize the skin and Kodi settings.

4. Install the plugins youtube, svtplay, retrospect, formula1, DR TV, iplayer www, netflix, Discovery plus, and pvr simple client.

@mrpg
Mannequin Author

mrpg mannequin commented Mar 6, 2022

Any update on this?
Can I do something more to help?

BR
Patric

@ericvsmith
Member

I don’t have Ubuntu to test on. Plus the steps to reproduce are too much for the average volunteer to work through. I don’t think we’ll be able to help.

@ajoino
Mannequin

ajoino mannequin commented Mar 6, 2022

From the Kodi GH issues, they suspect it is related to the work on subinterpreters; see xbmc/xbmc#19961 (comment):

"The bulk of this issue is due to how python and it's modules handle sub interpreters.
There are several open python bpos and have been many prs to cpython to handle sub interpreters better, but it's still very much a work in progress.

All C modules are/were being updated to what they call multi phase init, which essentially moves away from static init of members. Sub interpreters can modify these static members which can cause other sub interpreter states to crash horribly.

For the user with the SSL crash running python 3.9, if you go to 3.10, the SSL module was converted to multiphase init, and shouldn't crash in the same manner for that particular module. However not all modules are converted still, so there are still failure points on other modules.

This is a cpython issue, and there's not much we can do but wait for it to be resolved in cpython"
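
The failure mode described in the quote can be illustrated with a pure-Python analogy. This is a toy model under stated assumptions, not CPython or Kodi code: the class names are hypothetical, each instance stands in for "the module as imported by one sub-interpreter", and a class-level dict stands in for a C `static`. In real C extensions the result is a crash rather than a catchable exception.

```python
class StaticStateModule:
    """Toy extension whose state lives in one process-wide 'static'."""
    _cache = {}  # shared by every simulated interpreter, like a C static

    def __init__(self):
        StaticStateModule._cache.setdefault("type", object())

    def use(self):
        return StaticStateModule._cache["type"]

    def teardown(self):
        # One interpreter shutting down wipes state the others still need.
        StaticStateModule._cache.clear()


class PerInterpreterModule:
    """Toy extension whose state is owned by each simulated interpreter."""

    def __init__(self):
        self._state = {"type": object()}

    def use(self):
        return self._state["type"]

    def teardown(self):
        self._state.clear()


def run(module_cls):
    a = module_cls()  # "imported" in sub-interpreter A
    b = module_cls()  # "imported" in sub-interpreter B
    a.teardown()      # A finishes, e.g. one add-on exits
    try:
        b.use()       # B keeps running
        return "ok"
    except KeyError:
        return "broken"


print(run(StaticStateModule))     # broken
print(run(PerInterpreterModule))  # ok
```

Moving state out of statics and into per-module storage, as the quote describes for multi-phase init, corresponds to the second class in this sketch.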

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@comrade-meowski

> I don’t have Ubuntu to test on. Plus the steps to reproduce are too much for the average volunteer to work through. I don’t think we’ll be able to help.

I'm keen to chip in on this - I have tens of Kodi systems affected by this and they are all effectively unusable at this point.

We'll have to meet halfway on the reproducer, I'm afraid: the issue is complex enough that a simple Python script is not going to be enough to surface it, inconvenient as that may be for all of us. The good news is that the setup above is by no means necessary to trigger the faults, so if any dev who is going to work on this can let me know the environment in which they'd prefer to test, I will work on a much more reasonable reproducer.

In essence you need any Linux host (Windows is not affected in my limited testing on that platform) + Kodi 19.x + an addon. My specific use case is Linux (multiple architectures and distributions) + Kodi 19.x + the youtube and jellyfin addons. However, both the youtube and jellyfin addons require a not-inconsiderable amount of pre-use configuration, which I agree makes them not ideal candidates for triaging this. The "good" news here is that I'm pretty sure almost any addon, or perhaps a combination of any two addons, is sufficient to trigger the fault - I have spotted one candidate addon that aggressively autorefreshes its state and seems capable of causing the fault on an otherwise vanilla Kodi install by itself, with no user interaction required.

Please supply a preferred test environment for your purposes and I will provide as simple a reproducer as I can find.

@JelleZijlstra
Member

Is it possible to get a gdb traceback for the segfault? That should make it a lot easier to find the cause. On a quick look I didn't see any tracebacks here or in the linked Kodi forums thread.

@comrade-meowski

That's a good point - I'll get started on it. Arch Linux x64 ok?

@comrade-meowski

Here's a trace.log generated on a known affected Arch x64 VM - please let me know if I have not generated it correctly for you, I don't normally venture this deep into debug territory.

From the user perspective the crash was triggered by starting Kodi + Jellyfin plugin and playing one video for 10 seconds, stopping it and then attempting to play any other video. On systems with this bug that's a 100% effective reproducer.

trace.log

@comrade-meowski

Ok guys, here's your reproducer - it turned out to be embarrassingly trivial to trigger. This should work fine on any modern Linux distribution running Python 3.9 - 3.10 and I tested it specifically on Ubuntu 22.04 and Arch x64 systems, both fully updated today 22-04-14. Brand new 'clean' test user accounts in both cases to avoid confounding variables.

1: Install Kodi via your distribution's package manager (you'll get version 19.4) - don't bother with any configuration
2: Enter the Add Ons section of the Kodi interface
3: Install the iplayer addon - don't bother with any configuration
4: Install the youtube addon - don't bother with any configuration
5: Open either addon - immediately back out, don't even bother playing a video or using it in any way
6: Switch to the other addon - immediately back out, don't even bother playing a video or using it in any way
7: Repeat a couple of times

That's it. Kodi will crash triggering the fault under discussion here and drop a kodi_crashlog_timestamp.log in the home directory. It's cross platform and 100% reproducible.

I'm more than happy to help in any way I can with testing or providing tracebacks, etc. I've got an entire fleet of useless systems and a lot of unhappy users to fend off so I'm highly motivated!

Many thanks.

@comrade-meowski

Anything further I can do to get some kind of progress on this? There doesn't seem to be anywhere else this is being discussed or worked on that I can find although I'd appreciate any information that I've missed.

Is there a specific person or group of people I could talk to or work with - I'm quite willing to put in as much effort as is required short of actually just learning python and C to a level where I could fix someone else's complex code.

Thanks.

@JelleZijlstra
Member

I can help a little but won't be able to spend much time. I hadn't heard of Kodi before, and it sounds like reproducing the bug requires a Linux desktop installation, which I don't have.

The gdb trace in your previous post unfortunately isn't that helpful; the function name is ??. If you can reproduce the problem in a debug build, it may have more information. I think some distros provide a python3-dbg package that includes debug binaries.
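
Alongside a gdb backtrace from a debug build, the standard library's `faulthandler` module can print the Python-level traceback when the process receives a fatal signal. A minimal sketch follows; it deliberately crashes a child interpreter so that the parent survives. `-X faulthandler` and the `PYTHONFAULTHANDLER` environment variable are real CPython options, but whether Kodi's embedded interpreter honours them is an assumption not established in this thread.

```python
import subprocess
import sys

# Code run in the child: read from address 0, which segfaults on purpose.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def crash_with_faulthandler():
    """Run a crashing child interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", CHILD_CODE],
        capture_output=True,
        text=True,
    )


result = crash_with_faulthandler()
print("return code:", result.returncode)
print(result.stderr)  # faulthandler writes its report to stderr
```

The report names the Python frames that were active at the time of the crash, which complements the C-level frames that gdb shows.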

@comrade-meowski

Thanks for your reply, but this could easily end up requiring a dev with Linux experience, just due to the nature of the problem. It has not been possible to reproduce this on Windows or macOS in my testing. On the other hand, I'm not a Python developer, so what do I know, and as you're the only person who's offered to help I'll very happily take it.

The previous trace log was generated using the (pretty new) Arch Linux debuginfod* service which once configured automatically pulled in debug variants of all involved packages so I'm unsure why it still hasn't supplied all the information required. Unfortunately I am myself hardly an expert in providing or analysing tracelogs but can dig in and learn what I need to know to do my end of things.

It would seem sensible to at least settle on a specific Linux distribution for debugging purposes so do you or anyone else at the dev end have a preference or requirement? I admin pretty much all of them so don't mind but would recommend Ubuntu or Arch for simplicity's sake.

The first task would seem to be generating a trace log that provides some more useful information I guess?

@TheGorbag

I seem to be having the same problem on Gentoo after a recent upgrade to Python 3.9.11. Downgrading to 3.9.9 fixes the issue. I have built Python 3.9.11 debug and generated a backtrace that is attached. I was unable to build kodi debug, so that portion of the backtrace is less useful. The problem appears to be a NULL pointer access at line 76 of Include/internal/pycore_object.h

(gdb) l 76
71 filename, lineno, "_PyObject_GC_UNTRACK");
72
73 PyGC_Head *gc = _Py_AS_GC(op);
74 PyGC_Head *prev = _PyGCHead_PREV(gc);
75 PyGC_Head *next = _PyGCHead_NEXT(gc);
76 _PyGCHead_SET_NEXT(prev, next);
77 _PyGCHead_SET_PREV(next, prev);
78 gc->_gc_next = 0;
79 gc->_gc_prev &= _PyGC_PREV_MASK_FINALIZED;
80 }
(gdb) p prev
$5 = (PyGC_Head *) 0x0
(gdb) p next
$6 = (PyGC_Head *) 0x0

kodi-python-bt.txt
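
To see why `prev` and `next` both being NULL is fatal at line 76 of the listing above, here is a toy Python model of unlinking a node from a circular doubly linked list. It is a sketch, not CPython's implementation: `Node` and `TrackedList` are hypothetical names, and untracking the same object twice is only one way to reach the all-NULL state that gdb printed; the thread does not confirm that this is the exact path taken in Kodi.

```python
class Node:
    """Toy stand-in for PyGC_Head: just the two link fields."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


class TrackedList:
    """Circular doubly linked list with a sentinel head."""

    def __init__(self):
        self.head = Node("head")
        self.head.prev = self.head.next = self.head

    def track(self, node):
        last = self.head.prev
        last.next = node
        node.prev = last
        node.next = self.head
        self.head.prev = node

    def untrack(self, node):
        prev, nxt = node.prev, node.next
        prev.next = nxt               # like line 76: fails when prev is None
        nxt.prev = prev               # like line 77
        node.prev = node.next = None  # like lines 78-79: links are cleared

    def names(self):
        out, cur = [], self.head.next
        while cur is not self.head:
            out.append(cur.name)
            cur = cur.next
        return out


gc_list = TrackedList()
a, b = Node("a"), Node("b")
gc_list.track(a)
gc_list.track(b)
gc_list.untrack(a)      # fine: only "b" remains linked

try:
    gc_list.untrack(a)  # links are already None, so prev.next blows up
    second_untrack = "ok"
except AttributeError:
    second_untrack = "null dereference"

print(gc_list.names(), second_untrack)
```

In Python the second call raises `AttributeError`; in C the equivalent store through a NULL `prev` is the kind of access that produces a segfault at address 0.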

@JelleZijlstra
Member

Thanks, that's very helpful! I can look at the code in a bit to see if I can find anything obvious. I'll also be at PyCon this coming week if anyone wants to debug it in person.

There's a chance this is a memory corruption bug caused by some other code in the process, not by CPython itself, but we'll have to debug more to figure that out.

@sweeneyde
Member

Is it possible that this is the same issue as #91636? Is it possible it was fixed by #91651?

@VioletRed

VioletRed commented Apr 28, 2022

I have the same issue, and I can confirm that it appeared in Debian between Python 3.9.2 and 3.9.12. The same Kodi binary will crash with 3.9.12 but work with 3.9.2.

It looks like a duplicate of, or at least related to, #90228 https://bugs.python.org/issue46070. There is no call to _PyObject_GC_UNTRACK_impl in that report's log, but both seem to fail while importing a module.

@JelleZijlstra
Member

@vstinner also suggested #90228 may be related.

@vstinner
Member

It may be related to #92036

@comrade-meowski

> It may be related to #92036

@vstinner: Rebuilding the distribution-provided python3.10 with your patch from #92037 actually resolves this issue as originally reported. I've only tested on Ubuntu x64 22.04/22.10 so far but Kodi + addons have stopped crashing.

I'll rebuild python for Arch x64/aarch64 now and report back but this looks promising so far. Thanks!

@comrade-meowski

I can confirm that rebuilding the distro-provided python 3.10.4 with just @vstinner's patch from #92037 fixes the dreaded Kodi + addons = crash issues on Ubuntu + Arch Linux x64/aarch64.

@vstinner - I can't thank you enough, this has been driving me mad for months at this point. Will the various fixes involved appear soon in a 3.10.4 point release or are they likely to be held back to the 3.10.5 update in a couple of months?

@JelleZijlstra - thank you also, and everyone else who contributed here.

@ericvsmith
Member

3.10.4 was already released, so it looks like this will be in 3.10.5. See PEP-619 for the 3.10 schedule. 3.10.5 is scheduled for 2022-06-06.

@vstinner
Member

vstinner commented May 5, 2022

> I can confirm that rebuilding the distro-provided python 3.10.4 with just @vstinner's patch from #92037 fixes the dreaded Kodi + addons = crash issues on Ubuntu + Arch Linux x64/aarch64.

Subinterpreters have existed since Python 1.5, if I recall correctly. In recent years, a lot of collaborative work has gone into the long-term "Per-Interpreter GIL" plan, which got its own PEP: https://peps.python.org/pep-0684/

The work also involves modernizing C extensions: converting static types to heap types, replacing global variables with module state, and using the multi-phase initialization API (PEP 489). Thanks to all this work, Python no longer leaks memory at exit: https://mail.python.org/archives/list/[email protected]/thread/E4C6TDNVDPDNNP73HTGHN5W42LGAE22F/

See also my article about "isolating subinterpreters": https://vstinner.github.io/isolate-subinterpreters.html

Well, sadly, implementing subinterpreters correctly is really hard. Previously, everything was shared. Now only "some" things are per interpreter, whereas others are still shared. This leads to non-trivial regressions. Sometimes the bug was already there before, but it was really hard or just not possible to trigger it.

@vstinner
Member

vstinner commented May 5, 2022

According to @comrade-meowski, this bug is a duplicate of #92036, which has been fixed. I'm closing the issue.

@vstinner vstinner closed this as completed May 5, 2022
@vstinner
Member

The newly released Python 3.9.13 contains the fix.
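
For anyone checking an affected machine, a minimal sketch of a version test based only on what this thread reports. The helper name `has_reported_fix` and the `FIXED_IN` table are hypothetical; 3.9.13 is stated here to contain the fix, while 3.10.5 is the release the fix was expected to land in, so treat that entry as an assumption to verify against the release notes.

```python
import sys

# Minimum patch release per minor series, as reported in this thread.
# (3, 9): 13 is stated to contain the fix.
# (3, 10): 5 is where the fix was *expected* to land (assumption).
FIXED_IN = {(3, 9): 13, (3, 10): 5}


def has_reported_fix(version_info=sys.version_info):
    """Return True/False for series discussed in the thread, else None."""
    major, minor, micro = version_info[:3]
    needed = FIXED_IN.get((major, minor))
    if needed is None:
        return None  # series not covered by the thread
    return micro >= needed


print(sys.version.split()[0], has_reported_fix())
```

Note that the interpreter that matters is the one Kodi links against, which may differ from the `python3` found first on the shell's `PATH`.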

@vstinner
Member

Fixed by issue #92036 with commit 1424336.
