UnicodeDecodeError "invalid continuation byte" when Tribler GUI reads Core stdout on Windows #6916

Closed
kozlovsky opened this issue May 28, 2022 · 2 comments

@kozlovsky
Contributor

Since #6567, Tribler Core stdout is assumed to be UTF-8 encoded. On Windows machines, the actual encoding of stdout is not UTF-8. With the introduction of auto-generated tags, tag names are logged to the Core stdout and sometimes contain non-ASCII symbols. As a result, the Tribler GUI process crashes with the following error when trying to read Core output:

ErrorHandler.gui_error(): tribler.gui.utilities.CreationTraceback: 
  File "run_tribler.py", line 97, in <module>
  File "tribler\gui\start_gui.py", line 77, in run_gui
  File "tribler\gui\utilities.py", line 410, in trackback_wrapper
  File "tribler\gui\core_manager.py", line 95, in start_tribler_core


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tribler\gui\utilities.py", line 413, in trackback_wrapper
  File "tribler\gui\utilities.py", line 410, in trackback_wrapper
  File "tribler\gui\core_manager.py", line 110, in on_core_stdout_read_ready
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 81: invalid continuation byte

This bug affects all Windows machines and is guaranteed to crash Tribler on Windows after a long enough period.
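
The failure can be reproduced in isolation. Below is a minimal sketch, assuming the Core output was encoded with a single-byte Windows code page such as cp1252 (the actual encoding is not stated in this issue) and using a made-up tag name. The helper `utf8_decode_error` is hypothetical and exists only for this illustration:

```python
# Hypothetical tag name; cp1252 is an ASSUMED example of a non-UTF-8
# Windows code page, not the encoding confirmed in this issue.
tag = "niño"
raw = tag.encode("cp1252")  # 'ñ' is stored as the single byte 0xf1


def utf8_decode_error(data: bytes):
    """Return the UnicodeDecodeError raised by UTF-8 decoding, or None on success."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


error = utf8_decode_error(raw)
print(error)
```

In UTF-8, 0xf1 starts a multi-byte sequence, so the ASCII byte that follows it is rejected as an invalid continuation byte.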

It may be hard to detect the correct Core stdout encoding on all systems. I think there is a simpler solution: try to decode the Core stdout as UTF-8 and, if the output is not valid UTF-8, decode it as ASCII with all non-ASCII bytes escaped. With this approach, all Core output is preserved. Some artificial non-UTF-8 byte sequences may be mistakenly decoded as UTF-8, but this is not critical, and it should not happen for normal output.
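
The proposed fallback could look roughly like this. It is only a sketch of the idea, not the code from the actual fix; the function name `decode_core_output` is hypothetical, and it assumes the GUI receives the Core output as a `bytes` object:

```python
def decode_core_output(raw: bytes) -> str:
    """Decode Core stdout without ever raising UnicodeDecodeError.

    Hypothetical helper: first try UTF-8; if that fails, decode as ASCII
    and escape every non-ASCII byte as \\xNN so no output is lost.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # 'backslashreplace' is a standard error handler that also works
        # for decoding (Python 3.5+).
        return raw.decode("ascii", errors="backslashreplace")


# Valid UTF-8 is returned unchanged; a non-UTF-8 byte is escaped.
print(decode_core_output("niño".encode("utf-8")))
print(decode_core_output(b"ni\xf1o"))
```

The escaped form is less readable than the original text, but the log line survives and the GUI process does not crash.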

@sentry-for-tribler

Sentry issue: TRIBLER-TESTS-22

@kozlovsky
Contributor Author

Fixed in #6917
