UnicodeDecodeError: 'utf-8' codec can't decode byte xxx #483

Open
hamsterbacke opened this issue Aug 26, 2021 · 5 comments

Comments

@hamsterbacke

Hi there :)
Another UTF-8 vs. ISO-8859-15 issue.
Scriptserver runs with a UTF-8 bash environment.
It connects to host test.de as user oracle, whose bash environment is set to de_DE.iso885915.
The script outputs an umlaut and Scriptserver shows this error:

Unexpected error occurred. Contact the administrator.
>> KILLED

The server.log says:

2021-08-26 16:19:50,663 [script_server.execution_service.INFO] Calling script #1497: ssh [email protected] /home/oracle/bin/prod2test_CS2.sh
2021-08-26 16:19:50,677 [tornado.access.INFO] 200 POST /executions/start (127.0.0.1) 31.51ms
2021-08-26 16:19:50,683 [tornado.access.INFO] 101 GET /executions/io/1497 (127.0.0.1) 0.67ms
2021-08-26 16:19:50,870 [script_server.process_popen.ERROR] Failed to read script output
Traceback (most recent call last):
  File "src/execution/process_popen.py", line 71, in pipe_process_output
    data = self.process.stdout.read(1)
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 6: invalid continuation byte
2021-08-26 16:19:50,883 [web_server.INFO] lutz disconnected

0xe4 is the German "ä" in ISO-8859-15, which is printed by a program that the script executes.
The only workaround so far is to pipe the output through iconv and convert it to UTF-8, like so:
echo "This is a german umlaut ä" | iconv -f latin9 -t utf8
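For readers who want to reproduce the error outside Scriptserver, here is a minimal Python sketch. It is only an illustration: the sample string and variable names are made up, and it does not use any Scriptserver code. It shows that byte 0xe4 is rejected by a strict UTF-8 decoder but decodes cleanly as ISO-8859-15.

```python
# Hypothetical sample text; the umlaut is followed by more bytes, as in the log.
sample = "Umlaut ä here"

# Bytes as a process running in a de_DE.iso885915 environment would emit them.
raw = sample.encode("iso-8859-15")

error = None
try:
    # Strict UTF-8 decoding, which is what a UTF-8 text pipe does by default.
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Decoding with the encoding the remote side actually used works fine.
text = raw.decode("iso-8859-15")
print(text)
```

In UTF-8, 0xe4 announces a three-byte sequence, so a following plain ASCII byte makes the decoder give up unless an error handler such as `replace` is configured.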

@bugy
Owner

bugy commented Aug 29, 2021

Hi @hamsterbacke, could you try using terminal mode (requires_terminal: true)?
It works only on Linux, though.
The fix I made in #376 works only in this mode (which is the default on Linux).

@bugy bugy added the bug label Aug 29, 2021
@hamsterbacke
Author

Will try it.
It would be nicer if the scriptserver didn't kill the whole process just because of an unknown character.
When I filed bug #376, the output just got truncated but the script continued. Now the script gets killed.
I wouldn't mind seeing a strange character, or having the character left out, as long as the script continues :)

@bugy
Owner

bugy commented Aug 29, 2021

yup, I'll keep the ticket open, to fix non-terminal mode

@hamsterbacke
Author

You are absolutely correct. I ticked the "Enable pseudo-terminal" box and now it accepts the non-utf8 latin9 aka iso-8859-15 characters. Sorry that I missed it the first time when you fixed it in #376 . Many thanks!

@bugy
Owner

bugy commented Aug 31, 2021

Cool, thanks for confirming

bugy added a commit that referenced this issue Dec 4, 2022
…om exception to replacing unknown characters
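
The commit message above mentions moving from an exception to replacing unknown characters. The following is a minimal sketch of that general idea in Python, not the actual Scriptserver change: the helper name `decode_stream` and the sample chunks are hypothetical. It decodes raw output chunks incrementally and substitutes U+FFFD for undecodable bytes instead of raising.

```python
import codecs


def decode_stream(chunks, encoding="utf-8"):
    """Yield decoded text for an iterable of byte chunks (hypothetical helper).

    Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
    """
    # The incremental decoder buffers incomplete multi-byte sequences,
    # so valid characters split across chunks are still decoded correctly.
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush whatever is still buffered once the stream ends.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Latin-9 output with an umlaut, arriving in two chunks.
latin9_output = "".join(decode_stream([b"Umlaut \xe4", b" here"]))
print(latin9_output)

# Valid UTF-8 split in the middle of a character is left untouched.
utf8_output = "".join(decode_stream([b"Umlaut \xc3", b"\xa4 here"]))
print(utf8_output)
```

When the process is started through `subprocess.Popen` in text mode, passing `errors="replace"` alongside `encoding` should have a similar effect, although whether Scriptserver does it that way is not stated in this thread.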
@bugy bugy added the resolved label Dec 4, 2022
@bugy bugy added this to the 1.18.0 milestone Dec 4, 2022