python: use surrogateescape when encoding args in wrapper.py
Problem: When passing command-line arguments (i.e. sys.argv) from
Python scripts to C API functions via the wrapper.py class, encoding
errors are possible if a UTF-8 string is supplied in argv and the
current environment does not have a UTF-8 LANG or LC_ALL environment
variable set. This seems to be because Python decodes sys.argv to `str`
with the current encoding, and (at least as of Python 3.6) there is no
way to get the actual bytes given in argv in this environment without
raising a UnicodeEncodeError.

Fortunately this has been a big enough problem that PEP 383[1] was
devised, introducing the 'surrogateescape' error handler for encode(),
which allows the so-called "smuggling" of bytes in character strings.
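
As a minimal illustration of the "smuggling" described above (not part of the commit itself), the sketch below uses an arbitrary example byte string, `raw`, that is not valid UTF-8. It shows an undecodable byte being carried through a `str` as a lone surrogate and then restored on encode:

```python
# Example input (an assumption for illustration): 0xff never occurs in valid UTF-8.
raw = b"caf\xff"

# Decoding with surrogateescape maps the undecodable byte 0xff to the
# lone surrogate U+DCFF instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="surrogateescape")

# A strict encode rejects the lone surrogate with UnicodeEncodeError...
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# ...while surrogateescape turns the surrogate back into the original byte.
round_tripped = text.encode("utf-8", errors="surrogateescape")
```

With the same error handler on both sides, the bytes survive the round trip unchanged.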

Since the alternative is to possibly raise UnicodeEncodeError,
pass `errors='surrogateescape'` in the wrapper.py __call__() method
when it encodes string arguments to pass into C. There is likely no
downside here, since there is presumably no existing usage depending
on the current behavior.
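
The effect of the change can be sketched outside the bindings. `encode_args` below is a hypothetical stand-in for the encoding step in `__call__()`, not the actual wrapper code, and it uses `str` where the wrapper uses `six.text_type`. The ASCII decode simulates, as an assumption, how argv might reach Python under a non-UTF-8 locale:

```python
def encode_args(args):
    """Hypothetical helper: encode str arguments to bytes before a C call."""
    encoded = []
    for arg in args:
        if isinstance(arg, str):
            # Lone surrogates produced when argv was decoded are turned
            # back into the raw bytes they stand for.
            arg = arg.encode("utf-8", errors="surrogateescape")
        encoded.append(arg)
    return encoded


# Simulate a UTF-8 argument decoded under an ASCII (C/POSIX) locale.
utf8_bytes = "é".encode("utf-8")
argv_like = utf8_bytes.decode("ascii", errors="surrogateescape")

# Non-str arguments pass through untouched.
result = encode_args(["job", argv_like, 42])
```

Without the error handler, the `encode()` call on `argv_like` would raise UnicodeEncodeError, which is the failure the commit message describes.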

[1] https://www.python.org/dev/peps/pep-0383/
grondo committed Jul 23, 2020
1 parent b44ff1e commit f8a3ce8
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/bindings/python/flux/wrapper.py
@@ -190,7 +190,7 @@ def __call__(self, calling_object, *args_in):
                     # Unpack wrapper objects
                     args[i] = args[i].handle
                 elif isinstance(args[i], six.text_type):
-                    args[i] = args[i].encode("utf-8")
+                    args[i] = args[i].encode("utf-8", errors="surrogateescape")

             try:
                 result = self.fun(*args)
