Encoding error while opening file in binary mode #120

Closed
e18r opened this issue Sep 24, 2016 · 1 comment · Fixed by #121
e18r commented Sep 24, 2016

I'm having trouble opening non-ASCII files in binary mode using pyfakefs v2.7.

Normally, a UnicodeDecodeError is raised when you read a file that was opened with an inappropriate encoding:

>>> file = open('file.txt', mode='w', encoding='utf-8')
>>> file.write('ñ')
1
>>> file.close()
>>> file = open('file.txt', mode='r', encoding='ascii')
>>> file.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

If you open the same file in binary mode, however, no error is raised:

>>> file = open('file.txt', mode='rb')
>>> file.read()
b'\xc3\xb1'

In fact, specifying an encoding in binary mode is an error, because bytes are read as-is and nothing is decoded:

>>> file = open('file.txt', mode='rb', encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: binary mode doesn't take an encoding argument

On the other hand, if you try the same thing with pyfakefs, opening the file in binary mode fails with a UnicodeEncodeError:

>>> from pyfakefs import fake_filesystem
>>> from unittest import mock
>>> filesystem = fake_filesystem.FakeFilesystem()
>>> fake_open = fake_filesystem.FakeFileOpen(filesystem)
>>> with mock.patch('builtins.open', fake_open):
...     file = open('file.txt', 'w')
...     file.write('ñ')
...     file.close()
... 
1
>>> with mock.patch('builtins.open', fake_open):
...     file = open('file.txt', mode='rb')
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pyfakefs/fake_filesystem.py", line 1898, in __call__
    return self.Call(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pyfakefs/fake_filesystem.py", line 2211, in Call
    closefd=closefd)
  File "/usr/local/lib/python3.5/dist-packages/pyfakefs/fake_filesystem.py", line 2002, in __init__
    contents = bytes(contents, 'ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xf1' in position 0: ordinal not in range(128)

Any ideas?
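
The last frame of the traceback can be reproduced without pyfakefs. The following is a minimal sketch, assuming the fake file keeps its contents as the `str` that was written and converts it with the `bytes(contents, 'ascii')` call shown in the traceback. The function name `to_binary_contents` is hypothetical and not part of pyfakefs.

```python
def to_binary_contents(contents, encoding='ascii'):
    """Hypothetical stand-in for the conversion on the failing traceback line."""
    return bytes(contents, encoding)


# ASCII-only text converts without trouble.
ascii_result = to_binary_contents('n')

# 'ñ' (U+00F1) has no ASCII representation, so the same call fails.
try:
    to_binary_contents('ñ')
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# With the encoding the file was written in, the result matches the bytes
# that the real filesystem returned in binary mode.
utf8_result = to_binary_contents('ñ', encoding='utf-8')
```

So the failure seems to come from encoding the stored text as ASCII, not from anything being decoded when the file is read.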

mrbean-bremen (Member) commented

This is indeed not implemented. The current implementation handles file contents as str in both Python 2 and 3, and uses some conversions and wrapping into a helper class to make this work - but only if writing and reading are done using the same mode (text or binary). Your best bet at the moment is not to mix modes in your code, if possible.
To fix this, file contents should probably always be stored as bytes in Python 3, and the encoding, if given in open(), handled explicitly in Python 3 (currently it is ignored).
@jmcgeheeiv: I could have a go at this sometime in the next week, if you don't mind.
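
The idea of always storing bytes and handling the encoding only in text mode could look roughly like the sketch below. It is an illustration built on the standard `io` module, not the pyfakefs implementation: `BytesBackedFile`, `_SavingStream`, and `open_fake` are hypothetical names, only the modes `r`, `w`, `rb`, and `wb` are considered, and defaulting to UTF-8 when no encoding is given is an assumption.

```python
import io


class BytesBackedFile:
    """Hypothetical fake file whose contents are always stored as bytes."""

    def __init__(self):
        self.contents = b''


class _SavingStream(io.BytesIO):
    """In-memory byte stream that copies its data back to the fake file on close."""

    def __init__(self, fake_file, initial):
        super().__init__(initial)
        self._fake_file = fake_file

    def close(self):
        if not self.closed:
            self._fake_file.contents = self.getvalue()
        super().close()


def open_fake(fake_file, mode='r', encoding=None):
    """Hypothetical open() replacement operating on a BytesBackedFile."""
    binary = 'b' in mode
    if binary and encoding is not None:
        # Mirror the built-in open(), which rejects this combination.
        raise ValueError("binary mode doesn't take an encoding argument")
    # Writing starts from an empty file; reading starts from the stored bytes.
    initial = b'' if 'w' in mode else fake_file.contents
    stream = _SavingStream(fake_file, initial)
    if binary:
        return stream
    # Text mode: encode or decode explicitly at the boundary.
    return io.TextIOWrapper(stream, encoding=encoding or 'utf-8')


fake = BytesBackedFile()
with open_fake(fake, 'w', encoding='utf-8') as f:
    f.write('ñ')

# Mixing modes now works, because the stored contents are already bytes.
with open_fake(fake, 'rb') as f:
    raw = f.read()

with open_fake(fake, 'r', encoding='utf-8') as f:
    text = f.read()
```

Because the stored form is bytes, a binary read needs no conversion at all, and a text read with the wrong encoding raises UnicodeDecodeError just as it does on a real file.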
