CLN: memory-mapping code #44766
df = tm.makeDataFrame()
df.to_csv(path, mode="w+b")
tm.assert_frame_equal(df, pd.read_csv(path, index_col=0))
def test_binary_mode():
This test and `test_warning_missing_utf_bom` had nothing to do with mmap but were in its test class.
# add one entry with a special character
encoding_ = encoding or "utf-8"
leonardo = "Léonardo".encode(encoding_, errors="ignore").decode(encoding_)
This stricter version of the test would have failed on master with the python engine.
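For illustration only, here is a minimal sketch of what the encode/decode round trip in the snippet above does to the test string. The helper name `roundtrip` is hypothetical and not part of the PR; it only mirrors the `encode(..., errors="ignore").decode(...)` expression.

```python
def roundtrip(text, encoding):
    # Hypothetical helper: characters the target encoding cannot
    # represent are silently dropped by errors="ignore".
    return text.encode(encoding, errors="ignore").decode(encoding)


name_utf8 = roundtrip("Léonardo", "utf-8")
name_latin1 = roundtrip("Léonardo", "iso8859_1")
name_ascii = roundtrip("Léonardo", "ascii")

print(name_utf8, name_latin1, name_ascii)
```

With encodings that can represent "é" the string survives unchanged; with ASCII the character is dropped, so the expected value stays representable in whichever encoding the test is parametrized with.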
@@ -699,15 +699,15 @@ def test_encoding_mmap(memory_map):
GH 23254.
"""
encoding = "iso8859_1"
data = BytesIO(" 1 A Ä 2\n".encode(encoding))
This test wasn't actually using memory_map because the memory mapping silently failed.
Hello @twoertwein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-12-05 23:25:04 UTC
Using TextIOWrapper seems to be slower than the previous solution :( Will revert most of the changes in this PR.
Rebased on #44761
No need for `codecs.getincrementaldecoder`, as `io.TextIOWrapper` will do that (and we can use `io.TextIOWrapper` because mmap is wrapped inside `_IOWrapper`). `io.TextIOWrapper` also provides `__next__` for us :)

Probably will need some benchmarking with utf-8/non-utf8 files.
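As a rough sketch of the idea described above (not the code in this PR), the following compares manual decoding through `codecs.getincrementaldecoder` with wrapping a binary buffer in the standard library's `io.TextIOWrapper`. It uses an in-memory `io.BytesIO` as a stand-in for the wrapped mmap object, which is an assumption made to keep the example self-contained; the sample payload is likewise made up.

```python
import codecs
import io

ENCODING = "iso8859_1"
# Stand-in payload; in pandas the bytes would come from the wrapped mmap.
payload = " 1 A Ä 2\n 3 B Ö 4\n".encode(ENCODING)

# Manual approach: decode the bytes with an incremental decoder,
# then split the text into lines ourselves.
decoder = codecs.getincrementaldecoder(ENCODING)(errors="strict")
manual_lines = decoder.decode(payload, final=True).splitlines(keepends=True)

# TextIOWrapper approach: decoding and line iteration (__next__)
# are handled by the wrapper. newline="" leaves line endings untranslated.
wrapped = io.TextIOWrapper(io.BytesIO(payload), encoding=ENCODING, newline="")
first_line = next(wrapped)
remaining_lines = list(wrapped)

print(manual_lines == [first_line] + remaining_lines)
```

Both approaches yield the same decoded lines here; which one is faster on real memory-mapped files is exactly what the benchmarking mentioned above would need to establish.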