Allow errors keyword for HDF IO Encoding Err Handling #20873
@@ -705,7 +705,7 @@ def select(self, key, where=None, start=None, stop=None, columns=None,
        def func(_start, _stop, _where):
            return s.read(start=_start, stop=_stop,
                          where=_where,
                          columns=columns, **kwargs)
I removed this `**kwargs` argument because it was getting mangled when calling `read_index_node` with arbitrary keyword arguments in `read_hdf`. I think it was a mistake to include it originally.
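To illustrate the kind of problem described above, here is a minimal sketch of what can go wrong when `**kwargs` is forwarded blindly to an inner helper. The function bodies are hypothetical stand-ins, not the pandas implementation, and the actual failure inside `read_index_node` may well have looked different.

```python
def read_index_node(node, start=None, stop=None):
    """Hypothetical inner reader that accepts a fixed set of keywords."""
    return node[start:stop]


def read_forwarding(node, start=None, stop=None, **kwargs):
    # Every caller-supplied keyword is handed to the helper,
    # whether or not the helper understands it.
    return read_index_node(node, start=start, stop=stop, **kwargs)


def read_fixed(node, start=None, stop=None):
    # Without **kwargs, only the keywords the helper knows are passed on.
    return read_index_node(node, start=start, stop=stop)


data = list(range(10))

try:
    read_forwarding(data, start=2, stop=5, errors="surrogatepass")
    forwarded_ok = True
    message = ""
except TypeError as exc:
    forwarded_ok = False
    message = str(exc)

result = read_fixed(data, start=2, stop=5)
```

In this sketch the stray `errors` keyword reaches a function that does not accept it, while the version without `**kwargs` is unaffected.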
Codecov Report
@@ Coverage Diff @@
## master #20873 +/- ##
==========================================
+ Coverage 91.78% 91.78% +<.01%
==========================================
Files 153 153
Lines 49341 49319 -22
==========================================
- Hits 45287 45267 -20
+ Misses 4054 4052 -2
Continue to review full report at Codecov.
does anything break if you just always pass
I think the biggest downside is that it's idiomatic in Python 3 to use strict encoding when dealing with files, so if we did that here we'd introduce an inconsistency with codec handling in this project and, I'd argue, with the larger Python ecosystem. ref https://www.python.org/dev/peps/pep-0540/#encoding-and-error-handler
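As a rough illustration of the strict default mentioned above, the following sketch uses only built-in `str.encode` and `bytes.decode`. The sample string is made up for the example; it is not taken from the PR.

```python
# A string containing a lone surrogate, which strict UTF-8 refuses to encode.
text = "bad\ud800"

try:
    text.encode("utf-8")  # errors defaults to "strict"
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Opting in to a more lenient handler lets the value round-trip.
raw = text.encode("utf-8", errors="surrogatepass")
round_tripped = raw.decode("utf-8", errors="surrogatepass")
```

Keeping `'strict'` as the default and letting callers opt in to another handler mirrors how the built-in codecs behave.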
lgtm. minor doc comments
@@ -4579,6 +4598,7 @@ def _unconvert_string_array(data, nan_rep=None, encoding=None):
    data : fixed length string dtyped array
    nan_rep : the storage repr of NaN, optional
    encoding : the encoding of the data, optional
    errors : handler for encoding errors, default 'strict'
can you show options and/or point to the python ref for these
Added links to the `open` docs from `NDFrame.to_hdf` and `pd.read_hdf`.
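For readers who want a quick feel for the handlers that the `open` docs list, this sketch decodes the same invalid bytes with several of them. It uses plain `bytes.decode` rather than the HDF code path, and the sample bytes are an assumption chosen for the example.

```python
# Latin-1 encoded bytes that are not valid UTF-8.
raw = b"caf\xe9"

try:
    raw.decode("utf-8")  # "strict" is the default handler
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# Decode the same bytes with a few of the standard lenient handlers.
handlers = ["ignore", "replace", "backslashreplace", "surrogateescape"]
decoded = {name: raw.decode("utf-8", errors=name) for name in handlers}

for name, value in decoded.items():
    print(f"{name:>16}: {value!r}")
```

Each handler trades strictness for a different recovery strategy: dropping the byte, substituting a replacement character, escaping it, or preserving it as a surrogate.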
@@ -1946,6 +1946,10 @@ def to_hdf(self, path_or_buf, key, **kwargs):
            If applying compression use the fletcher32 checksum.
        dropna : bool, default False
            If true, ALL nan rows will not be written to store.
        errors : str, default 'strict'
@TomAugspurger I know you are adding a few things for the RC so don't need to change anything here, but do we typically document things in the API like this? Wondering if we shouldn't make all of the documented features actual keyword arguments in the call signature rather than tucking them away in kwargs.
FWIW if we have `errors` here we'd probably want to add `encoding` as well
The signature should be changed from kwargs to reflect the actual signature. I thought we had an issue for it, but didn't find one. Opened #20903
Thanks! I'll take a stab at that one later
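The trade-off discussed in this thread can be sketched with two toy functions. Both are hypothetical; the parameter list is illustrative and is not the actual `to_hdf` signature.

```python
import inspect


def to_hdf_kwargs(path_or_buf, key, **kwargs):
    """Hypothetical: options are documented but hidden in **kwargs."""
    return kwargs


def to_hdf_explicit(path_or_buf, key, mode="a", encoding=None, errors="strict"):
    """Hypothetical: the same options spelled out in the signature."""
    return {"mode": mode, "encoding": encoding, "errors": errors}


# Introspection only sees the options when they are real parameters.
hidden = list(inspect.signature(to_hdf_kwargs).parameters)
visible = list(inspect.signature(to_hdf_explicit).parameters)

# A misspelled keyword slips through **kwargs unnoticed...
silently_accepted = to_hdf_kwargs("store.h5", "df", erors="replace")

# ...but is rejected immediately by an explicit signature.
try:
    to_hdf_explicit("store.h5", "df", erors="replace")
    typo_caught = False
except TypeError:
    typo_caught = True
```

With explicit parameters, tools such as `help()` and IDE completion can show `encoding` and `errors`, and typos fail fast instead of being silently ignored.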
Appveyor failure is fixed in #20906. I'm going to merge that before merging this, so that the merge commit is properly tested.
Thanks @WillAyd :)
git diff upstream/master -u -- "*.py" | flake8 --diff