Limit repr of arrays containing long strings #3900
     """Summarize a variable in one line, e.g., for the Dataset.__repr__."""
     if max_width is None:
-        max_width = OPTIONS["display_width"]
+        max_width_options = OPTIONS["display_width"]
+        if not isinstance(max_width_options, int):
Is this a reasonable way to enforce types for objects coming from dicts? Or too much cruft and we should just use type hints?
LGTM. thanks
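For readers following the type-enforcement question, here is a minimal sketch of the runtime check under discussion. The `OPTIONS` dict below is a hypothetical stand-in for the library's options mapping, and `resolve_max_width` is an illustrative helper name, not a function from the PR.

```python
# Hypothetical stand-in for the global options dict (assumption, not xarray's own).
OPTIONS = {"display_width": 80}


def resolve_max_width(max_width=None):
    """Return an int width, falling back to OPTIONS["display_width"]."""
    if max_width is None:
        max_width_options = OPTIONS["display_width"]
        # Values stored in a plain dict are not constrained by type hints,
        # so the check has to happen at runtime.
        if not isinstance(max_width_options, int):
            raise TypeError(
                "display_width must be an int, got "
                f"{type(max_width_options).__name__}"
            )
        max_width = max_width_options
    return max_width
```

The tradeoff raised in the comment is visible here: the check adds a few lines, but it catches a bad option value at the point of use, which a type hint alone would not do.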
     # NB: Probably not ideal; an alternative would be cutting after the
     # first ellipsis
     actual = formatting.format_array_flat(np.arange(100.0), 11)
-    expected = "0.0 ... 99.0"
+    expected = "0.0 ... ..."
I'd cut after the first ellipsis, two of them in a row look strange to me.
Yeah, but what's the rule? Backtrack and see whether the previous values were `...`?
something like that, yes:

    if len(pprint_str) > max_width:
        pprint_str = pprint_str[: max(max_width - 3, 0)].rstrip()
        if not pprint_str.endswith("..."):
            pprint_str = pprint_str + "..."

Does that seem reasonable?
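The suggestion above can be wrapped in a standalone function to try it out. This is only a sketch on plain strings: `truncate_with_ellipsis` is a hypothetical name, and the input is an illustrative string rather than the intermediate value the library actually builds.

```python
def truncate_with_ellipsis(pprint_str, max_width):
    """Cut pprint_str to max_width, avoiding two ellipses in a row."""
    if len(pprint_str) > max_width:
        # Reserve three characters for the ellipsis and drop trailing spaces.
        pprint_str = pprint_str[: max(max_width - 3, 0)].rstrip()
        # Only append an ellipsis if the cut did not already land on one.
        if not pprint_str.endswith("..."):
            pprint_str = pprint_str + "..."
    return pprint_str


print(truncate_with_ellipsis("aaa ... bbb ccc", 10))  # aaa ...
print(truncate_with_ellipsis("abcdefghij", 8))        # abcde...
```

In the first call the cut lands exactly on the existing ellipsis, so no second one is appended.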
Something like this is very reasonable, for sure. Just a tradeoff of complexity vs. results.
For example, this case wouldn't change as a result of the function's change, because it ends with a space. We could add that logic in, though.
true, but we could also run `rstrip` first, truncate the string and run `rstrip` again (apologies if that's what you meant, I'm not really sure).
My suggestion might result in something like `783759.270 ... 592000...` but avoids `0.0 ... ...`. If we'd also like to avoid the former, we could not use `"".join()` immediately but instead save the iterable, check if the result of `"".join()` is short enough and remove the last element of the iterable if it isn't.
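A rough sketch of that last idea, assuming the formatted values are available as a list of strings before joining. The name `join_within_width`, the space separator, and the handling of very small widths are all assumptions made for illustration; the comment itself only describes the general approach.

```python
def join_within_width(pieces, max_width, ellipsis="..."):
    """Join pieces with spaces, dropping whole trailing pieces until it fits."""
    pieces = list(pieces)
    joined = " ".join(pieces)
    if len(joined) <= max_width:
        return joined
    while pieces:
        # Remove the last element instead of cutting a value mid-number.
        pieces.pop()
        # Do not append a second ellipsis right after an existing one.
        tail = [] if pieces and pieces[-1] == ellipsis else [ellipsis]
        candidate = " ".join(pieces + tail)
        if len(candidate) <= max_width:
            return candidate
    # Nothing fits: fall back to as much of the ellipsis as the width allows.
    return ellipsis[:max_width]


print(join_within_width(["783759.270", "...", "592000.123", "592001.456"], 20))
# 783759.270 ...
```

Compared with slicing the joined string, this never produces a partially printed value such as `592000...`, at the cost of sometimes using less of the available width.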
Yes 100%, I agree that's better. I'm probably a bit less motivated to make these changes, as I feel like the egregious tails have been limited, which was the pressing issue. But I can come back to it in the future if no one else takes it up.
    # replace the end with an ellipsis
    # NB: this will still return a full 3-character ellipsis when max_width < 3
    if len(pprint_str) > max_width:
        pprint_str = pprint_str[: max(max_width - 3, 0)] + "..."
-    pprint_str = pprint_str[: max(max_width - 3, 0)] + "..."
+    pprint_str = pprint_str[: max(max_width - 3, 0)] + padding
I can't seem to accept this, maybe because the PR is closed...
yes, we'd need a new PR for that.
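To illustrate the truncation line from this thread, including the `max_width < 3` caveat noted in its comment, here is a small sketch with the ellipsis passed in as a `padding` argument. The function name `truncate` and the use of `len(padding)` in place of the literal `3` are assumptions for the example, not part of the suggested change.

```python
def truncate(pprint_str, max_width, padding="..."):
    """Cut pprint_str to max_width, ending in padding when it was cut."""
    if len(pprint_str) > max_width:
        # max(..., 0) keeps the slice bound from going negative for tiny widths.
        pprint_str = pprint_str[: max(max_width - len(padding), 0)] + padding
    return pprint_str


print(truncate("abcdefghij", 8))  # abcde...
print(truncate("abcdefghij", 2))  # ... (still three characters, wider than 2)
```

The second call shows the caveat: for widths smaller than the padding, the result is the full padding and therefore exceeds `max_width`.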
`isort -rc . && black . && mypy . && flake8`
`whats-new.rst` for all changes and `api.rst` for new API

This also more strictly limits the length of array reprs where the max_width is small, such that it doesn't always display the first and last item. Let me know if that could have an adverse impact anywhere.