-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: concat on sparse values #25719
API: concat on sparse values #25719
Conversation
API breaking change to `concat(List[DataFrame[Sparse]])` to return a DataFrame with sparse values, rather than a SparseDataFrame. Doing an outright break, rather than deprecation, because I have a followup PR deprecating SparseDataFrame. We return this internally in a few places (e.g. get_dummies on all-sparse data). Closes pandas-dev#25702
Codecov Report
@@ Coverage Diff @@
## master #25719 +/- ##
==========================================
- Coverage 91.24% 91.24% -0.01%
==========================================
Files 172 172
Lines 52967 52970 +3
==========================================
+ Hits 48332 48334 +2
- Misses 4635 4636 +1
Continue to review full report at Codecov.
|
1 similar comment
Codecov Report
@@ Coverage Diff @@
## master #25719 +/- ##
==========================================
- Coverage 91.24% 91.24% -0.01%
==========================================
Files 172 172
Lines 52967 52970 +3
==========================================
+ Hits 48332 48334 +2
- Misses 4635 4636 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny change, otherwise lgtm.
@@ -64,6 +64,42 @@ is respected in indexing. (:issue:`24076`, :issue:`16785`) | |||
df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific')) | |||
df['2019-01-01 12:00:00+04:00':'2019-01-01 13:00:00+04:00'] | |||
|
|||
Concatenating Sparse Values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a ref
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen you make this review comment before, but IMO, there is not really much added value in asking for this. It doesn't change anything in the built documentation, and is only needed if you make a link to it from somewhere else in our docs (and in this case, I think it is rather unlikely that we will ever link to it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disagree. as a matter of course if we have subsections we should have links. its not much of a burden and promotes consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just that you are asking something (and we are often already asking a lot of contributors in many rounds) that is basically useless, except for the consistency argument that you give (IMO that's not worth the overhead of always adding it, but OK, we can disagree on that).
as a matter of course if we have subsections we should have links
Note that this does not introduce links. Things like anchors so that people can link to that section on the html docs, that is done automatically by sphinx, you don't need this label for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again disagree. consistency is SO important in the pandas docs / code. If it can be automated (either the actual code / docs or the check of it great), otherwise the purpose is NOT to special case the world. Again having to make that judgement call on each PR would make a big inconsistency across PR's & maintainers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having to make that judgement call on each PR would make a big inconsistency across PR's & maintainers.
The thing is that you don't need to make a judgement call here: it simply is not needed for new sections being added. Only when you ask to add somewhere an internal link, you need to add a label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are still missing the point
pls add the ref here - arguing about reviews is a waste of time
Changing the convention might save time in the long run, if we avoid
unnecessary CI runs, abandoned PRs, etc.
…On Thu, Mar 14, 2019 at 6:11 PM Jeff Reback ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In doc/source/whatsnew/v0.25.0.rst
<#25719 (comment)>:
> @@ -64,6 +64,42 @@ is respected in indexing. (:issue:`24076`, :issue:`16785`)
df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
df['2019-01-01 12:00:00+04:00':'2019-01-01 13:00:00+04:00']
+Concatenating Sparse Values
you are still missing the point
pls add the ref here - arguing about reviews is a waste of time
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#25719 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHImm40yVgqRq7eV83ucR_5OytS_6dks5vWtcJgaJpZM4bxfDE>
.
|
thanks @TomAugspurger |
* origin/master: DOC: clean bug fix section in whatsnew (pandas-dev#25792) DOC: Fixed PeriodArray api ref (pandas-dev#25526) Move locale code out of tm, into _config (pandas-dev#25757) Unpin pycodestyle (pandas-dev#25789) Add test for rdivmod on EA array (GH23287) (pandas-dev#24047) ENH: Support datetime.timezone objects (pandas-dev#25065) Cython language level 3 (pandas-dev#24538) API: concat on sparse values (pandas-dev#25719) TST: assert_produces_warning works with filterwarnings (pandas-dev#25721) make core.config self-contained (pandas-dev#25613) CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721) TST: Check pytables<3.5.1 when skipping (pandas-dev#25773) DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770) Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693) TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640) DOC: Improve the docsting of Series.iteritems (pandas-dev#24879) DOC: Fix function name. (pandas-dev#25751) Implementing iso_week_year support for to_datetime (pandas-dev#25541) DOC: clarify corr behaviour when using a callable (pandas-dev#25732) remove unnecessary check_output (pandas-dev#25755) # Conflicts: # doc/source/whatsnew/v0.25.0.rst
* upstream/master: (55 commits) PERF: Improve performance of StataReader (pandas-dev#25780) Speed up tokenizing of a row in csv and xstrtod parsing (pandas-dev#25784) BUG: Fix _binop for operators for serials which has more than one returns (divmod/rdivmod). (pandas-dev#25588) BUG-24971 copying blocks also considers ndim (pandas-dev#25521) CLN: Panel reference from documentation (pandas-dev#25649) ENH: Quoting column names containing spaces with backticks to use them in query and eval. (pandas-dev#24955) BUG: reading windows utf8 filenames in py3.6 (pandas-dev#25769) DOC: clean bug fix section in whatsnew (pandas-dev#25792) DOC: Fixed PeriodArray api ref (pandas-dev#25526) Move locale code out of tm, into _config (pandas-dev#25757) Unpin pycodestyle (pandas-dev#25789) Add test for rdivmod on EA array (GH23287) (pandas-dev#24047) ENH: Support datetime.timezone objects (pandas-dev#25065) Cython language level 3 (pandas-dev#24538) API: concat on sparse values (pandas-dev#25719) TST: assert_produces_warning works with filterwarnings (pandas-dev#25721) make core.config self-contained (pandas-dev#25613) CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721) TST: Check pytables<3.5.1 when skipping (pandas-dev#25773) DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770) ...
API breaking change to
concat(List[DataFrame[Sparse]])
toreturn a DataFrame with sparse values, rather than a SparseDataFrame.
Doing an outright break, rather than deprecation, because I have a
followup PR deprecating SparseDataFrame. We hit this internally in
a few places (e.g. get_dummies on all-sparse data).
Closes #25702
git diff upstream/master -u -- "*.py" | flake8 --diff