REF: collect ops dispatch functions in one place, try to de-duplicate SparseDataFrame methods #23060

jbrockmendel · 2018-10-09T15:27:43Z

No description provided.

pep8speaks · 2018-10-09T15:27:49Z

Hello @jbrockmendel! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/frame.py !
There are no PEP8 issues in the file pandas/core/ops.py !
There are no PEP8 issues in the file pandas/core/sparse/frame.py !
There are no PEP8 issues in the file pandas/tests/arithmetic/test_numeric.py !

codecov · 2018-10-09T19:20:41Z

Codecov Report

Merging #23060 into master will not change coverage.
The diff coverage is 95.65%.

@@           Coverage Diff           @@
##           master   #23060   +/-   ##
=======================================
  Coverage   92.16%   92.16%           
=======================================
  Files         166      166           
  Lines       51224    51224           
=======================================
  Hits        47212    47212           
  Misses       4012     4012

Flag	Coverage Δ
#multiple	`90.6% <95.65%> (ø)`	⬆️
#single	`42.23% <26.08%> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/sparse/frame.py	`94.86% <92.3%> (ø)`	⬆️
pandas/core/ops.py	`94.24% <97.67%> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 62a15fa...c431373.

jbrockmendel · 2018-10-10T13:48:24Z

Woops, accidentally pushed some unrelated commits collecting arithmetic tests.

jreback

looks good. just a couple of questions / comments.

jreback · 2018-10-11T00:54:15Z

pandas/core/ops.py

+    """
+    # Note: we use iloc to access columns for compat with cases
+    #       with non-unique columns.
+    import pandas.core.computation.expressions as expressions


can this be imported at the top?

I'm not 100% sure, but I think this is a run-time import to make import pandas as pd faster
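
For context on the pattern being discussed: a function-level (deferred) import runs only when the function is first called, so its cost stays out of module import time. A minimal sketch, assuming the stdlib json module as a stand-in for pandas.core.computation.expressions; the function name dumps_sorted is hypothetical and this is not the code from the PR.

```python
import sys


def dumps_sorted(obj):
    """Serialize obj with sorted keys, importing json only when called."""
    # Deferred import: executed on the first call rather than when this
    # module is imported. Later calls reuse the entry cached in sys.modules.
    import json  # stand-in for a heavier dependency (assumption)

    return json.dumps(obj, sort_keys=True)


text = dumps_sorted({"b": 1, "a": 2})
print(text)
# After the first call the module is cached, so repeat imports are cheap.
print("json" in sys.modules)
```

The trade-off is that an import error surfaces at call time instead of at import time, which is one reason reviewers often ask whether the import can move to the top.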

jreback · 2018-10-11T00:56:22Z

pandas/core/sparse/frame.py

+            if own_default == other_default:
+                # TOOD: won't this evaluate as False if both are np.nan?
+                fill_value = own_default
+            elif np.isnan(own_default) and not np.isnan(other_default):


should these be isna checks?

At first I thought so, but the module-level docstring says only float64 is supported, so I kept the behavior as-is. I think the overall takeaway is that this isn't especially well-maintained, and we should all look forward to Sparse EA.

can this be amended now that Sparse EA is here? (followup ok too)

I would recommend holding off on changing it.

SparseDataFrame may be going away, so why bother.

We may have to change the default_fill_value if we want its type to match that of sp_values (see the comment on #23124, "Require the dtype of SparseArray.fill_value and sp_values.dtype to match").

Sounds good, thanks.
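
The TODO in the quoted diff asks whether own_default == other_default is False when both are NaN. It is: NaN never compares equal to itself, and np.nan is an ordinary Python float, so the stdlib behaves the same way. A minimal sketch of a NaN-aware comparison, assuming float fill values only; the helper name same_fill_value is hypothetical and is not part of pandas or of this PR.

```python
import math


def same_fill_value(a, b):
    """Return True if two float fill values match, treating NaN as equal to NaN."""
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


nan = float("nan")

# Plain equality misses the both-NaN case.
print(nan == nan)

# The NaN-aware helper treats two NaN defaults as the same fill value.
print(same_fill_value(nan, nan))
print(same_fill_value(0.0, nan))
print(same_fill_value(0.0, 0.0))
```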

jreback · 2018-10-11T00:57:49Z

pandas/tests/frame/test_arithmetic.py

+                            'floatcol': np.random.randn(10),
+                            'stringcol': list(tm.rands(10))})
+        df.loc[np.random.rand(len(df)) > 0.5, 'dates2'] = pd.NaT
+        ops = {'gt': 'lt', 'lt': 'gt', 'ge': 'le', 'le': 'ge', 'eq': 'eq',


should parameterize if you can

Yah, the point of collecting these arithmetic tests is to parametrize/fixturize and especially de-duplicate them in an upcoming pass.
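
As an illustration of the parametrization being requested, the op/swapped-op pairs visible in the quoted diff can become pytest parameters instead of a dict iterated inside one test. A minimal sketch, assuming plain integers in place of DataFrames so that it stays self-contained; the test name is hypothetical and only the five pairs visible in the diff are used.

```python
import operator

import pytest


@pytest.mark.parametrize("opname, swapped", [
    ("gt", "lt"),
    ("lt", "gt"),
    ("ge", "le"),
    ("le", "ge"),
    ("eq", "eq"),
])
def test_comparison_matches_swapped(opname, swapped):
    # a <op> b should agree with b <swapped op> a.
    left, right = 3, 5
    result = getattr(operator, opname)(left, right)
    expected = getattr(operator, swapped)(right, left)
    assert result == expected
```

Each pair is then reported as its own test case, so a failure names the offending operator instead of stopping the loop at the first bad one.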

jreback · 2018-10-11T00:58:25Z

pandas/tests/frame/test_arithmetic.py

+        # DataFrame
+        assert df.eq(df).values.all()
+        assert not df.ne(df).values.any()
+        for op in ['eq', 'ne', 'gt', 'lt', 'ge', 'le']:


needs parameterization!

jreback · 2018-10-11T00:59:09Z

pandas/tests/frame/test_arithmetic.py

+            with tm.assert_raises_regex(ValueError, msg):
+                f(ndim_5)
+
+        # Series


pull this out to a separate, parameterized test (future PR is ok for these, though since you are moving around, maybe better here)

jreback · 2018-10-11T01:00:05Z

pandas/tests/series/test_arithmetic.py

+             lambda x: tm.makeFloatSeries(),
+             True)
+        ])
+    @pytest.mark.parametrize('opname', ['add', 'sub', 'mul', 'floordiv',


ideally switch to our operators fixture
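
For readers unfamiliar with the suggestion: a parametrized fixture lets many tests share one list of operator names instead of repeating a parametrize decorator on each. A minimal sketch, assuming only the four names visible in the quoted diff and plain integers as operands; the fixture name arithmetic_opname is hypothetical and is not the fixture defined in the pandas test suite.

```python
import operator

import pytest

# Only the operator names visible in the quoted diff; the real list is longer.
ARITHMETIC_OPS = ["add", "sub", "mul", "floordiv"]


@pytest.fixture(params=ARITHMETIC_OPS)
def arithmetic_opname(request):
    """Hypothetical fixture: supplies one operator name per test run."""
    return request.param


def test_method_matches_operator(arithmetic_opname):
    # The dunder method and the operator-module function should agree.
    op = getattr(operator, arithmetic_opname)
    dunder = "__{}__".format(arithmetic_opname)
    left, right = 7, 2
    assert getattr(left, dunder)(right) == op(left, right)
```

Any test that names the fixture as an argument runs once per operator, so the list lives in a single place.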

jreback · 2018-10-23T03:14:07Z

pandas/tests/frame/test_arithmetic.py

+                # == and !=, inequalities should raise
+                result = x == y
+                expected = pd.DataFrame({col: x[col] == y[col]
+                                         for col in x.columns},


can you parameterize this (next pass ok)

jbrockmendel · 2018-10-25T04:33:27Z

If it will help, I can separate out the unrelated test parts of this. There is a bunch of test cleanup to do, and already a healthy number of test-touching PRs in play.

jreback

comment, but can be a followup, ping on green.

jreback · 2018-10-26T01:38:20Z

pandas/core/sparse/frame.py

+            if own_default == other_default:
+                # TOOD: won't this evaluate as False if both are np.nan?
+                fill_value = own_default
+            elif np.isnan(own_default) and not np.isnan(other_default):


can this be amended now that Sparse EA is here? (followup ok too)

jbrockmendel · 2018-10-26T01:41:20Z

can this be amended now that Sparse EA is here? (followup ok too)

@TomAugspurger any idea about this? I have no clue

jreback · 2018-10-26T01:57:44Z

also happy to merge this and followup later on sparse refactorings (prob better).

jbrockmendel · 2018-10-26T04:32:54Z

Ping

jreback · 2018-10-28T03:18:21Z

can you rebase. the isort is playing havoc :>

jreback · 2018-10-28T13:49:00Z

thanks @jbrockmendel nice as always!

…y_tests

* repo_org/master: (52 commits)
  ENH: Allow rename_axis to specify index and columns arguments (pandas-dev#20046)
  STY: proposed isort settings [ci skip] [skip ci] [ciskip] [skipci] (pandas-dev#23366)
  MAINT: Remove extraneous test.parquet file
  CLN: Follow-up comments to pandas-devgh-23392 (pandas-dev#23401)
  BUG GH23282 calling min on series of NaT returns NaT (pandas-dev#23289)
  unpin openpyxl (pandas-dev#23361)
  REF: collect ops dispatch functions in one place, try to de-duplicate SparseDataFrame methods (pandas-dev#23060)
  CLN: Remove pandas.tools module (pandas-dev#23376)
  CLN: Remove some dtype methods from API (pandas-dev#23390)
  CLN: Cleanup toplevel namespace shims (pandas-dev#23386)
  DOC: fixup whatsnew note for GH21394 (pandas-dev#23355)
  Fix import format at pandas/tests/extension directory (pandas-dev#23365)
  DOC: Remove Series.sortlevel from api.rst (pandas-dev#23395)
  API: Disallow dtypes w/o frequency when casting (pandas-dev#23392)
  BUG/TST/REF: Datetimelike Arithmetic Methods (pandas-dev#23215)
  STYLE: lint add np.nan* funcs to cython_table (pandas-dev#22109)
  Run Isort on tests/util single PR (pandas-dev#23347)
  BUG: Fix date_range overflow (pandas-dev#23345)
  Run Isort on tests/arrays single PR (pandas-dev#23346)
  ...

… SparseDataFrame methods (pandas-dev#23060)

jbrockmendel added 4 commits October 8, 2018 17:34

collect dispatch functions in one place

c01c19a

remove unused try_cast args; try to make SparseDataFrame methods more…

f0e0a4e

… like regular methods

Use align methods in SparseDataFrame methods to move towards sharing …

30f3737

…code

typo fixup

5f9d111

jbrockmendel added 2 commits October 9, 2018 12:20

fixup copy/paste mistake

f236663

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

1a556bc

…iling2

jbrockmendel added 4 commits October 9, 2018 17:05

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

bcb1c35

…iling2

keep collecting arithmetic tests

1c9b86b

keep collecting Series arith tests

a2d1a56

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

27c40cb

…iling2

jreback requested changes Oct 11, 2018

jreback added Numeric Operations Arithmetic, Comparison, and Logical operations Sparse Sparse Data Type Clean labels Oct 11, 2018

jbrockmendel added 2 commits October 21, 2018 11:24

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

9835825

…iling2

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

9737aee

…iling2

jreback reviewed Oct 23, 2018

jbrockmendel added 2 commits October 23, 2018 07:57

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

945beb2

…iling2

fixup duplicate import

ecaac45

jreback added this to the 0.24.0 milestone Oct 24, 2018

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

1d08646

…iling2

jreback approved these changes Oct 26, 2018

jbrockmendel added 2 commits October 27, 2018 22:16

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

11219fe

…iling2

post-merge cleanup

c431373

jreback merged commit b9e2278 into pandas-dev:master Oct 28, 2018

jbrockmendel deleted the failing2 branch October 28, 2018 16:17

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

REF: collect ops dispatch functions in one place, try to de-duplicate…

04bfc36

… SparseDataFrame methods (pandas-dev#23060)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

REF: collect ops dispatch functions in one place, try to de-duplicate…

c556512

… SparseDataFrame methods (pandas-dev#23060)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

REF: collect ops dispatch functions in one place, try to de-duplicate…

5e79c8c

… SparseDataFrame methods (pandas-dev#23060)
