Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

started to fixturize pandas/tests/base #31701

Merged
merged 4 commits into from
Feb 15, 2020

Conversation

SaturnFromTitan
Copy link
Contributor

Part of #23877

  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff

I didn't apply the fixtures to the rest of test_ops.py yet since @jreback voiced some concerns in this comment.

@@ -0,0 +1,52 @@
import numpy as np
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



@pytest.fixture(params=_all_objs.keys())
def index_or_series_obj(request):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think this is a rather complicated way of composing things. Given there is already a fixture for indeces that does something very similar can we just not modify this to use that and create the Series objects as required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand you right. What do you mean by "create the Series objects as required"?

I need a fixture to replace self.all_objs from the Ops mixin. This will be used in a lot more places in pandas/tests/base/test_ops.py, so I can't just construct the Series' in the test cases.

I was hoping that there are other fixtures which I could use instead of _series and _narrow_series, but I think I still know too little about the codebase to find them. I'd be grateful for any hints to point me in the right direction ✌️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To rephrase I'm asking if there is a way to leverage the already existing indices fixture, as this duplicates part of the logic of that

Copy link
Contributor Author

@SaturnFromTitan SaturnFromTitan Feb 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. I checked two possibilities:

  1. Creating more parametrized fixtures for series and narrow_series:
@pytest.fixture
def series(indices):
    data = np.random.randn(len(indices))
    return pd.Series(data, index=indices, name=indices.name)

The problem is, that when using these fixtures in a test like

def test1(indices, series):
    ...

the product of all fixture values is executed instead of the union. So a lot of redundant test cases are run.

  1. Letting index_or_series_obj depend on indices:
    The problem is that index_or_series_obj is supposed to have a different length. So this doesn't really work either.

-> Based on this, the current implementation seems to be the best we can do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue 1 and a potential solution (requires a new dependency) is well explained in this SO post.

@WillAyd WillAyd added the Testing pandas testing functions or related to the test suite label Feb 5, 2020
}


def _create_narrow_series(data_dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just call this dtype, use the is_integer and so one for comparisons; we really try to avoid using numpy things generally (as they don't scale to all of our dtypes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I just tried that, but is_integer(np.int16) is actually False. Similar for the other dtypes or is_float 😐

import pytest

import pandas as pd
from pandas.tests.indexes.conftest import indices_dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather just move the indices_dict to pandas/conftest.py and create these helper series there (as fixtures). ok to do this in 2 PRs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did it in this one. Please let me know if it's what you intended.

create these helper series there (as fixtures)

Not sure what you mean with "as fixtures". Like described in my comment above, I don't see a better way to create this index_or_series_obj fixture

@SaturnFromTitan SaturnFromTitan force-pushed the refactor-test-base-part-4 branch from c3cc757 to 0f88d47 Compare February 10, 2020 16:14
pandas/conftest.py Outdated Show resolved Hide resolved
@SaturnFromTitan
Copy link
Contributor Author

I think I addressed all the review comments. Friendly ping @jreback @WillAyd @jbrockmendel

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. some comments for followups; if you can do them (and track), then no-need to keep a public list, if you are busy pls make an issue with these types of TODOs

pandas/conftest.py Show resolved Hide resolved

@pytest.fixture(params=indices_dict.keys())
def indices(request):
# copy to avoid mutation, e.g. setting .name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next time around can you give these doc-strings

@@ -953,3 +954,69 @@ def __len__(self):
return self._data.__len__()

return TestNonDictMapping


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you do 'sections' here e.g. some text markers so reading the file is easy (@jbrockmendel has added these in several files); this can be a followup PR.

@@ -34,6 +34,7 @@
period_range,
)
import pandas._testing as tm
from pandas.conftest import indices_dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we really want to avoid this import (but understand doing this in steps)

@@ -13,7 +13,7 @@
from pandas import Float64Index, Int64Index, RangeIndex, UInt64Index
import pandas._testing as tm
from pandas.api.types import pandas_dtype
from pandas.tests.indexes.conftest import indices_dict
from pandas.conftest import indices_dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jreback jreback merged commit 8f49265 into pandas-dev:master Feb 15, 2020
@SaturnFromTitan
Copy link
Contributor Author

xref: #31989

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants