Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: placement of network error catching in s3 tests #19645

Merged
merged 4 commits into from
Feb 13, 2018

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Feb 11, 2018

No description provided.

@jreback jreback added Testing pandas testing functions or related to the test suite IO Network Local or Cloud (AWS, GCS, etc.) IO Issues labels Feb 11, 2018
@jreback jreback added this to the 0.23.0 milestone Feb 11, 2018
@codecov
Copy link

codecov bot commented Feb 11, 2018

Codecov Report

Merging #19645 into master will decrease coverage by 0.02%.
The diff coverage is 60.46%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19645      +/-   ##
==========================================
- Coverage   91.58%   91.56%   -0.03%     
==========================================
  Files         150      150              
  Lines       48806    48829      +23     
==========================================
+ Hits        44701    44709       +8     
- Misses       4105     4120      +15
Flag Coverage Δ
#multiple 89.93% <60.46%> (-0.03%) ⬇️
#single 41.71% <6.97%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/io/sas/sas7bdat.py 91.07% <100%> (ø) ⬆️
pandas/io/sas/sas_xport.py 90.27% <100%> (ø) ⬆️
pandas/io/packers.py 87.31% <16.66%> (-1.34%) ⬇️
pandas/io/parsers.py 95.32% <33.33%> (-0.25%) ⬇️
pandas/io/s3.py 72.72% <50%> (-13.64%) ⬇️
pandas/io/common.py 68.77% <50%> (-0.3%) ⬇️
pandas/io/json/json.py 92.23% <75%> (-0.37%) ⬇️
pandas/io/parquet.py 71.79% <80%> (-0.28%) ⬇️
pandas/util/testing.py 83.85% <0%> (+0.2%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 569bc7a...825f2a8. Read the comment docs.

@pep8speaks
Copy link

pep8speaks commented Feb 11, 2018

Hello @jreback! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 12, 2018 at 12:20 Hours UTC

@jreback
Copy link
Contributor Author

jreback commented Feb 11, 2018

This I believe exposes some issues that exist with the s3 tests and mocking that have our CI occasionaly fail. I think there is some interaction going on. When run separately these all work.

pytest pandas/tests/io/parser/test_network.py pandas/tests/io/json/test_pandas.py pandas/tests/io/test_parquet.py -r xXs -k s3 -v --tb=short

xref #19585

was trying to remove the resource closing issues (IOW on s3 access we don't close s3).

any ideas here: @jorisvandenbossche @TomAugspurger @maximveksler @mdurant

@jreback jreback removed this from the 0.23.0 milestone Feb 11, 2018
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does running pytest pandas/tests/io/parser/test_network.py pandas/tests/io/json/test_pandas.py pandas/tests/io/test_parquet.py -r xXs -k s3 -v --tb=short fail for you ever?

I've tried a handful of times and haven't gotten a failure yet, and I've never seen it locally. Are the failures on a run using pytest-xdist? Some race condition between creating the fixtures and using them?

@@ -183,7 +183,10 @@ def get_filepath_or_buffer(filepath_or_buffer, encoding=None,

Returns
-------
a filepath_ or buffer or S3File instance, the encoding, the compression
tuple of (a filepath_ or buffer or S3File instance,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually multiple return values are split into separate lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are not multiple return values from different possibilities for the first

@jreback
Copy link
Contributor Author

jreback commented Feb 11, 2018

this fails for me locally - i was expecting it to fail in the CI actually

let me take a closer look

@jreback
Copy link
Contributor Author

jreback commented Feb 12, 2018

So these pass now all pass, but have these warnings. any ideas?

Users/jreback/miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/dataframe.py:7: FutureWarning: 'pandas.core' is private. Use 'pandas.Categorical'
  from pandas.core.categorical import Categorical, CategoricalDtype
collected 140 items                                                                                                                                                                                                                         

pandas/tests/io/json/test_pandas.py .                                                                                                                                                                                                 [  7%]
pandas/tests/io/test_parquet.py ..                                                                                                                                                                                                    [ 21%]
pandas/tests/io/parser/test_network.py ...........                                                                                                                                                                                    [100%]

============================================================================================================= warnings summary ==============================================================================================================
pandas/tests/io/parser/test_network.py::TestS3::()::test_parse_public_s3_bucket
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52618), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52626), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52629), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52630), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)

pandas/tests/io/parser/test_network.py::TestS3::()::test_parse_public_s3n_bucket
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52632), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)

pandas/tests/io/parser/test_network.py::TestS3::()::test_parse_public_s3a_bucket
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52633), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)

pandas/tests/io/parser/test_network.py::TestS3::()::test_parse_public_s3_bucket_nrows
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52635), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52640), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)
  /Users/jreback/pandas/pandas/io/s3.py:38: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.20.10.2', 52645), raddr=('52.216.96.141', 443)>
    filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer), mode)

pandas/tests/io/parser/test_network.py::TestS3::()::test_parse_public_s3_bucket_chunked

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas/pandas/core/ops.py

Lines 273 to 284 in a277108

_arith_doc_FRAME = """
Binary operator %s with support to substitute a fill_value for missing data in
one of the inputs
Parameters
----------
other : Series, DataFrame, or constant
axis : {0, 1, 'index', 'columns'}
For Series input, axis to match Series index on
fill_value : None or float value, default None
Fill missing (NaN) values with this value. If both DataFrame locations are
missing, the result will be missing
looks somewhat similar. I wonder if that's it?

return read_table(path)


@pytest.fixture(scope='module')
@pytest.yield_fixture
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yield_fixtures are deprecated. You can just yield in a regular fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah didn't know that!

@jreback jreback added this to the 0.23.0 milestone Feb 12, 2018
@jreback jreback merged commit d6fe194 into pandas-dev:master Feb 13, 2018
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
@rth
Copy link
Contributor

rth commented Sep 27, 2018

@@ -194,7 +197,8 @@ def get_filepath_or_buffer(filepath_or_buffer, encoding=None,
-        return reader, encoding, compression
+        return reader, encoding, compression, True

This resulted in a backward incompatible change for pandas.io.common.get_filepath_or_buffer between 0.22 and 0.23. I couldn't find anything about it in the release notes either. Unless it's a private function, but nothing in the import path suggests it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Network Local or Cloud (AWS, GCS, etc.) IO Issues Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants