Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: Remove pandas.io.stata from Typing Blacklist #25940

Merged
merged 11 commits into from
Apr 10, 2019

Conversation

ryankarlos
Copy link
Contributor

@ryankarlos ryankarlos commented Mar 31, 2019

@codecov
Copy link

codecov bot commented Mar 31, 2019

Codecov Report

Merging #25940 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25940      +/-   ##
==========================================
- Coverage   91.81%   91.81%   -0.01%     
==========================================
  Files         175      175              
  Lines       52580    52580              
==========================================
- Hits        48278    48274       -4     
- Misses       4302     4306       +4
Flag Coverage Δ
#multiple 90.36% <ø> (ø) ⬆️
#single 41.9% <ø> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 75% <0%> (-12.5%) ⬇️
pandas/core/frame.py 96.79% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de3a85c...1d96fec. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 31, 2019

Codecov Report

Merging #25940 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25940      +/-   ##
==========================================
- Coverage   91.84%   91.83%   -0.01%     
==========================================
  Files         175      175              
  Lines       52517    52517              
==========================================
- Hits        48232    48228       -4     
- Misses       4285     4289       +4
Flag Coverage Δ
#multiple 90.39% <ø> (ø) ⬆️
#single 40.72% <ø> (-0.14%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 75% <0%> (-12.5%) ⬇️
pandas/core/frame.py 96.79% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9d66bdf...72a1702. Read the comment docs.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Mar 31, 2019

@WillAyd Is it necessary to remove the method data_label which shadows the instance variable as reported in #25932 - removing it is making the docstring checks fail in the Azure Pipelines- associated with StataReader not having a data_label attribute ?

@WillAyd
Copy link
Member

WillAyd commented Mar 31, 2019

@ryankarlos what failure are you getting when removing?

@WillAyd WillAyd added the Typing type annotations, mypy/pyright type checking label Mar 31, 2019
@ryankarlos ryankarlos force-pushed the Typing_fix_pandas.io.stata branch from d2a8770 to 1d96fec Compare March 31, 2019 19:45
@ryankarlos
Copy link
Contributor Author

ryankarlos commented Mar 31, 2019

@ryankarlos what failure are you getting when removing?

I think the error was linked to validate_docstrings.py which parses the doc/source/reference/io.rst file to do the docstring validation on all the methods in io - there is a reference to StataReader.data_label there, so it is throwing an error 'StataReader' has no attribute 'data_label' because I had removed this method. Should I just remove it from this reference file as well - not sure about any other dependencies on this ?

@ryankarlos
Copy link
Contributor Author

@WillAyd All green now after the fix.

@WillAyd
Copy link
Member

WillAyd commented Apr 2, 2019

Hmm not sure this is the right solution - would it actually make more sense to assign to self._data_label instead and keep the existing method?

@jreback
Copy link
Contributor

jreback commented Apr 2, 2019

cc @bashtage

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be a private attribute (_data_label) I think.

@bashtage
Copy link
Contributor

bashtage commented Apr 2, 2019

If you don't want to change the API, then switch self.data_label = _self._get... to self._data_label = _self._get... and then leave the method.

TBH it might as well be an attribute or a property, but this would break an API that is probably almost never used.*

  • Break it and you'll find someone who uses it I'm sure.

@ryankarlos
Copy link
Contributor Author

Thanks, makes sense - ill push the changes

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a few other assignments to self.data_label in the module - can you check them all?

@bashtage
Copy link
Contributor

bashtage commented Apr 3, 2019

The attribute seems to have precedence over the method, and so the public attribute should be retained and the method deleted.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Apr 3, 2019

@bashtage just to clarify, are you saying better to revert back to the previous solution i.e switching back to self.data_label assignment in the module ?

@bashtage
Copy link
Contributor

bashtage commented Apr 3, 2019 via email

@WillAyd
Copy link
Member

WillAyd commented Apr 4, 2019

The method is documented as part of the API:

http://pandas.pydata.org/pandas-docs/stable/reference/io.html#stata

I agree that would be strange to use when compared against the attribute, but at the same time the method is what is exposed / documented so changing that would require a lot more effort / deprecation

@bashtage
Copy link
Contributor

bashtage commented Apr 4, 2019

@WillAyd The documentation is wrong and should be fixed. Any user who is using this feature will get their code broken by a move to a method, and so introducing a method is a defacto API change.

from pandas.io.stata import StataReader

sr = StataReader('pandas/tests/io/data/stata14_118.dta')
print('Attribute')
print(sr.data_label)
print('Method')
print(sr.data_label())

produces

Attribute
This is a  Ünicode data label
Method
Traceback (most recent call last):
  File "C:\Anaconda\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-bce88bdbdeb6>", line 7, in <module>
    print(sr.data_label())
TypeError: 'str' object is not callable

@WillAyd
Copy link
Member

WillAyd commented Apr 4, 2019

Hmm OK interesting - thanks for confirming that. I suppose this is a strange in between of a instance variable and method which are two distinct things but probably should have been managed via a property. Given the instance variable is populate from a private method I suppose it's not supposed to be exposed as part of the API right?

@bashtage
Copy link
Contributor

bashtage commented Apr 4, 2019

data_label should be part of the API since it could be useful to an end user. It could be a property (read-only), which a doc string, which would keep it in the API docs.

Copy link
Contributor

@bashtage bashtage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple solution is to keep private attribute changes and then the switch data label to a property.

pandas/tests/io/test_stata.py Outdated Show resolved Hide resolved
pandas/tests/io/test_stata.py Outdated Show resolved Hide resolved
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside from the @bashtage changes, pls explain why we cannot leave the

from pandas import .....

I find this easier to read.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Apr 4, 2019

Does this need to be reverted as well ? Was of the impression that the requirements in this issue were to explicitly import as defined internally ? Happy to revert though if not the case

aside from the @bashtage changes, pls explain why we cannot leave the

from pandas import .....

I find this easier to read.

Copy link
Contributor

@bashtage bashtage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ryankarlos ryankarlos force-pushed the Typing_fix_pandas.io.stata branch from df8beec to 4a58142 Compare April 4, 2019 23:49
@ryankarlos
Copy link
Contributor Author

ryankarlos commented Apr 5, 2019

Not sure why CI is failing.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Apr 5, 2019

@WillAyd All green now

@WillAyd
Copy link
Member

WillAyd commented Apr 6, 2019

Great thanks! I think we need a precursor to this PR to explicitly import objects in the top level of the code base instead of changing the import machinery here - any interest in tackling that?

@ryankarlos
Copy link
Contributor Author

Sure, but would this be for the entire codebase or just this module ? i may need some help if i get stuck (the import machinery is more complicated than i thought it would be). just to clarify , do you want me to revert the import changes back to from pandas import ..... in this PR ? I also wanted to work through some of the mpy blacklist in #25882 - is that then dependant on the PR you are suggesting ?

@WillAyd
Copy link
Member

WillAyd commented Apr 6, 2019

Sure, but would this be for the entire codebase or just this module

I think OK to start with just the module and we can iterate from there to get more as needed. Starting small is always better and can expand as needed from there

do you want me to revert the import changes back to from pandas import ..... in this PR ?

Yes but after the precursor of step one is done

I also wanted to work through some of the mpy blacklist in #25882 - is that then dependant on the PR you are suggesting

So the precursor will only affect modules that throw an error like "Module 'pandas' has no attribute 'XXX'" - you can certainly still work on other modules without that error and they will be unaffected

Just a heads up I have limited availability until Tuesday so I can help with little things here and there but may be delayed in responding

@ryankarlos
Copy link
Contributor Author

Ok thanks for the info, I have just made a PR #26017

Sure, but would this be for the entire codebase or just this module

I think OK to start with just the module and we can iterate from there to get more as needed. Starting small is always better and can expand as needed from there

do you want me to revert the import changes back to from pandas import ..... in this PR ?

Yes but after the precursor of step one is done

I also wanted to work through some of the mpy blacklist in #25882 - is that then dependant on the PR you are suggesting

So the precursor will only affect modules that throw an error like "Module 'pandas' has no attribute 'XXX'" - you can certainly still work on other modules without that error and they will be unaffected

Just a heads up I have limited availability until Tuesday so I can help with little things here and there but may be delayed in responding

@jbrockmendel
Copy link
Member

@ryankarlos can you merge master

@jreback jreback added this to the 0.25.0 milestone Apr 10, 2019
@jreback jreback merged commit 6bb9ab8 into pandas-dev:master Apr 10, 2019
@jreback
Copy link
Contributor

jreback commented Apr 10, 2019

thanks @ryankarlos

@ryankarlos ryankarlos deleted the Typing_fix_pandas.io.stata branch April 13, 2019 19:54
@ryankarlos ryankarlos changed the title Remove pandas.io.stata from Typing Blacklist TYP: Remove pandas.io.stata from Typing Blacklist Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants