Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Type Annotation in pandas/core/api.py #26148

Closed
vaibhavhrt opened this issue Apr 19, 2019 · 9 comments · Fixed by #26341
Closed

Fix Type Annotation in pandas/core/api.py #26148

vaibhavhrt opened this issue Apr 19, 2019 · 9 comments · Fixed by #26341
Labels
Typing type annotations, mypy/pyright type checking
Milestone

Comments

@vaibhavhrt
Copy link
Contributor

Part of #25882

Current errors are:

pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'Int8Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'Int16Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'Int32Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'Int64Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'UInt8Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'UInt16Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'UInt32Dtype'
pandas\core\api.py:5: error: Module 'pandas.core.arrays.integer' has no attribute 'UInt64Dtype'
@vaibhavhrt
Copy link
Contributor Author

@WillAyd I am little confused here, these attributes does not exists in pandas.core.arrays.integer but they are being used everywhere. Can you explain a bit what's happening here.

@WillAyd
Copy link
Member

WillAyd commented Apr 19, 2019

Those objects are created dynamically at the end of that module (specifically inserted into sys.modules. I'm not sure if it's a limitation of the mypy import machinery in recognizing that or if there is an alternate way we should be approaching (ex: inserting into globals). If you can investigate would be great

@WillAyd WillAyd added the Typing type annotations, mypy/pyright type checking label Apr 19, 2019
@vaibhavhrt
Copy link
Contributor Author

I will try to figure out possible solutions.

@vaibhavhrt
Copy link
Contributor Author

@WillAyd mypy doesn't runs the code, python/mypy#5719 (comment). Therefore setattr(module, classname, dtype_type) is not working. globals()[classname] = dtype_type will not work either. In my opinion the best thing to do is to just ignore that line that's importing these attributes.

Another possible solution(not sure it will work) I can think of is to define these attributes(Int8Dtype, etc) as None somewhere in the module. Then set its values in the loop using exec() instead of using setattr on the module, but that's too much unnecessary work. Also I don't think mypy will recognise the value change if we use exec.

@WillAyd
Copy link
Member

WillAyd commented Apr 22, 2019

I suppose it makes sense that the static type checker wouldn't be able to detect these objects created at runtime. Though more verbose I think explicitly listing out these types as part of the module would be preferable

@vaibhavhrt
Copy link
Contributor Author

So get rid of the loop which is dynamically creating these integer type and create all integer types one by one?
Creating attributes inside the loop with setattr is a bit cryptic and hard to find just by taking a glance at the code, so I think it's a good idea of get rid of it.

@vaibhavhrt
Copy link
Contributor Author

vaibhavhrt commented Apr 26, 2019

Though more verbose I think explicitly listing out these types as part of the module would be preferable

@WillAyd how do list types as part of module?

@vaibhavhrt
Copy link
Contributor Author

vaibhavhrt commented May 10, 2019

@WillAyd instead of:

# create the Dtype
_dtypes = {}
for dtype in ['int8', 'int16', 'int32', 'int64',
              'uint8', 'uint16', 'uint32', 'uint64']:

    if dtype.startswith('u'):
        name = "U{}".format(dtype[1:].capitalize())
    else:
        name = dtype.capitalize()
    classname = "{}Dtype".format(name)
    numpy_dtype = getattr(np, dtype)
    attributes_dict = {'type': numpy_dtype,
                       'name': name,
                       '__doc__': _dtype_docstring.format(dtype=dtype)}
    dtype_type = register_extension_dtype(
        type(classname, (_IntegerDtype, ), attributes_dict)
    )
    setattr(module, classname, dtype_type)

    _dtypes[dtype] = dtype_type()

We can do this 👇

Int8Dtype = register_extension_dtype(
    type('Int8Dtype', (_IntegerDtype, ), {
        'type': getattr(np, 'int8'),
        'name': 'Int8',
        '__doc__': _dtype_docstring.format(dtype='int8')
    })
)

But we will have to do this 8 times for all integer types.
Writing integer types like this will fix type errors in pandas.core.api and will not introduce any new errors in pandas.core.arrays.integer.
Should I do it like this for all int types and open a PR?

Sorry I was busy for last few days and forgot about this.

@WillAyd
Copy link
Member

WillAyd commented May 10, 2019

Thanks @vaibhavhrt - yes I think that would be fine. More verbose but would enable static analysis on those objects so I think worth the tradeoff

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants