[python-package] add type hints for functions accepting dtypes #5773

Merged (1 commit) on Mar 9, 2023

jameslamb
Collaborator

Contributes to #3756.
Contributes to #3867.

Fixes the following errors from mypy:

python-package/lightgbm/basic.py:309: error: Argument 2 to "_cast_numpy_array_to_dtype" has incompatible type "type"; expected "dtype[Any]"  [arg-type]
python-package/lightgbm/basic.py:314: error: Argument 2 to "_cast_numpy_array_to_dtype" has incompatible type "type"; expected "dtype[Any]"  [arg-type]
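
As a rough illustration of why mypy objects here: the messages only say the argument has type `type`, so the sketch below assumes the call sites pass a NumPy scalar class such as `np.float32`. A scalar class is not an `np.dtype` instance, even though NumPy accepts both (and dtype strings) wherever a dtype is expected at runtime. The variable names are illustrative and do not come from `basic.py`.

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int64)

scalar_class = np.float32              # a Python class, which mypy sees as "type"
dtype_instance = np.dtype(np.float32)  # an actual np.dtype object

# The scalar class is a class, not an np.dtype instance.
assert isinstance(scalar_class, type)
assert not isinstance(scalar_class, np.dtype)
assert isinstance(dtype_instance, np.dtype)

# At runtime, astype() accepts any dtype-like value and gives the same result.
a = arr.astype(scalar_class)
b = arr.astype(dtype_instance)
c = arr.astype("float32")
print(a.dtype, b.dtype, c.dtype)
```

An annotation of `np.dtype` therefore rejects values that work fine at runtime, which is the gap that a broader dtype-like annotation closes.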

@@ -240,7 +240,7 @@ def _is_numpy_column_array(data: Any) -> bool:
     return len(shape) == 2 and shape[1] == 1


-def _cast_numpy_array_to_dtype(array: np.ndarray, dtype: np.dtype) -> np.ndarray:
+def _cast_numpy_array_to_dtype(array: np.ndarray, dtype: "np.typing.DTypeLike") -> np.ndarray:
Collaborator Author

These annotations come from numpy.typing; see https://numpy.org/devdocs/reference/typing.html.

numpy.typing is new as of numpy 1.20 (January 2021).

Expressing these annotations as string literals means they are not evaluated when the function is defined, so they won't cause runtime errors for users running lightgbm with older numpy versions. See https://mypy.readthedocs.io/en/stable/runtime_troubles.html#import-cycles.
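
A minimal sketch of the string-literal behavior, using only the standard library. `np` is deliberately not imported, to stand in for an environment where `np.typing` is unavailable, and the function name `cast_quoted` is hypothetical.

```python
def cast_quoted(array, dtype: "np.typing.DTypeLike"):
    """Return the input unchanged; only the annotation matters in this sketch."""
    return array

# Defining and calling the function works even though `np` does not exist here,
# because a quoted annotation is stored as a plain string and not evaluated.
result = cast_quoted([1, 2, 3], "float32")
stored_annotation = cast_quoted.__annotations__["dtype"]
print(repr(stored_annotation))
```

Type checkers such as mypy still resolve the string to the real type, so the static check is kept while the runtime never has to look the name up.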

@@ -612,7 +616,7 @@ def _c_int_array(data):
     return (ptr_data, type_data, data) # return `data` to avoid the temporary copy is freed


-def _is_allowed_numpy_dtype(dtype) -> bool:
+def _is_allowed_numpy_dtype(dtype: type) -> bool:
Collaborator Author

Why is this one just type?

Setting it to np.typing.DTypeLike results in these mypy errors:

python-package/lightgbm/basic.py:618: error: Argument 1 to "issubclass" has incompatible type "Union[dtype[Any], None, Type[Any], _SupportsDType[dtype[Any]], str, Union[Tuple[Any, int], Tuple[Any, Union[SupportsIndex, Sequence[SupportsIndex]]], List[Any], _DTypeDict, Tuple[Any, Any]]]"; expected "type"  [arg-type]
python-package/lightgbm/basic.py:619: error: Argument 1 to "issubclass" has incompatible type "Union[dtype[Any], None, Type[Any], _SupportsDType[dtype[Any]], str, Union[Tuple[Any, int], Tuple[Any, Union[SupportsIndex, Sequence[SupportsIndex]]], List[Any], _DTypeDict, Tuple[Any, Any]]]"; expected "type"  [arg-type]

That's because the function isn't being passed a dtype; it's being passed a type (the dtype's `.type` attribute), as the calling code shows:

_check_for_bad_pandas_dtypes(data.to_frame().dtypes)

bad_pandas_dtypes = [
    f'{column_name}: {pandas_dtype}'
    for column_name, pandas_dtype in pandas_dtypes_series.items()
    if not _is_allowed_numpy_dtype(pandas_dtype.type)
]

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], dtype="Int64")
for column_name, pandas_dtype in s.to_frame().dtypes.items():
    print(pandas_dtype)
    print(type(pandas_dtype))

# Int64
# <class 'pandas.core.arrays.integer.Int64Dtype'>
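
The distinction between a dtype object and its `.type` attribute can be sketched with NumPy alone. The helper `is_allowed_scalar_type` below is hypothetical and is not LightGBM's `_is_allowed_numpy_dtype`; it only assumes, based on the mypy errors above, that the check is built on `issubclass`, which requires a class as its first argument.

```python
import numpy as np


def is_allowed_scalar_type(scalar_type: type) -> bool:
    """Hypothetical check: integer, floating, or boolean NumPy scalar class."""
    return (
        issubclass(scalar_type, (np.integer, np.floating, np.bool_))
        # np.timedelta64 subclasses np.signedinteger, so exclude it explicitly.
        and not issubclass(scalar_type, np.timedelta64)
    )


float_dtype = np.dtype("float32")

# The dtype's .type attribute is a scalar class, so issubclass() accepts it.
allowed_float = is_allowed_scalar_type(float_dtype.type)
allowed_str = is_allowed_scalar_type(np.dtype("U5").type)

# The dtype object itself is an instance, not a class, so issubclass() raises.
try:
    issubclass(float_dtype, np.floating)
    instance_rejected = False
except TypeError:
    instance_rejected = True

print(allowed_float, allowed_str, instance_rejected)
```

This is why annotating the parameter as `type` matches how the function is called, while a dtype-like annotation would admit values that `issubclass` cannot handle.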

@jameslamb jameslamb merged commit 0cff4f8 into master Mar 9, 2023
@jameslamb jameslamb deleted the ci/mypy-dtype-hints branch March 9, 2023 04:29
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023