Add high level from_array function in namedarray #8281
…xarray into namedarray_from_array
for more information, see https://pre-commit.ci
Getting a little discouraged now. Since we use T_DuckArray, that type is locked to the initialized NamedArray for the lifetime of the class instance. Array operations change dtypes quite frequently within methods, which makes it hard to get the typing of the results right. I don't think this T_DuckArray strategy is possible until python/typing#548 is closed. I think I'll restart in another branch, for historical purposes, so I don't fall into the same traps too many more times.
Failing proof of concept code:
from __future__ import annotations
# from collections.abc import Hashable, Iterable, Mapping, Sequence
from typing import Any, Protocol, TypeVar, runtime_checkable, overload, Union, Generic
from typing_extensions import Self
import numpy as np
from numpy.typing import DTypeLike
# https://stackoverflow.com/questions/74633074/how-to-type-hint-a-generic-numpy-array
_T = TypeVar("_T")
_T_co = TypeVar("_T_co", covariant=True)
_DType = TypeVar("_DType", bound=np.dtype[Any])
_DType_co = TypeVar("_DType_co", covariant=True, bound=np.dtype[Any])
_ScalarType = TypeVar("_ScalarType", bound=np.generic)
_ScalarType_co = TypeVar("_ScalarType_co", bound=np.generic, covariant=True)
_IntOrUnknown = int
_Shape = tuple[_IntOrUnknown, ...]
_ShapeType = TypeVar("_ShapeType", bound=Any)
_ShapeType_co = TypeVar("_ShapeType_co", bound=Any, covariant=True)
# A protocol for anything with the dtype attribute
@runtime_checkable
class _SupportsDType(Protocol[_DType_co]):
    @property
    def dtype(self) -> _DType_co:
        ...
_DTypeLike = Union[
    np.dtype[_ScalarType], type[_ScalarType], _SupportsDType[np.dtype[_ScalarType]]
]
@runtime_checkable
class _array(Protocol[_ShapeType_co, _DType_co]):
    @property
    def dtype(self) -> _DType_co:
        ...
    # TODO: Should be -> T_DuckArray[_ScalarType]:
    @overload
    def astype(self, dtype: _DTypeLike[_ScalarType]) -> _Array[_ScalarType]:
        ...
    # TODO: Should be -> T_DuckArray[Any]:
    @overload
    def astype(self, dtype: DTypeLike) -> _Array[Any]:
        ...
_Array = _array[Any, np.dtype[_ScalarType_co]]
T_DuckArray = TypeVar("T_DuckArray", bound=_Array[np.generic], covariant=True)
class Named(Generic[T_DuckArray]):
    _data: T_DuckArray
    def __init__(self, data: T_DuckArray) -> None:
        self._data = data
    @property
    def dtype(self) -> np.dtype[Any]:
        return self._data.dtype
    @overload
    def astype(self, dtype: _DTypeLike[_ScalarType]) -> _Named[_Array[_ScalarType]]:
        ...
    @overload
    def astype(self, dtype: DTypeLike) -> _Named[_Array[Any]]:
        ...
    def astype(
        self, dtype: _DTypeLike[_ScalarType] | DTypeLike
    ) -> _Named[_Array[_ScalarType]] | _Named[_Array[Any]]:
        # Mypy keeps expecting the T_DuckArray, pyright thinks it's ok.
        return type(self)(self._data.astype(dtype))  # type: ignore[arg-type]
_Named = Named
a = np.array([2, 3, 5], dtype=np.float64)
reveal_type(
    a
)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]"
narr = Named(a)
reveal_type(
    narr
)  # note: Revealed type is "Named[numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]]"
reveal_type(
    narr.astype(np.dtype(np.int8))
)  # note: Revealed type is "Named[_array[Any, numpy.dtype[numpy.signedinteger[numpy._typing._8Bit]]]]"
reveal_type(
    narr.astype(np.int16)
)  # note: Revealed type is "Named[_array[Any, numpy.dtype[numpy.signedinteger[numpy._typing._16Bit]]]]"
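For context, the difficulty described above is purely a static-typing one. A minimal sketch, assuming a hypothetical `Wrapped` class as a stripped-down stand-in for `Named`, shows that at runtime nothing ties the result of `astype` to the original type parameter:

```python
from typing import Generic, TypeVar

import numpy as np

T = TypeVar("T")


class Wrapped(Generic[T]):
    """Hypothetical stand-in for the Named class in the proof of concept."""

    def __init__(self, data: T) -> None:
        self._data = data

    @property
    def dtype(self):
        return self._data.dtype

    def astype(self, dtype):
        # The new instance wraps an array with a different dtype; only a
        # static type checker cares that T was fixed at construction time.
        return type(self)(self._data.astype(dtype))


w = Wrapped(np.array([2, 3, 5], dtype=np.float64))
w8 = w.astype(np.int8)
print(w.dtype, w8.dtype)  # prints "float64 int8"
```

The annotations are deliberately left loose here; the hard part is expressing, in the return annotation, "the same kind of array but with another dtype".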
The idea is to avoid as much normalization in the NamedArray class as possible.
Different input types are handled before the NamedArray is initialized instead.
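As a rough illustration of that split, here is a minimal sketch and not the code in this PR: `SimpleNamed` and `from_array_sketch` are hypothetical names, and the dispatch rules (reject an already wrapped object, pass duck arrays through, convert everything else with `np.asarray`) are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any

import numpy as np


class SimpleNamed:
    """Hypothetical container: stores what it is given, no normalization."""

    def __init__(self, dims: tuple[str, ...], data: Any) -> None:
        self.dims = dims
        self.data = data


def from_array_sketch(dims: tuple[str, ...], data: Any) -> SimpleNamed:
    """Normalize `data` first, then hand it to the container."""
    if isinstance(data, SimpleNamed):
        # Assumption for this sketch: re-wrapping is treated as an error.
        raise TypeError("data is already wrapped; pass the underlying array")
    if hasattr(data, "dtype") and hasattr(data, "shape"):
        # Looks like a duck array: keep the object untouched.
        return SimpleNamed(dims, data)
    # Lists, tuples and scalars are converted to a NumPy array.
    return SimpleNamed(dims, np.asarray(data))


arr = np.arange(3)
same = from_array_sketch(("x",), arr)       # duck array is passed through
conv = from_array_sketch(("x",), [1.0, 2.0])  # list is converted
```

Keeping the branching in the factory function means the class itself can assume it always receives an array-like object.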
whats-new.rst
api.rst
References:
https://github.com/tomwhite/cubed/blob/ea885193dd37d27917a24878b51bb086aaef5fb1/cubed/core/ops.py#L34
https://stackoverflow.com/questions/74633074/how-to-type-hint-a-generic-numpy-array
https://numpy.org/doc/stable/reference/arrays.scalars.html#scalars
https://github.com/numpy/numpy/blob/040ed2dc9847265c581a342301dd87d2b518a3c2/numpy/__init__.pyi#L1423
https://github.com/numpy/numpy/blob/040ed2dc9847265c581a342301dd87d2b518a3c2/numpy/_typing/_array_like.py#L32
Mypy issues:
python/typing#548