Unexpected "error: Slice index must be an integer or None" #2410

Open
sametmax opened this issue Nov 7, 2016 · 33 comments
Labels
bug (mypy got something wrong), false-positive (mypy gave an error on correct code), feature, needs discussion, priority-0-high

Comments

@sametmax

sametmax commented Nov 7, 2016

Despite having:

    class g():
        ...
        def __getitem__(self, index):
            # type: (Union[int, slice, Callable]) -> Union[IterableWrapper, Any]

Running mypy triggers "error: Slice index must be an integer or None" for g()[lambda x: x > 4:].

@elazarg
Contributor

elazarg commented Nov 7, 2016

You did not pass a callable to __getitem__. You pass a slice to __getitem__, but this slice is built with a lambda instead of an integer. You do not control the signature of slice.__init__.
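
A quick way to see this at runtime is a class whose `__getitem__` simply returns its argument. The `Probe` class and `predicate` function below are hypothetical names used only for illustration; this is a minimal sketch, not code from the original report:

```python
class Probe:
    """Hypothetical helper: returns whatever index object it receives."""

    def __getitem__(self, index):
        return index


def predicate(x):
    return x > 4


received = Probe()[predicate:]

# The subscript syntax builds a slice; the callable ends up in its `start` field.
print(type(received).__name__)       # slice
print(received.start is predicate)   # True
print(received.stop, received.step)  # None None
```

So the annotation on `__getitem__` is satisfied (the argument is a `slice`); the complaint is about what the slice contains.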

@sametmax
Author

sametmax commented Nov 7, 2016

Oh, right, so we need to update the typeshed for slice? Indeed, I can't imagine other people won't have this problem. NumPy comes to mind, since it uses tuples containing slices to mimic multi-dimensional slicing.

@elazarg
Contributor

elazarg commented Nov 7, 2016

I don't know. Passing a callable to slice instead of an integer is most likely an error:

elems[:x.count]  # oops, should be x.count()

@gvanrossum
Member

gvanrossum commented Nov 7, 2016 via email

@elazarg
Contributor

elazarg commented Nov 7, 2016

Sounds good to me. It might make existing annotations less safe in the short term, though (since slice will mean slice[Any]).

@JukkaL
Collaborator

JukkaL commented Nov 7, 2016

I think that somebody proposed making slice generic some time ago, but it was rejected because we didn't have a use case.

@elazarg
Contributor

elazarg commented Nov 7, 2016

Found it: python/typing#159

@elazarg
Contributor

elazarg commented Nov 7, 2016

Perhaps we should have Slice[T], and slice may be an alias for Slice[int].

@JukkaL
Collaborator

JukkaL commented Nov 7, 2016

That sounds reasonable to me, as potentially 90+% of slice objects have integer indices, and it would also be backward compatible.

@sametmax
Author

sametmax commented Nov 7, 2016

@elazarg: I'd go for your solution. Indeed, my need to pass a callable to the slice is niche, and NumPy slicing is not very common, so this allows the best of both worlds. The reason I allow callables in slices is that I am creating an object you can slice to do the equivalent of itertools.dropwhile and takewhile as a shortcut.
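
For readers wondering what such an object might look like, here is a minimal sketch. It is not the author's actual implementation: the class body and the exact semantics (start yielding once the first predicate becomes true, stop once the second one does) are assumptions made for illustration.

```python
import itertools
from typing import Any, Iterable, Iterator


class IterableWrapper:
    """Hypothetical wrapper; the slicing semantics here are an assumption."""

    def __init__(self, iterable: Iterable[Any]) -> None:
        self._iterable = iterable

    def __iter__(self) -> Iterator[Any]:
        return iter(self._iterable)

    def __getitem__(self, index: slice) -> "IterableWrapper":
        items: Iterable[Any] = self._iterable
        start, stop = index.start, index.stop
        if callable(start):
            # Skip items until `start` first returns True.
            items = itertools.dropwhile(lambda x: not start(x), items)
        if callable(stop):
            # Keep yielding items until `stop` first returns True.
            items = itertools.takewhile(lambda x: not stop(x), items)
        return IterableWrapper(items)


result = list(IterableWrapper(range(10))[lambda x: x > 4:lambda x: x > 7])
print(result)  # [5, 6, 7]
```

This runs fine, but the subscript expression is exactly the kind of code that triggers the "Slice index must be an integer or None" error discussed in this issue.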

@elazarg
Contributor

elazarg commented Nov 12, 2016

(Just as a side note, there's a discussion in python-ideas of using slice[] as a slice constructor: https://mail.python.org/pipermail/python-ideas/2016-November/043630.html)

@shoyer

shoyer commented Nov 13, 2016

I work on several projects in the scientific computing space that use slice objects with non-int arguments.

I don't like implicitly assuming that slice means a slice of integers, because this would be surprising to those of us who really want a generic slice object.

In NumPy, it's natural to pass NumPy integers into a slice, which do not subclass int (e.g., because they have fixed width), but otherwise are duck types that work almost exactly the same. The proper slice type for NumPy arrays would have elements of type SupportsIndex (i.e., values that implement __index__).

In pandas and xarray, slice objects can be used with non-integer values, e.g., for indexing along an axis with string labels. But there are no valid string strides, so instead we accept integer strides with the usual meaning. The fully specified valid slice type for a StringIndex would be something like Slice[str, str, int], where each field is presumed optional. At the very least, we need separate types for start/stop and step.
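
To make the three-parameter idea concrete, here is a rough sketch of what a `Slice[str, str, int]` annotation could express. The `Slice` dataclass and `select_labels` function are hypothetical stand-ins written for this illustration; neither is an existing `typing` or pandas API.

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

StartT = TypeVar("StartT")
StopT = TypeVar("StopT")
StepT = TypeVar("StepT")


@dataclass(frozen=True)
class Slice(Generic[StartT, StopT, StepT]):
    """Hypothetical generic slice; every field is optional, as with slice()."""

    start: Optional[StartT] = None
    stop: Optional[StopT] = None
    step: Optional[StepT] = None


def select_labels(labels: List[str], key: "Slice[str, str, int]") -> List[str]:
    """Select labels between two string bounds, with an integer stride."""
    first = 0 if key.start is None else labels.index(key.start)
    # Label-based slicing in pandas includes the stop label; mirrored here.
    last = len(labels) if key.stop is None else labels.index(key.stop) + 1
    return labels[first:last:key.step]


print(select_labels(list("ABCDEF"), Slice("C", "E")))     # ['C', 'D', 'E']
print(select_labels(list("ABCDEF"), Slice("A", "E", 2)))  # ['A', 'C', 'E']
```

The point of the separate type parameters is that start/stop are strings while the step stays an integer.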

@gvanrossum
Member

gvanrossum commented Nov 13, 2016 via email

@elazarg
Contributor

elazarg commented Nov 14, 2016

I am not entirely sure what the decision is.

Should we write the type of slice everywhere as Slice[int, int, int]?

@gvanrossum
Member

gvanrossum commented Nov 14, 2016 via email

@elazarg
Contributor

elazarg commented Nov 14, 2016

@shoyer just gave another example - pandas' Slice[str, str, int].

I think it would be nice to have Slice[T1, T2, T3], and Slice[T] as a shorthand for Slice[T, T, int]. I don't know if that's expressible.

@gvanrossum
Member

gvanrossum commented Nov 14, 2016 via email

@dckc

dckc commented Mar 5, 2017

Meanwhile, there are much more mundane ways to trigger this message:

$ cat slicetest.py
lo, hi = None, 2

print('abc'[lo:hi])

$ python slicetest.py 
ab
$ mypy slicetest.py
slicetest.py:3: error: Slice index must be an integer or None

The message says "... integer or None" but visit_slice_expr seems to allow int only.

@gvanrossum
Member

@dckc You're using --strict-optional right?

@dckc

dckc commented Mar 5, 2017

Yes, I suppose I am.

@Artimi

Artimi commented Mar 8, 2018

I wanted to add just another use case of slice having non-int arguments. In pandas you quite often have a time index, so another valid case is having e.g. datetime.date in a slice:

In [1]: import datetime

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(
   ...:  data = list(range(10)),
   ...:  index = [datetime.date.today() + datetime.timedelta(days = i) for i in range(10)]
   ...: )

In [4]: df
Out[4]: 
            0
2018-03-08  0
2018-03-09  1
2018-03-10  2
2018-03-11  3
2018-03-12  4
2018-03-13  5
2018-03-14  6
2018-03-15  7
2018-03-16  8
2018-03-17  9

In [5]: df[datetime.date(2018, 3, 12):]
Out[5]: 
            0
2018-03-12  4
2018-03-13  5
2018-03-14  6
2018-03-15  7
2018-03-16  8
2018-03-17  9
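
The same pattern can be reproduced without pandas. The sketch below uses a hypothetical `DateSeries` class (an assumption for illustration, not how pandas implements it) that maps date bounds in a slice onto integer positions with `bisect`:

```python
import bisect
import datetime
from typing import List


class DateSeries:
    """Hypothetical minimal container keyed by sorted dates (not pandas)."""

    def __init__(self, dates: List[datetime.date], values: List[int]) -> None:
        self.dates = dates
        self.values = values

    def __getitem__(self, key: slice) -> List[int]:
        # Translate date bounds into integer positions; both ends inclusive.
        lo = 0 if key.start is None else bisect.bisect_left(self.dates, key.start)
        hi = len(self.dates) if key.stop is None else bisect.bisect_right(self.dates, key.stop)
        return self.values[lo:hi]


first_day = datetime.date(2018, 3, 8)
series = DateSeries(
    [first_day + datetime.timedelta(days=i) for i in range(10)],
    list(range(10)),
)
print(series[datetime.date(2018, 3, 12):])  # [4, 5, 6, 7, 8, 9]
```

Here the slice bounds are `datetime.date` objects, which is valid Python but not an integer-or-None slice.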

@shoyer

shoyer commented Jan 6, 2019

This can also manifest itself in error messages like No overload variant of "slice" matches argument types "str", "str" if you write a slice literal like slice('a', 'b').

@eblume

eblume commented May 14, 2019

On the off chance that additional non-scientific use cases are required, I have a library that would also like to use slice syntax to represent a timeline:

https://github.com/eblume/hermes/blob/4b42872d1b87216394e30463b3c8946c2f2013c5/hermes/timespan.py#L51-L69

I am admittedly still a mypy newcomer and might be making some basic mistakes there. Notably, mypy typechecks this without error, but fails a typecheck of a call to this slice syntax in another package importing the above linked code.

@ilevkivskyi
Member

This actually triggers on something like

from typing import Any
  
x: Any
x['a':'b']

which is clearly a bug.

ilevkivskyi added the bug (mypy got something wrong) and false-positive (mypy gave an error on correct code) labels Mar 2, 2020

@alvis

alvis commented Mar 25, 2021

As @Artimi mentioned, there are many use cases in pandas where a time index is involved.
Would everyone be happy if we add the datetime type?

@ksamuel

ksamuel commented Mar 28, 2021

datetime seems limited.

A slice represents a boundary, but boundaries can be established with practically anything.

In fact, I quite like the idea of being able to pass a callable to __getitem__ mentioned by the OP: it lets you say "start taking elements when this condition arises, or stop taking elements when this one arises". You could even make a wrapper object like:


iterable = slicer(generator)  # slicer is a hypothetical wrapper
for x in iterable[lambda x: x > 4:lambda x: x > 8]:
    ...

It's much more elegant than going for itertools.dropwhile and takewhile.

I think slice() should allow library writers to do such experiments.

@BvB93
Contributor

BvB93 commented Jun 17, 2021

So this issue is now starting to crop up on a semi-regular basis in the likes of numpy, scipy and pandas,
due to use of numpy's fixed-size integer types (that do not inherit from builtins.int).

It would already be a massive improvement if the rather restrictive None | int requirement for slice could
be changed into None | SupportsIndex. Since this type of slice-signature is already supported by the builtin
sequence types, this should not lead to any loss in type safety.

In [1]: class Index:
   ...:     def __init__(self, value: int) -> None:
   ...:         self.value = value
   ...:
   ...:     def __index__(self) -> int:
   ...:         return self.value
   ...:

In [3]: a = range(10)
   ...: a[Index(0):Index(5)]
Out[3]: range(0, 5)

In [4]: b = '0123456789'
   ...: b[Index(0):Index(5)]
Out[4]: '01234'

In [5]: c = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
   ...: c[Index(0):Index(5)]
Out[5]: [0, 1, 2, 3, 4]
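
In annotation terms, the suggested bound type would look roughly like the sketch below. The `to_int` and `take` helpers are hypothetical names made up for this illustration; `typing.SupportsIndex` and `operator.index` are the standard-library pieces they rely on.

```python
import operator
from typing import List, Optional, SupportsIndex


class Index:
    def __init__(self, value: int) -> None:
        self.value = value

    def __index__(self) -> int:
        return self.value


def to_int(bound: Optional[SupportsIndex]) -> Optional[int]:
    """Convert a slice bound to a plain int, leaving None untouched."""
    return None if bound is None else operator.index(bound)


def take(
    items: List[int],
    start: Optional[SupportsIndex],
    stop: Optional[SupportsIndex],
) -> List[int]:
    # Accepts anything implementing __index__, not only builtins.int.
    return items[to_int(start):to_int(stop)]


print(take(list(range(10)), Index(2), None))  # [2, 3, 4, 5, 6, 7, 8, 9]
print(take(list(range(10)), 0, Index(5)))     # [0, 1, 2, 3, 4]
```

Plain `int` values still satisfy `SupportsIndex`, so existing integer-only callers would be unaffected.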

@emmatyping
Collaborator

I can take a look at the implications of such a suggested change. I do think that allowing SupportsIndex is a natural thing to do.

@JelleZijlstra
Member

python/typing#159 is closely related.

@eggplants
Contributor

Just # type: ignore[misc]

hauntsaninja added a commit to hauntsaninja/mypy that referenced this issue Feb 20, 2023
Helps with python#2410, as suggested by BvB93 in python#2410 (comment)

PEP 696 will be the real solution here, since it will allow us to make
slice generic
hauntsaninja added a commit that referenced this issue Mar 11, 2023
Helps with #2410, as suggested by BvB93 in
#2410 (comment)

PEP 696 will be the real solution here, since it will allow us to make
slice generic with few backward compatibility issues
@sterliakov
Contributor

If PEP 696 is accepted, slice can become generic in a type variable with default=int.

@joaoe

joaoe commented Apr 29, 2024

I think it would be nice to have Slice[T1, T2, T3], and Slice[T] as a shorthand for Slice[T, T, int]. I don't know if that's expressible.

Since this is still open, there could be two versions

  1. Slice[IndexType] which implies slice(start:IndexType|None,stop:IndexType|None,step:IndexType|None)
  2. Slice[IndexType, StepType] which implies slice(start:IndexType|None,stop:IndexType|None,step:StepType|None)

If it is not possible to overload a generic type or provide defaults to a TypeVar argument, then the second option could have a different name like SliceStep.

For numbers, IndexType and StepType would typically be the same, but other types might be different, like datetime + timedelta or as someone mentioned above str+int to slice labels in an array of strings.

Personally, I have now and then wanted slices with floats or strings, so I could slice arrays of floats or the keys of a dict. So far, though, that has been easy enough to achieve with a regular method instead of overloading the __getitem__ semantics.

@Jeitan

Jeitan commented Apr 29, 2024

I don't know if this is related, but I landed here because I encountered this same mypy error with an implicit slice (is that a thing? not sure). Boils down to a:b behaving differently than slice(a,b).
With this setup:

import numpy as np
import pandas as pd
df = pd.DataFrame(data=np.random.randn(3, 6), index=np.arange(1, 4), columns=["A", "B", "C", "D", "E", "F"])

This is valid but causes a mypy error:
tmp = df.loc[2:3, "C":"E"]
"error: Slice index must be an integer, SupportsIndex or None"

This does not create the error:
tmp = df.loc[2:3, slice("C", "E")]

This is with pandas 2.2.2 and mypy 1.10.0, so I think this must still be an issue. Doing the latter is a relatively easy fix, but here's yet another instance of a non-integer slice.
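
At runtime the two spellings produce the same index object, which suggests the difference lies only in how the type checker treats slice syntax versus a call to `slice()`. A small pandas-free check, using a hypothetical `Probe` class that just returns its index:

```python
class Probe:
    """Hypothetical helper: returns whatever index object it receives."""

    def __getitem__(self, index):
        return index


literal = Probe()[2:3, "C":"E"]
explicit = Probe()[2:3, slice("C", "E")]

# Both subscripts pass the same tuple of slice objects to __getitem__.
print(literal)              # (slice(2, 3, None), slice('C', 'E', None))
print(literal == explicit)  # True
```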
