Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do about Undefined #81

Closed
gvanrossum opened this issue Apr 13, 2015 · 11 comments
Closed

What to do about Undefined #81

gvanrossum opened this issue Apr 13, 2015 · 11 comments

Comments

@gvanrossum
Copy link
Member

There are many reasons to dislike Undefined. Unfortunately the alternatives have issues. We need to decide.

@gvanrossum
Copy link
Member Author

Jukka, Andrey and myself reviewed some uses of Undefined in Jukka's code library, and we couldn't find any particularly compelling examples. We're going to get rid of Undefined now. We can look for a replacement (e.g. # var <name>: <type> comments) later. As an immediate (if ugly) workaround, it looks like this:

x = Undefined(T)  # where T is a complex type

can usually be spelled effectively as follows:

x = cast(T, None)  # type: T

@gvanrossum
Copy link
Member Author

OK, killed it in the pep-0484.txt draft.

@JukkaL
Copy link
Contributor

JukkaL commented Apr 14, 2015

Should this be valid as well (we discussed it as an option):

x = None  # type: int     # Type of x is just int, not Optional[int]

This matches the way a lot of people write Python right now, and has the benefit of not needing to import anything from typing.

@gvanrossum
Copy link
Member Author

I like that, though we must be careful to explain the difference between using x: T = None as an argument, where the type becomes Optional[T], vs. using x = None # type: T where the type becomes T.

And please think about what it might become in 3.6 when we replace it with var x: T = None again...

@ambv
Copy link
Contributor

ambv commented Apr 18, 2015

@JukkaL, @gvanrossum: sorry, reopening for the following reasons.

I would personally argue that Jukka's last suggestion is dangerous because:

  • it's inconsistent with the PEP's statement that "When using a type comment, the type checker should still verify that the inferred type is consistent with the stated type."
  • programmers using the pattern to assign x = None when they mean only an int are trading UnboundLocalErrors for TypeErrors or AttributeErrors on NoneType further in the code; I don't find this a valid use case
  • a more wordy explanation of the issue here:
    • yes, x is temporarily None and we annotate it as just an Int
    • this is OK as long as this name is set to an Int before it takes part in any operations or calls that expect an Int
    • if the routine returns just an Int, it's OK for that name to be returned as long as it's given an Int value before-hand (side note observation: return x or 0 would be correct)
  • workarounds:
    • if somebody means just an int, initializing to an int value is an option
    • if somebody means int and None, stating # type: Optional[int] is closer to truth at execution time
    • if somebody means int and None, casting is also an option
  • as Guido points out, it's inconsistent with x: T = Noneand the future var x: T = None

One valid way to solve this in a generic way would be to treat those types as optional and to support type promotion (e.g. "we know this is always an int after this assignment") in the type checker. I think Mypy already supports a form of this. Am I right @JukkaL?

If Jukka's suggestion is to be documented in the PEP, I'd like a strong example why this is necessary.

@ambv ambv reopened this Apr 18, 2015
@refi64
Copy link

refi64 commented Apr 18, 2015

Honestly, I'd like:

x = None # type: forward int

Another idea could be:

x = None # type: unknown
# that says that the type will be inferred later on
x = 7 # now x's type is int

@ilevkivskyi
Copy link
Member

If you are thinking about adding
var x : int
in the future, then maybe one could choose
# var x : int
for the undefined variables now.

@ambv
Copy link
Contributor

ambv commented Apr 20, 2015

This is what I originally proposed, @ilevkivskyi. It was dismissed on the basis that:

  • it's visually much different from the other # type : ... comments
  • the var syntax might not be what we'll end up choosing for Python 3.6 and later
  • it looks like a line that does something but commented out while "debugging"
  • in Python 3.6 it might be a line that you want to comment out while "debugging"

@ilevkivskyi
Copy link
Member

These are all good objections, @ambv. Anyway, I think that the possibility to declare a variable is not very crucial. Probably, type hints for defined variables (including in with and for) are sufficient in 95% of cases. And I agree that x = None # type: int will give more problems than benefits.

@JukkaL
Copy link
Contributor

JukkaL commented May 3, 2015

I see two issues that we should resolve if we don't support the x = None # type: int special case idea.

  1. variables in stubs

How will we define a variable with an arbitrary type in a stub? Consider this stub:

from typing import IO

stream = ??? # type: IO[str]

How would we declare stream? Here are some options:

# ugly but works:
stream = cast(IO[str], None) 

# similar to above:
stream = cast(IO[str], ...) 

# literal ellipsis would be similar to default arg values, but this needs a PEP
# change since this could be flagged as an error by type checkers:
stream = ... # type: IO[str]

These alternatives aren't really reasonable, in my opinion:

stream = IO() # type: IO[str]   # error, IO is an ABC

stream = open('')   # ouch, type might be more precise or less precise than IO[str],
     # depending on the signature of the open built-in and level of type inference
     # for example, it could be IO[Any] or _io.TextIOWrapper

def _dummy() -> IO[str]: pass
stream = _dummy()  # ugly and verbose
  1. existing code that uses None to mean no value

It's a fairly common idiom to use None as a missing/uninitialized value for an attribute. Often code does not check for a None value when accessing the attribute. It's not obvious what's the best way of migrating code like that to static typing. We should pick at least one recommended approach and perhaps mention it in the PEP. For example, consider this class:

class Connnection:
    sock = None # type: socket.socket

    def __init__(self, addr: str) -> None:
        self.addr = addr

    def connect(self) -> None:
        self.sock = ...
        ...

    def read(self) -> bytes:
        return self.sock.recv(100)

Now a type checker may complain about the assignent to sock in the class body. Alternatively, the type of sock will be Optional[int], and a type checker will complain about the body of read. There are a few ways to fix this. First, we can cast the None initializer to socket.socket:

class Connnection:
    sock = cast(socket.socket, None)
    ... # rest as above

Second, we can add an explicit check to read:

    def read(self) -> bytes:
        assert self.sock
        return self.sock.recv(100)

Third, we can remove the assignment from the class body:

class Connnection:
    def __init__(self, addr: str) -> None:
        self.addr = addr

    def connect(self) -> None:
        self.sock = ...    # infer type from initializer (optionally add # type:)
        ...

Fourth, we can infer type Optional[int] for sock (or just int) but not complain about anything (this is what @vlasovskikh called 'weak unions', I think). This what mypy current does, but I don't really like it since this results in missing real bugs that can be difficult to catch otherwise.

The third option is not safe, since some external code may check if the value of the sock attribute is None and after the change it would raise an AttributeError. The second option is also not 100% safe, since if sock is None, read() raises an AssertionError instead of an AttributeError, and external code could plausibly catch AttributeError and changing the exception type could break some code (though this is unlikely in this case). The first option is safe but ugly, since we need to use a cast.

I don't have a strong opinion about this. Maybe all of these should be okay, depending on the context?

@gvanrossum
Copy link
Member Author

For (1) let's do ...:

stream = ... # type: IO[str]

So yes let's add this to the PEP.

For (2) I don't think the PEP needs to take a position (we don't have to provide guidance for every single idiom in existing Python code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants