Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally allow redefinition of variable with different type #6197

Merged
merged 56 commits into from
Jan 21, 2019

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Jan 15, 2019

If the flag --allow-redefinition is used, it's now possible to redefine a
variable with a different type. For example, this code will be okay:

x = 0  # Define 'x' with type 'int'
print(x)
x = 'a'  # Define another 'x' with type 'str'
print(x)

The variable needs to be read before redefinition is possible. This is mainly
to allow these idioms:

x: List[int]
x = []  # Not a redefinition

y = None  # type: List[int]  # Strict optional disabled
y = []  # Not a redefinition

It's also okay to use a type annotation in any of the assignments. This is
fine:

x: int = 0
print(x)
x: str = 'a'
print(x)

Redefinition is only allowed to happen in the same block and the nesting level
as the original definition. This lets us keep the implementation relatively simple,
and there will hopefully be less of a readability impact, since it's less likely that
users accidentally redefine a variable or use redefinition in confusing ways. We
may want to lift the restriction in the future, however. Function arguments can
be redefined in the body but only in non-nested blocks.

The implementation does a simple static analysis to rename variables to make
them distinct when we detect redefinition. This happens before semantic
analysis pass 1.

The new behavior is not fully backward compatible with the old behavior. For
example, here we lose type context in the second assignment:

x: List[int] = []
f(x)
x = []  # Need type annotation since this creates an independent variable

Internally we use a sequence of single quotes as a variable suffix
to represent renamed variants. The implementation provides an unmangle()
function to get the original name from a potentially mangled name.

It's still impossible to redefine final names, classes and certain other things.
The main use case is simple variables, either in single or multiple assignments.

I did some benchmarking (not with the latest version though) and the performance
impact of the new pass was minor (under 2% IIRC).

Things to do in additional PRs:

  • Add documentation.
  • Make sure we unmangle names everywhere.
  • Consider allowing redefinition to happen sometimes in another block.

@JukkaL JukkaL requested a review from ilevkivskyi January 15, 2019 17:42
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this huge amount of work! This will make mypy more "pythonic" and closer to the spirit of PEP 526 (annotation is not a declaration, but just a hint). Overall PR looks very good, I have a bunch of small comments (mostly requests for documentation).

I would say it is important to add docs before this will be released. Supporting "conditional" redefinition is less important (because it may require some non-trivial work). But also note that it will solve one of the current downsides:

x: object
x
x = 42
if bool():
    x = "hi"  # error here

Also an idea about

x: List[str] = []
x
x  = []  # error here

Maybe when we detect UninhabitedType, instead of immediately giving an error we can try to re-infer type of r.h.s. using type of the "previous" x (with one prime less) as a context?

def __init__(self) -> None:
# Counter for labeling new blocks
self.block_id = 0
self.disallow_redef_depth = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment documenting this attribute?

# Counter for labeling new blocks
self.block_id = 0
self.disallow_redef_depth = 0
self.loop_depth = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is pretty clear, but some details would be helpful (is it for "continuous" nesting of loops or total, ie. including while (...): if(...): while (...): ... etc).

mypy/renaming.py Outdated
self.block_loop_depth = {} # type: Dict[int, int]
# Stack of block ids being processed.
self.blocks = [] # type: List[int]
# List of scopes; each scope maps short name to block id.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# List of scopes; each scope maps short name to block id.
# List of scopes; each scope maps short (i.e. unsuffixed) name to block id.

mypy/renaming.py Outdated
# List of scopes; each scope maps short name to block id.
self.var_blocks = [] # type: List[Dict[str, int]]

# References to variables the we may need to rename. List of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# References to variables the we may need to rename. List of
# References to variables that we may need to rename. List of


for arg in fdef.arguments:
name = arg.variable.name()
can_be_redefined = name != 'self' # TODO: Proper check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about 'cls'?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We special case self since it can define new attributes. cls doesn't have this property, so it doesn't need to be special cased. Added comment.

@@ -1,7 +1,7 @@
"""Shared definitions used by different parts of semantic analysis."""

from abc import abstractmethod, abstractproperty
from typing import Optional, List, Callable
from typing import Optional, List, Callable, Dict, Set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from typing import Optional, List, Callable, Dict, Set
from typing import Optional, List, Callable

mypy/semanal.py Outdated
@@ -1750,11 +1752,12 @@ def add_type_alias_deps(self, aliases_used: Iterable[str],
self.cur_mod_node.alias_deps[target].update(aliases_used)

def visit_assignment_stmt(self, s: AssignmentStmt) -> None:
self.unwrap_final(s)
is_final = self.unwrap_final(s)
s.is_final_def = is_final
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_final only used once. I would just write s.is_final_def = self.unwrap_final(s).

mypy/semanal.py Outdated
@@ -2127,24 +2135,52 @@ def analyze_name_lvalue(self,
# Since the is_new_def flag is set, this must have been analyzed
# already in the first pass and added to the symbol table.
# An exception is typing module with incomplete test fixtures.
if (is_final
and unmangle(lval.name) + "'" in self.globals
and unmangle(lval.name) + "'" != lval.name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit mysterious. A comment explaining why name with two primes or more are "bad".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored and added a comment.

x = 2 # E: Cannot assign to final name "x"
def f() -> int:
global x
x = 3 # E: Cannot assign to final name "x"
x = 3 # No error here is okay since we reported an error above
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one removes the assignment above, will this become an error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Added a test case.

_ = 0
_ = ''
x, _ = 0, C()
[builtins fixtures/tuple.pyi]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add test cases for

e = 1
e
try:
    ...
except Exception as e:
    ...

and

with open('file') as f:  # IO[str]
    ...
with open('file', 'b') as f:  # IO[bytes]
    ...

(maybe plus some variations).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test cases.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 18, 2019

I think that I've now addressed all the feedback.

Agreed that this will need to be documented by the next release (or we should hide the option).

Maybe when we detect UninhabitedType, instead of immediately giving an error we can try to re-infer type of r.h.s. using type of the "previous" x (with one prime less) as a context?

I thought about this but I couldn't come up with a simple implementation, so I decided to postpone this. Calculating the previous 'x' is more complicated, since the rules for renaming are different for globals and locals, for example. This is a potential future improvement. I'll create a follow-up issue about this.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! LGTM.

Please don't forget to create follow-up issues for the things we discussed: allowing more (conditional) redefinitions, problem with empty containers and using "previous" type, adding docs. (You also mentioned "Make sure we unmangle names everywhere" but I am not sure what is missing in this PR).

@neighthan
Copy link

Is there a way to only allow redefinition if an explicit type is given? So that

x = 0
print(x)
x: str = 'a'
print(x)

is valid but

x = 0
print(x)
x = 'a'
print(x)

is not? This would prevent any accidental redefinition and make it clear that the type is supposed to be changing (in the example above it's clear anyway, but if we had x = some_func() it might not be obvious anymore).

@oakkitten
Copy link

this doesn't seem to work for def:

from typing import Callable, cast

foo: str = "a"
foo: int = cast(int, foo)

bar: Callable[[], int] = lambda: 1
bar: int = cast(int, bar)

def baz() -> int:
    return 1
baz: int = cast(int, baz)

11: error: Name 'baz' already defined on line 9

any workarounds?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants