-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow subclassing NamedTuple as a nested class #4419
Conversation
Thanks! Could you also add a direct test for local namedtuples (i.e., not a semanal test)? |
done! |
Friendly ping on this. |
I hope someone else can approve/merge this; I'd be lying if I said I understand the repercussions of moving |
@@ -671,6 +671,7 @@ def analyze_class_body(self, defn: ClassDef) -> Iterator[bool]: | |||
if self.analyze_typeddict_classdef(defn): | |||
yield False | |||
return | |||
self.setup_class_def_analysis(defn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this works, I'm not sure if this is the best change that could be made. As we can see in analyze_namedtuple_classdef
:
(referencing this TODO),
the root of the issue is that the analysis and processing of namedtuples is unconditionally global. If I understand things correctly, your change adds the node early to the symbol table so it is coerced into being a local definition by setup_class_def_analysis
. That works, but is rather brittle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is incomplete because we need to handle local definitions of NamedTuple in a few other places, but I would still argue that we should call setup_class_def_analysis
for NamedTuples in this way. Local classes are added to the symbol table in setup_class_def_analysis
, and there is no reason not to add local NamedTuples at that point as well.
This doesn't seem early to me because semantic analysis appears to me to proceed in two steps of (1) adding nodes with basic info to the symbol table and then (2) adding more complex type info. I think here is the right place to do step (1) for NamedTuples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TODO comments by @JukkaL appear to be referring to assigning node.kind
as GDEF
or LDEF
, which I think would be simple to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think this is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, Let me fix the todos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, my understanding is LDEF and GDEF for NamedTuples, TypedDicts, and Enums is a longstanding issue that is not the cause of the specific problem this PR is solving, and I am not too confident on the solution. So I won't address those.
@JukkaL Would you look at this? I think you could merge this / request changes with confidence. |
I'll have a look. This could have implications for incremental and fine-grained incremental modes. In particular, it's possible for a local named tuple to leak to the surrounding class, and this might plausibly cause trouble: class A:
def f(self) -> None:
class N(NamedTuple):
...
self.n = N(...) |
I'll be looking at what you've been doing on dmypy to gain a better understanding of this. |
I don't think dmypy has anything to do with this, does it? |
My understanding was @JukkaL was referring to dmypy when talking about
fine-grained incremental mode.
…On Fri, Jan 12, 2018 at 10:33 PM Guido van Rossum ***@***.***> wrote:
I'll be looking at what you've been doing on dmypy to gain a better
understanding of this.
I don't think dmypy has anything to do with this, does it?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#4419 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ANBOAkW008CELeQx92l08xkRWk2raKEFks5tKDIFgaJpZM4RRDyF>
.
|
Not quite -- The problem Jukka was referring to can occur with regular incremental mode. If there's a In any case, you nerd-sniped me, and I constructed the following example (with your patch applied):
from typing import NamedTuple
class C:
def __init__(self) -> None:
class A(NamedTuple):
x: int
self.a = A(1)
from a import C
reveal_type(C().a)
Without your patch, the line Note that if instead of In case you're interested, here's the traceback:
|
I'm grateful for the very detailed explanation on this! I'm reading through the related discussion (like #2855) while thinking about a solution. |
Sorry that this is so long, but I wanted to document my thought process as I debugged this. I'm planning on going with approach (1) below. The problem is local namedtuples are stored under the classe's symbol table but local classes are stored under the global symbol table, and subclasses of namedtuple under my patch have inconsistent behavior that combines both. Consider the file a.py: from typing import NamedTuple
class C:
def __init__(self) -> None:
class A(NamedTuple):
x: int
class B:
def __init__(self, x):
self.x = x
D = NamedTuple('D', [('x', int)])
self.a = A(1)
self.b = B(1)
self.d = D(1)
Inner NamedTuples defined with Whereas inner classes expect to be stored in the global symbol table because the This matters because if the node with fullname Lines 2418 to 2421 in 6fce774
Likewise, under my patch, I see three possible approaches:
I believe the reason inner classes are stored under the global symbol table but inner namedtuples/typedicts are stored in the class symbol table is that the code was written by different developers. ilevkivskyi implemented caching local classes in #2855, whereas gvanrossum implemented caching inner TypedDicts and |
Also related: #2931
|
@@ -964,10 +968,14 @@ def analyze_namedtuple_classdef(self, defn: ClassDef) -> Optional[TypeInfo]: | |||
if base_expr.fullname == 'typing.NamedTuple': | |||
node = self.lookup(defn.name, defn) | |||
if node is not None: | |||
if self.type or self.is_func_scope(): | |||
name = defn.name + '@' + str(defn.line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name has to be mangled so that, for incremental mode, it matches the name of type (but it makes sense anyway so it does not leak outside the containing scope).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is different from how we do name mangling for other kinds of types, such as enums. We should only do it in a function scope.
Bump for feedback. @JukkaL I did fix the type leaking issue by storing NamedTuples class definitions under the outer class, if it exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically reasonable, modulo some details. If the details can't be worked out cleanly, it might make sense to do some refactoring before fixing the underlying issue. It's not a problem with this PR specifically -- I'm a little worried that this adds extra complexity to code that is already very messy (and kind of broken even).
Also, it seems like TypedDict classes have the same issues, but this only addresses NamedTuple classes. This brings some asymmetry. At least a TODO comment about this would be helpful.
@@ -964,10 +968,14 @@ def analyze_namedtuple_classdef(self, defn: ClassDef) -> Optional[TypeInfo]: | |||
if base_expr.fullname == 'typing.NamedTuple': | |||
node = self.lookup(defn.name, defn) | |||
if node is not None: | |||
if self.type or self.is_func_scope(): | |||
name = defn.name + '@' + str(defn.line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is different from how we do name mangling for other kinds of types, such as enums. We should only do it in a function scope.
@@ -1046,7 +1054,11 @@ def setup_class_def_analysis(self, defn: ClassDef) -> None: | |||
local_name = defn.info._fullname + '@' + str(defn.line) | |||
defn.info._fullname = self.cur_mod_id + '.' + local_name | |||
defn.fullname = defn.info._fullname | |||
self.globals[local_name] = node | |||
if self.type and self.is_namedtuple_classdef(defn): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This special casing of named tuples here is kind ugly. Why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that, inside a class method, local namedtuples are stored under the classes's symbol table (and are expected to be stored there in other parts of the code) but local classes are stored under the global symbol table. Storing subclasses of namedtuple in the global table breaks incremental mode.
As you suggested a little refactoring might be more effective than this haphazard fix, I would suggest storing local classes under the class symbol table. Then we wouldn't need this. IIRC local TypedDicts and other special types (???) are also stored in the class symbol table, so it be most consistent to always store local classes under the local symbol table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went a little it more into the technical details in #4419 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be delighted to make that change if you agree.
[case testNamedTupleDirectSubclass] | ||
from typing import NamedTuple | ||
class A(NamedTuple): | ||
x: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: indent should be 4 spaces.
from typing import NamedTuple | ||
def foo(): | ||
class A(NamedTuple): | ||
x: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: indent should be 4 spaces.
@@ -956,6 +956,10 @@ def get_all_bases_tvars(self, defn: ClassDef, removed: List[int]) -> TypeVarList | |||
tvars.extend(base_tvars) | |||
return remove_dups(tvars) | |||
|
|||
def is_namedtuple_classdef(self, defn: ClassDef) -> bool: | |||
base_exprs = defn.base_type_exprs | |||
return any(getattr(b, 'fullname', None) == 'typing.NamedTuple' for b in base_exprs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This duplicates similar logic in analyze_namedtuple_classdef
. There should be only one implementation of looking for a typing.NamedTuple
base class.
Style nit: We avoid the use of getattr
if there's a reasonable way around it, since it can't be type checked.
So is the decision here to do the refactor first? |
This is now quite stale -- there was no response to the code review from February, and there's now (of course) a merge conflict. Should we just close it? (That would make me sad given the effort put into this, but sometimes it's the only alternative.) |
I would say there was actually a response to the code review. Look! #4419 (comment) Anyway, after getting a real job I view this whole affair as nonessential tinkering that may benefit no-one, which is probably why Jukka doesn't have time to respond. I also think expert work should be left to experts. So long, and thanks for all the fish. |
Thanks for clarifying, and sorry for misreading the issue.
|
Fixes #4349.
This 1-line change allows subclassing NamedTuple inside a function.
The current problem is that inner NamedTuples aren't added to the symbol table.
Inner classes get added in
self.setup_class_def_analysis(defn)
, so we just call that method before callinganalyze_namedtuple_classdef
, which looks up the node in the symbol table.As far as I can tell,
setup_class_def_analysis
does the work that would be done in pass 1'svisit_class_defn
, which, for inner classes, adds the defn to the local symbol table and the global symbol table, under a mangled name. That should also happen for NamedTuples.I added some semaneval unit tests for this change to verify the change in behavior, but that may not be the right place. I might want to add a test to prove that local NamedTuples need to be go in the global symbol table as well.