FrameLocalsProxy is stricter than dict about what constitutes a match #120906
Comments
Interesting issue. I don't have a definitive answer here but this is something we need to deal with, because we are getting some weird issues here:

import sys
class MyString(str):
    pass
def f():
    x = 1
    local = sys._getframe().f_locals
    local[MyString('x')] = 2
    print(local.keys())
    # ['x', 'local', 'x']
    print(local)
    # {'x': 2, 'local': {...}}
f()

Internally we check if the input key is an exact unicode, but we do utilize dict for certain features ( My preferred solution is to enforce any key to be an exact unicode string. The reason for that is, unlike a generic
I'm a bit worried about the can of worms when we allow subclasses of unicode strings - what if it overwrites the |
Just to explain the context I ran into it in: The Cython compiler wraps most of its strings into an We have an It's easy enough to work around so I'm relaxed about the solution whatever you decide to do. I think what we were doing was mostly accidental and there wasn't a hidden special use-case behind it. But for this use-case I'm just reading existing values from the |
We need to discuss this further with @markshannon, but I think the desired behavior would be either
|
I just remembered that we already had that discussion when I was implementing PEP 667. The PEP suggests that we should allow arbitrary types. Then the question would be what if the user gives a unicode-like key, do we try to access the fast variables with that key? |
To avoid breaking mapping invariants, if a given key compares equal to the name of a local variable, it needs to be handled as a lookup of that variable name. Having a |
But that's not the mapping protocol, right? What if a key has a different hash, but an equal value? What if the key is

class Evil:
    def __eq__(self, other):
        return True

It's unhashable but only comparing equality will match it to an arbitrary local (the first one). Are we going to implement the full mapping interface? Checking everything as dict/mapping does? The dark corner we left here might eat us in the future. |
That's a violation of the hash protocol: objects that compare equal must have the same hash (or at least one must be unhashable). The IMO, it would also be OK to convert the argument to |
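The "at least one must be unhashable" case can be checked against a regular dict. The sketch below reuses the `Evil` class from the comment above; `lookup` is a hypothetical helper added only for illustration. It shows that defining `__eq__` without `__hash__` leaves instances unhashable, so a dict rejects the key before any equality comparison runs.

```python
class Evil:
    # Defining __eq__ without __hash__ sets __hash__ to None,
    # which makes instances unhashable.
    def __eq__(self, other):
        return True


def lookup(mapping, key):
    """Hypothetical helper: return the value, or the name of the error raised."""
    try:
        return mapping[key]
    except (TypeError, KeyError) as exc:
        return type(exc).__name__


namespace = {"x": 1, "local": 2}
print(Evil.__hash__)              # None
print(lookup(namespace, Evil()))  # TypeError
print(lookup(namespace, "x"))     # 1
```

A mapping that only compared for equality would instead match `Evil()` against the first name it checked, which is the difference being discussed.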
I don't think Looking at the way Accepting Unicode subclasses would be enough to avoid the compatibility issue that affected Cython. To accept arbitrary objects the way a general mapping does, the fallback search loop would need to use I also noticed a redundancy in the way the checks for whether or not a value is currently bound (or visible) are handled: names can't be duplicated, so we can return

static int
framelocalsproxy_getdefinedindex(_PyInterpreterFrame *frame, PyCodeObject *co, int i, bool read)
{
    /*
     * Conditionally returns the given fast locals index
     *  - if read == true, returns the index if the value is not NULL
     *  - if read == false, returns the index if the value is not hidden
     *  - otherwise returns -1
     */
    if (read) {
        if (framelocalsproxy_getval(frame, co, i) != NULL) {
            return i;
        }
    } else {
        if (!(_PyLocals_GetKind(co->co_localspluskinds, i) & CO_FAST_HIDDEN)) {
            return i;
        }
    }
    return -1;
}

A search loop accepting arbitrary keys would then look something like:

PyCodeObject *co = _PyFrame_GetCode(frame->f_frame);
// For actual Unicode keys, the key is likely interned and we can do a pointer comparison.
if (PyUnicode_CheckExact(key)) {
    for (int i = 0; i < co->co_nlocalsplus; i++) {
        PyObject *name = PyTuple_GET_ITEM(co->co_localsplusnames, i);
        if (name == key) {
            return framelocalsproxy_getdefinedindex(frame->f_frame, co, i, read);
        }
    }
}
// Fall back to an equality comparison if the interned string check fails
if (PyUnicode_Check(key)) {
    // Fast path for strings and string subclasses
    for (int i = 0; i < co->co_nlocalsplus; i++) {
        PyObject *name = PyTuple_GET_ITEM(co->co_localsplusnames, i);
        if (_PyUnicode_Equal(name, key)) {
            return framelocalsproxy_getdefinedindex(frame->f_frame, co, i, read);
        }
    }
} else {
    // Full rich comparison for other objects
    for (int i = 0; i < co->co_nlocalsplus; i++) {
        PyObject *name = PyTuple_GET_ITEM(co->co_localsplusnames, i);
        int eq_result = PyObject_RichCompareBool(name, key, Py_EQ);
        if (eq_result < 0) {
            return -2;  // Callers will need to check for results < -1 and propagate the error
        }
        if (eq_result) {
            return framelocalsproxy_getdefinedindex(frame->f_frame, co, i, read);
        }
    }
} |
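For readers who find the C hard to follow, here is a rough pure-Python model of the same three-stage search. It is only a sketch: `find_local_index`, `names`, `MyString`, and `AlwaysEqual` are hypothetical names, `names` stands in for `co_localsplusnames`, and the bound/hidden check done by `framelocalsproxy_getdefinedindex` is left out.

```python
def find_local_index(names, key):
    """Illustrative model of the proposed search; returns an index or -1."""
    # Stage 1: exact str keys are likely interned, so try identity first.
    if type(key) is str:
        for i, name in enumerate(names):
            if name is key:
                return i
    if isinstance(key, str):
        # Stage 2: str and str subclasses, compared on the string data only
        # (any __eq__ override on the subclass is ignored).
        for i, name in enumerate(names):
            if str.__eq__(name, key):
                return i
    else:
        # Stage 3: full rich comparison for any other object;
        # an exception raised by __eq__ simply propagates.
        for i, name in enumerate(names):
            if name == key:
                return i
    return -1


class MyString(str):
    pass


class AlwaysEqual:
    def __eq__(self, other):
        return True


names = ("x", "local")
print(find_local_index(names, "local"))        # 1
print(find_local_index(names, MyString("x")))  # 0
print(find_local_index(names, AlwaysEqual()))  # 0
print(find_local_index(names, 42))             # -1
```

The `AlwaysEqual` result illustrates the concern raised earlier in the thread: a search based only on equality matches such a key to the first local it examines.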
My worry about supporting a unicode subclass or an arbitrary object as a key is that it does not fit the mapping protocol. If an object overrides its None of the proposals above have this feature. So basically, we are inventing a brand-new mapping protocol specifically for That's why I liked the "unicode only" solution: it's a simple rule that works and is easy to understand. It will cause some trouble, but we are already causing trouble. We need to consider: what if a key is a unicode subclass, but has different If we can't do that, I kind of like what @encukou suggested: we convert the non-exact-unicode keys to unicode first, for both read and write. That's a golden rule that everyone can follow, and that will solve many benign cases like this very issue. |
I finally looked at the code, rather than going just by the conversation here. But I couldn't just look... If an object overrides its The hash protocol:
Keys that aren't unicode subclasses shouldn't be a problem. |
Oh sorry, I did not realize you posted in the issue. I don't think I'm getting notifications from this issue. Could you point me to the documentation about the hash protocol? I could not find it. |
See hashable in the docs:
|
The data model docs in https://docs.python.org/3/reference/datamodel.html#object.__hash__ also state "The only required property is that objects which compare equal have the same hash value". However, they go into more detail about how to define Classes that don't abide by those rules just straight up don't work properly, so the fact they won't misbehave as badly with locals proxy instances as they do with regular dicts isn't a problem we need to be concerned about. |
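As a small illustration of that rule with a regular dict, the sketch below uses two hypothetical key classes, `Name` and `BrokenName`. Both compare equal to a matching `str`, but only `Name` hashes like it.

```python
class Name:
    """Hypothetical key type that compares equal to a str and hashes like it."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, Name):
            return self.text == other.text
        if isinstance(other, str):
            return self.text == other
        return NotImplemented

    def __hash__(self):
        # Same hash as the equal str, as the hash protocol requires.
        return hash(self.text)


class BrokenName(Name):
    """Same equality, but a hash that disagrees with the equal str."""

    def __hash__(self):
        return hash((self.text, "broken"))


namespace = {"x": 1}
print(Name("x") in namespace)        # True
print(BrokenName("x") == "x")        # True
print(BrokenName("x") in namespace)  # False
```

`BrokenName` is already unusable as a key for ordinary dicts, which is why such classes are not a case the locals proxy needs to accommodate.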
This issue is all fixed now, right? |
[3.13] gh-120906: Support arbitrary hashable keys in FrameLocalsProxy (GH-122309) (#122488) Co-authored-by: Alyssa Coghlan (cherry picked from commit 5912487)
Indeed it is! |
Bug report
Bug description:
In Python 3.12 and below this prints True. In Python 3.13 this prints False. I think it comes down to the check for exact unicode: cpython/Objects/frameobject.c, line 112 in f4ddaa3
The change in behaviour isn't a huge problem, so if it's intended then I won't waste any time complaining about it, but I do think it's worth confirming that it is intended/desired.
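A check of the following shape exercises the behaviour described. This is a sketch and an assumption about the kind of test involved, not the snippet from the report; `MyString` and `check` are hypothetical names. The printed result depends on the interpreter version.

```python
import sys


class MyString(str):
    pass


def check():
    x = 1
    # f_locals is a plain dict snapshot up to 3.12 and a
    # FrameLocalsProxy from 3.13 onwards.
    return MyString("x") in sys._getframe().f_locals


result = check()
print(sys.version_info[:2], result)
```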
CPython versions tested on:
3.13
Operating systems tested on:
Linux
Linked PRs