Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime checkable protocols potential raising error with custom getattribute #105134

Closed
hmc-cs-mdrissi opened this issue May 31, 2023 · 12 comments
Closed
Assignees
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes stdlib Python modules in the Lib dir topic-typing type-bug An unexpected behavior, bug, or error

Comments

@hmc-cs-mdrissi
Copy link
Contributor

Bug report

I'm unsure if this is more tensorflow bug or python bug but it does trace to this pr and error message traces to cpython source. I've also commented on issue in tensorflow side here.

# typing 3.9/3.10 produces no error, typing_extensions will trigger it.
from typing_extensions import Protocol, runtime_checkable

from tensorflow.python.trackable.data_structures import _DictWrapper


@runtime_checkable
class Foo(Protocol):
    def bar(self) -> None:
        ...


isinstance(_DictWrapper(), Foo)

With typing extensions an error message is produced of,

Traceback (most recent call last):
  File "/Users/pa-loaner/Snapchat/Dev/training-platform/scratch/runtime_protocol_custom_dict.py", line 12, in <module>
    isinstance(_DictWrapper(), Foo)
  File "/Users/pa-loaner/Snapchat/Dev/.venvs/tf212/lib/python3.9/site-packages/typing_extensions.py", line 604, in __instancecheck__
    val = inspect.getattr_static(instance, attr)
  File "/Users/pa-loaner/.pyenv/versions/3.9.16/lib/python3.9/inspect.py", line 1711, in getattr_static
    instance_result = _check_instance(obj, attr)
  File "/Users/pa-loaner/.pyenv/versions/3.9.16/lib/python3.9/inspect.py", line 1654, in _check_instance
    instance_dict = object.__getattribute__(obj, "__dict__")
TypeError: this __dict__ descriptor does not support '_DictWrapper' objects

Your environment

I've tested on cpython 3.9.16 and 3.10.10 where using typing works. I'd suspect 3.11 will work as well. Using typing_extensions 4.6 triggers error with inspect.getattr_static usage. Reported here instead of typing extensions as adding getattr static there was backport from here. I'd expect this to reproduce on 3.12 although installing tf on 3.12 is likely tricky and easier to test with typing extensions. The _DictWrapper class is here and is like a subclass of __dict__ with a custom getattribute to do some extra tracking of elements.

I used latest version of tensorflow (2.12). The underlying error message comes from this line. As this can be triggered in attribute access unsure whether it should be TypeError vs AttributeError. If it was attribute error getattr static would catch it and be fine. If it needs to be type error then it's unclear to me whether tensorflow class is breaking some assumption or if getattr static should also catch type errors here. In particular it's interesting that object.__getattribute__ can raise TypeError and inspect.getattr_static can fail.

I'll try to look tomorrow if I can make a more self contained example that doesn't require tensorflow.

@hmc-cs-mdrissi hmc-cs-mdrissi added the type-bug An unexpected behavior, bug, or error label May 31, 2023
@AlexWaygood
Copy link
Member

Thanks! Looks like the root cause here is probably a bug in inspect.getattr_static rather than typing. That does pose interesting questions for the typing_extensions backport, though.

@AlexWaygood AlexWaygood added stdlib Python modules in the Lib dir 3.12 bugs and security fixes 3.13 bugs and security fixes labels May 31, 2023
@AlexWaygood AlexWaygood self-assigned this May 31, 2023
@hmc-cs-mdrissi
Copy link
Contributor Author

hmc-cs-mdrissi commented May 31, 2023

One thing I noticed from exploring DictWrapper is it strangely satisfies

d = _DictWrapper()
assert isinstance(d, dict) # Passes
assert issubclass(_DictWrapper, dict) # Fails

This looks to come from library wrapt being used to make DictWrapper behave like a dict. It relies mainly on ObjectProxy for this. My rough gut is there's likely a reproducing example only using wrapt/ObjectProxy without needing tensorflow. It feels like it partially proxies dict but not enough here.

@AlexWaygood
Copy link
Member

Yes, so here's a repro that doesn't involve typing, just inspect.getattr_static (using Python 3.11, since tensorflow doesn't install using CPython main):

>>> import inspect
>>> from tensorflow.python.trackable.data_structures import _DictWrapper
>>> inspect.getattr_static(_DictWrapper(), 'bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\alexw\AppData\Local\Programs\Python\Python311\Lib\inspect.py", line 1825, in getattr_static
    instance_result = _check_instance(obj, attr)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\alexw\AppData\Local\Programs\Python\Python311\Lib\inspect.py", line 1772, in _check_instance
    instance_dict = object.__getattribute__(obj, "__dict__")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: this __dict__ descriptor does not support '_DictWrapper' objects

I think there's probably bugs in CPython and [either tensorflow or wrapt] here. The _DictWrapper class must be pretty weird if object.__getattribute__ raises TypeError. But also, inspect.getattr_static should be resilient to this kind of edge case: it should probably never raise TypeError.

@AlexWaygood
Copy link
Member

Interesting! It looks like inspect.getattr_static is buggy on <=3.11, but not on 3.12+. Here's a repro on 3.11 that just involves wrapt (no tensorflow code):

>>> import inspect, wrapt
>>> class Foo: pass
...
>>> class Bar(Foo, wrapt.ObjectProxy): pass
...
>>> inspect.getattr_static(Bar({}), 'bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\alexw\AppData\Local\Programs\Python\Python311\Lib\inspect.py", line 1825, in getattr_static
    instance_result = _check_instance(obj, attr)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\alexw\AppData\Local\Programs\Python\Python311\Lib\inspect.py", line 1772, in _check_instance
    instance_dict = object.__getattribute__(obj, "__dict__")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: this __dict__ descriptor does not support 'Bar' objects

But on main, it works as expected:

>>> import inspect, wrapt
>>> class Foo: pass
...
>>> class Bar(Foo, wrapt.ObjectProxy): pass
...
>>> inspect.getattr_static(Bar({}), 'bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\alexw\coding\cpython\Lib\inspect.py", line 1867, in getattr_static
    raise AttributeError(attr)
AttributeError: bar

Lots of changes have been made to inspect.getattr_static in Python 3.12 (in the name of improving performance -- see #103193), so it's entirely possible that one of these changes "accidentally" fixed this bug.

@AlexWaygood
Copy link
Member

It reproduces with the wrapt C extension, but not without it. Which might be why it reproduces on 3.11, but not main...

@AlexWaygood
Copy link
Member

AlexWaygood commented May 31, 2023

Here's a repro that doesn't involve tensorflow or wrapt: https://github.com/AlexWaygood/getattr-static-repro.

Unfortunately it involves a 3,000-line C extension (I've just vendored wrapt's C extension to create the reproducible example)! I doubt I'll be the best person to help narrow this down further, since I'm no good with C. Maybe @carljm or @JelleZijlstra would be able to help narrow this down further?

@AlexWaygood
Copy link
Member

@AlexWaygood
Copy link
Member

AlexWaygood commented May 31, 2023

The TypeError here is because inspect.getattr_static is doing object.__getattribute__(x, '__dict__'), and where x is an instance of wrapt.ObjectProxy, that can raise TypeError if the wrapt C extension is installed.

I think it's only possible for object.__getattribute__(x, '__dict__') to raise TypeError if x has been created via a custom C extension. And I think that means that inspect.getattr_static is doing nothing wrong here, and there is only a bug in wrapt. But, very interested to hear if @carljm has any thoughts on that!

@JelleZijlstra
Copy link
Member

Haven't had a chance to investigate much, but here's the C stack trace (on main) for where the this __dict__ descriptor does not support 'Bar' objects error is raised:

    frame #3: 0x00000001000f3474 python`raise_dict_descr_error(obj=<unavailable>) at typeobject.c:2873:5 [opt]
    frame #4: 0x00000001000f3234 python`subtype_dict(obj=0x000000010317b9d0, context=<unavailable>) at typeobject.c:0 [opt]
    frame #5: 0x0000000100090e48 python`getset_get(descr=0x000000010319dd90, obj=0x000000010317b9d0, type=<unavailable>) at descrobject.c:203:16 [opt]
    frame #6: 0x00000001000d3008 python`_PyObject_GenericGetAttrWithDict(obj=0x000000010317b9d0, name=0x000000010050ba88, dict=0x0000000000000000, suppress=0) at object.c:0 [opt]
    frame #7: 0x00000001000d2e68 python`PyObject_GenericGetAttr(obj=<unavailable>, name=<unavailable>) at object.c:1515:12 [opt]
    frame #8: 0x00000001000e9c28 python`wrap_binaryfunc(self=0x000000010317b9d0, args=0x000000010317a0d0, wrapped=0x00000001000d2e54) at typeobject.c:7724:12 [opt]
    frame #9: 0x0000000100092cc0 python`wrapperdescr_raw_call(descr=0x0000000100e00290, self=0x000000010317b9d0, args=0x000000010317a0d0, kwds=0x0000000000000000) at descrobject.c:537:12 [opt]
    frame #10: 0x000000010009100c python`wrapperdescr_call(descr=0x0000000100e00290, args=0x000000010317a0d0, kwds=0x0000000000000000) at descrobject.c:574:14 [opt]
    frame #11: 0x00000001000861b8 python`_PyObject_MakeTpCall(tstate=0x00000001005751b8, callable=0x0000000100e00290, args=0x0000000100a7c1c8, nargs=<unavailable>, keywords=0x0000000000000000) at call.c:240:18 [opt]
    frame #12: 0x0000000100085f28 python`_PyObject_VectorcallTstate(tstate=0x00000001005751b8, callable=0x0000000100e00290, args=0x0000000100a7c1c8, nargsf=9223372036854775810, kwnames=0x0000000000000000) at pycore_call.h:90:16 [opt]
    frame #13: 0x0000000100086a04 python`PyObject_Vectorcall(callable=<unavailable>, args=<unavailable>, nargsf=<unavailable>, kwnames=<unavailable>) at call.c:325:12 [opt]

@carljm
Copy link
Member

carljm commented Jun 2, 2023

I have this on my list to take a look at, but it may be some time next week before I can get to it.

@carljm
Copy link
Member

carljm commented Jun 20, 2023

I looked into this a bit. Here's what's happening:

  1. Foo is a normal Python type, which has a __dict__ descriptor (it only inherits object, whose tp_dictoffset is 0, so it gets Py_TPFLAGS_MANAGED_DICT and a tp_dictoffset of -1, so its tp_getset includes __dict__, which means that its own type dict gets populated with the default descriptor for __dict__ by type_add_getset).
  2. Bar, on the other hand, inherits the builtin type WraptObjectProxy_Type (it also inherits Foo, but WraptObjectProxy_Type is the nearest base defining a C instance layout -- see best_base -- so it is used to define the layout for Bar), whose tp_dictoffset is non-zero, therefore Bar doesn't get Py_TPFLAGS_MANAGED_DICT and its own tp_dictoffset is left at 0, therefore (see type_new_set_slots) it does not get a __dict__ descriptor in its type dict.
  3. The MRO of Bar is (Bar, Foo, WraptObjectProxy_Type, object).
  4. So (in inspect.getattr_static) when we look up __dict__ in the type dictionary of Bar, we don't find it in Bar.__dict__, we go up the MRO to Foo, and we do find the default __dict__ descriptor in Foo.__dict__.
  5. When we actually try to run this descriptor's getter, get_builtin_base_with_dict finds WraptObjectProxy_Type as the nearest builtin base type with a dict, and so we try to use its __dict__ descriptor. But it doesn't have one! (No entry for __dict__ in WraptObjectProxy_getset.) Thus we get the error.

I'm not 100% sure about the logic behind get_builtin_base_with_dict; it seems a bit ad-hoc. But it's also been there a long time and would be very disruptive to change, so I'm not going to think too hard about changing it. If we accept its implementation as given, that implies that this is a bug in WraptObjectProxy: C extension types that define tp_dictoffset really need to also define a __dict__ descriptor in order to avoid this problem.

Given all that, the question remains about what inspect.getattr_static should do when it encounters a type that is broken in this way. On one hand, the whole point of inspect.getattr_static is to check for existence of an attribute "safely." But on the other hand, I think this particular TypeError pretty much always indicates a bug in the C extension type, and I don't like papering over bugs and allowing them to persist longer undetected. In this case, if we catch the TypeError the result would be inspect.getattr_static would always fail to find instance attributes on instances of such broken types; this seems even more confusing than raising the TypeError. And I think the proper definition of "safely" for inspect.getattr_static is not "can never raise" but rather "doesn't execute arbitrary Python code," and no arbitrary Python code is being executed here. Raising in case of a badly-defined type seems to me still the best option for inspect.getattr_static.

So my recommendation would be to close this issue as not a bug in Python.

@AlexWaygood
Copy link
Member

Thanks so much for the in-depth investigation @carljm! This confirms my suspicions, and I agree with your conclusion.

@AlexWaygood AlexWaygood closed this as not planned Won't fix, can't repro, duplicate, stale Jun 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes stdlib Python modules in the Lib dir topic-typing type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants