Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Public C API for accessing code object fields #94936

Closed
Fidget-Spinner opened this issue Jul 17, 2022 · 14 comments
Closed

Public C API for accessing code object fields #94936

Fidget-Spinner opened this issue Jul 17, 2022 · 14 comments
Assignees
Labels
3.12 bugs and security fixes type-feature A feature request or enhancement

Comments

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Jul 17, 2022

A user/maintainer of a popular library documented their use case for accessing fields in the code object. In 3.11 the fields in C are gone but still available as properties in Python. Accessing these properties can be made faster by exposing their functions in the C API. The current option is PyObject_GetAttrString(code, "attr") which is much slower.

https://discuss.python.org/t/getting-the-class-name-of-a-code-frame-object-in-cpython-3-11-c-api/17396

A similar issue is #92154.

I am marking this as a release blocker @pablogsal as it would be mucho bueno if we could get this before rc1.

@Fidget-Spinner Fidget-Spinner added type-feature A feature request or enhancement release-blocker labels Jul 17, 2022
@Fidget-Spinner Fidget-Spinner self-assigned this Jul 17, 2022
@pablogsal
Copy link
Member

Mucho bueno indeed :)

I am willing to consider it but how many new APIs are we talking about? Also, we should double check and agree that exposing these fields is not going to backfire on us if we decide to change the implementation or add new stuff that will invalidate the meaning of the current fields. Maybe we can consider adding them with prefixed underscores (probably a bad idea)?

@pablogsal
Copy link
Member

@Fidget-Spinner
Copy link
Member Author

3 getter functions. I've thought about whether they will restrict our code. Thing is we already are restricted because all three are already exposed as Python properties. So we have no choice but to maintain backwards compatiblity, even without the new C API changes.

@markshannon
Copy link
Member

PyObject_GetAttrString(code, "attr") may be slower, but I doubt that PyObject_GetAttr(code, attr) is much slower given that most of the attributes of code objects are computed.

Calling PyObject_GetAttr(code, co_varnames) requires that the co_varnames tuple is created. Dispatching is not likely to be any slower than creating the tuple.

If performance is the issue, we should offer a faster API (for 3.12)
Something like:
PyObject *PyCode_GetLocalName(int n)

We should also offer the same API in Python.

@markshannon
Copy link
Member

Not only that, but co_nlocals and co_varnames don't make much sense.

def f(a, b):
     c = a + b
    def d(): 
       return c
    return d
def g(a, b):
     c = a + b
    def d(): 
       return a + b
    return d
>>> f.__code__.co_nlocals
3
>>> g.__code__.co_nlocals
4
>>> f.__code__.co_varnames
('a', 'b', 'd')
>>> g.__code__.co_varnames
('a', 'b', 'c', 'd')

@Fidget-Spinner
Copy link
Member Author

If performance is the issue, we should offer a faster API (for 3.12) Something like: PyObject *PyCode_GetLocalName(int n)

We should also offer the same API in Python.

Yeah I planned to lazily create the introspection information and cache them there (kind of like how debug information in frames are lazily created). However, that missed that for 3.11 so it should go in 3.12. Something like

struct PyCodeObject {
    ...
    PyCodeIntrospectInformation *introspect_information;
}

struct debug information {
    PyObject *co_code;
    PyObject *co_varnames;
    PyObject *co_freevars;
    ...
}

@gvanrossum
Copy link
Member

Given that the user requesting this is writing a profiler and needs access to the frame as well, why not just recommend that they include internal headers and call _PyCode_GetVarnames(), _PyCode_GetCellvars and _PyCode_GetFreevars directly? Looking at #95008 (presumably a tentative fix?) it would seem that all the required functions already exist but aren't public any more. That seems prudent since there are some concerns about the future-preparedness of these functions anyway.

In the past I think we would never have considered such a late request for new APIs a blocker.

@Fidget-Spinner
Copy link
Member Author

OK. I'll remove the blocker. However, I think we should consider adding a performant version of these APIs in 3.12.

@pablogsal pablogsal removed the 3.11 only security fixes label Jul 23, 2022
@pablogsal
Copy link
Member

After reading the discussion, I concur with Guido. I am removing the 3.11 label. This feature can only be considered from 3.12 onwards.

@vstinner
Copy link
Member

vstinner commented Aug 3, 2022

See also #91248

@Fidget-Spinner
Copy link
Member Author

Docs TODO.

@gvanrossum
Copy link
Member

Docs TODO.

What more docs do you need?

@Fidget-Spinner
Copy link
Member Author

Docs TODO.

What more docs do you need?

Woops I forgot that I already documented these. Sorry 😆 .

I'll leave this issue open for the next thing I plan to do in 3.12:

I want a better/more performant implementation for 3.12. It will cache the debug info in a lazy debug struct much like how we do it for PyInterpreterFrame and PyFrameObject now.

@vstinner
Copy link
Member

vstinner commented Aug 5, 2022

IMO the problem of migrating existing C extensions to Python 3.11 is now solved. What's New in Python 3.11:

PyCodeObject no longer has the co_code, co_varnames, co_cellvars and co_freevars fields. Instead, use PyCode_GetCode(), PyCode_GetVarnames(), PyCode_GetCellvars() and PyCode_GetFreevars() respectively to access them via the C API. (Contributed by Brandt Bucher in bpo-46841 and Ken Jin in gh-92154 and gh-94936.)

I close the issue.

I'll leave this issue open for the next thing I plan to do in 3.12

Please open a separated issue for that.

@vstinner vstinner closed this as completed Aug 5, 2022
mfrischknecht added a commit to mfrischknecht/nixpkgs that referenced this issue Dec 6, 2023
`python-qt` v. 3.3.0 wasn't compatible with Python 3.11 yet
(it contained include statements for header files that were
removed by 3.11 [1] and used a python tuple field that was
replaced through a dedicated getter function [2]).

Switching to version 3.4.2 also means we can remove the
`remove-unneeded-pydebug-include.patch`, as it has since
been merged.

[1]: https://docs.python.org/3.11/whatsnew/3.11.html

> The non-limited API files cellobject.h, classobject.h, code.h, context.h,
> funcobject.h, genobject.h and longintrepr.h have been moved to the Include/cpython
> directory. Moreover, the eval.h header file was removed. These files must not be
> included directly, as they are already included in Python.h: Include Files.
> If they have been included directly, consider including Python.h instead.

[2]: python/cpython#94936 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

6 participants