Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PyWeakref_IsDead() to test if a weak reference is dead #48

Open
5 of 6 tasks
colesbury opened this issue Nov 14, 2024 · 9 comments
Open
5 of 6 tasks

Add PyWeakref_IsDead() to test if a weak reference is dead #48

colesbury opened this issue Nov 14, 2024 · 9 comments

Comments

@colesbury
Copy link

colesbury commented Nov 14, 2024

EDIT: Added, -1 error return case per @vstinner's suggestion.

EDIT 2: @vstinner added a vote.

I propose adding a dedicated C API function to check if a weak reference is dead:

// Returns 1 if the pointed to object is dead, 0 if it's alive, and -1 with an error set if `ref` is not a weak reference.
int PyWeakref_IsDead(PyObject *ref);

Motivation

Prior to Python 3.13, you could check if a weak reference is dead via PyWeakref_GetObject(ref) == Py_None, but that function is now deprecated. You might try writing an "is dead" check using PyWeakref_GetRef. For example:

int is_dead(PyObject *ref) {
    PyObject *tmp;
    if (PyWeakref_GetRef(&ref, &tmp) < 0) {
        return -1;
    }
    else if (tmp == NULL) {
        return 1;
    }
    Py_DECREF(tmp);
    return 0;
}

In addition to not being ergonomic, the problem with this code is that the Py_DECREF(tmp) may introduce a side effect from a calling a destructor, at least in the free threading build where some other thread may concurrently drop the last reference. Our internal _PyWeakref_IS_DEAD implementation avoids this problem, but it's not possible to reimplement that code using our existing public APIs.

This can be a problem when you need to check if a weak reference is dead within a lock, such as when cleaning up dictionaries or lists of weak references -- you don't want to execute arbitrary code via a destructor while holding the lock.

I've run into this in two C API extensions this week that are not currently thread-safe with free threading:

  • CFFI uses a dictionary that maps a string keys to unowned references. I'm working on update it to use PyWeakReference, but the "is dead" clean-up checks are more difficult due to the above issues.
  • Pandas cleans up a list of weak references. (The code is not currently thread-safe, and probably needs a lock.)

Vote

@vstinner
Copy link

Your rationale makes sense so adding int PyWeakref_IsDead(PyObject *ref) LGTM.

I would just add an error case: raise an exception (TypeError) and return -1 if the argument is not a weak reference object.

@encukou
Copy link
Collaborator

encukou commented Nov 15, 2024

An alternative is allowing int PyWeakref_GetRef(ref, NULL).
IMO, we should generally default to allowing NULL for PyObject ** output arguments, so that we can skip refcounting when the user doesn't need that result. It's very common that functions like this have other interesting effects, and it seems suboptimal to keep adding a separate “check” function some releases after the “get” function.
(Note that the output argument case is very different from accepting NULL for a PyObject * argument -- that is something to avoid, since it can easily come from a failed API call.)

Unfortunately, allowing the NULL now would mean that code tested on 3.14 will fail on 3.13. Even with that, I'd personally still slightly prefer PyWeakref_GetRef(ref, NULL).
(If we go that way, in 3.13.1+ it should fail with exception if it gets a NULL.)

@colesbury
Copy link
Author

I've updated the issue to specify a -1 return value if the argument is not a weak reference object.

@serhiy-storchaka
Copy link

On one hand, PyWeakref_IsDead may be significantly more efficient. On other hand, how often do you need to know that the reference was alive without getting its value? Note that the result can become obsolete right after obtaining it in a GIL-less build.

@colesbury
Copy link
Author

@serhiy-storchaka - you need it any time you want to implement a collection of weakrefs (things like WeakValueDictionary) and things like that are pretty common. As I wrote above, I ran into this twice in a single week separately in pandas and cffi. You can find a number of other examples if you search GitHub.

It's not just a matter of efficiency. As I wrote in the issue:

In addition to not being ergonomic, the problem with this code is that the Py_DECREF(tmp) may introduce a side effect from a calling a destructor, at least in the free threading build where some other thread may concurrently drop the last reference. Our internal _PyWeakref_IS_DEAD implementation avoids this problem, but it's not possible to reimplement that code using our existing public APIs.

This can be a problem when you need to check if a weak reference is dead within a lock, such as when cleaning up dictionaries or lists of weak references -- you don't want to execute arbitrary code via a destructor while holding the lock.

@serhiy-storchaka
Copy link

On third hand, what is dead will remain dead forever. For the purpose of weakref collections this should be enough. I support this proposition.

@vstinner
Copy link

vstinner commented Dec 2, 2024

I added a vote in the first message: please vote :-)

@serhiy-storchaka
Copy link

Can we guarantee that it does not fail if the argument is a weakref?

@vstinner
Copy link

vstinner commented Dec 2, 2024

Can we guarantee that it does not fail if the argument is a weakref?

I think that it's a reasonable assumption, yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants