Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-106168: Check allocated instead of size index bounds in PyList_SET_ITEM() #111480

Merged
merged 1 commit into from
Oct 30, 2023

Conversation

scoder
Copy link
Contributor

@scoder scoder commented Oct 30, 2023

Check the index bound assertions in PyList_SET_ITEM() against [0:allocated] instead of [0:size] to re-allow valid use cases that assign within the allocated area.

First increasing the size before setting a new value seems less safe and clean than allowing access to the allocated area and adjusting the size on successful updates.

…) against [0:allocated] instead of [0:size] to re-allow valid use cases that assign within the allocated area.
@scoder
Copy link
Contributor Author

scoder commented Oct 30, 2023

I'm not sure what to make of the recommendation in the "what's new" paragraph. I specifically wouldn't recommend setting the size first, before setting the value. That seems incorrect and fragile. I'd remove it, but that'd leave us entirely without a recommendation.

@vstinner
Copy link
Member

I dislike PyList_New() + PyList_SET_ITEM() API since the list is immediately tracked by the GC and so calling gc.get_objects() can expose an invalid list object (ex: calling repr(list) can crash). See:

With this change, PyList_SetItem() and PyList_SET_ITEM() become inconsistent: PyList_SetItem() uses list->ob_size, whereas PyList_SET_ITEM() uses list->allocated.

PyList_SET_ITEM() is a strange beast: it doesn't DECREF the old value, it must only be used to initialize a list newly allocated by PyList_New(). But PyList_New(size) sets list->ob_size to size.

PyList_SetItem() is more regular: it calls DECREF on the old item.

Well, here the problem is not about designing a correct API, but to migrate Python 3.12 code to Python 3.13. Previously, PyList_SET_ITEM() could write safely in [0; allocated[ range, and in Python 3.13 it's limited to [0; ob_size[.

@vstinner
Copy link
Member

I'm not sure what to make of the recommendation in the "what's new" paragraph.

Since Python 3.13.0 final was not released, a Changelog entry is enough to document changes since 3.13 alpha1 (a NEWS.d entry added by the blurb tool).

@vstinner vstinner merged commit 940ee96 into python:main Oct 30, 2023
28 checks passed
@vstinner
Copy link
Member

I merged your change without Changelog entry, I don't think that it's needed for such subtile change. Thanks.

@vstinner
Copy link
Member

I created issue #111489: [C API] Make _PyList_FromArraySteal() function public.

@scoder scoder deleted the fix_pylist_set_item_check branch November 1, 2023 09:06
@scoder
Copy link
Contributor Author

scoder commented Nov 1, 2023

without Changelog entry, I don't think that it's needed for such subtile change

I was referring to the existing whats-new paragraph, which says "if the assertion fails, set the size first". That is no longer true.

@scoder
Copy link
Contributor Author

scoder commented Nov 1, 2023

I dislike PyList_New() + PyList_SET_ITEM() API since the list is immediately tracked by the GC and so calling gc.get_objects() can expose an invalid list object (ex: calling repr(list) can crash)

That's why I prefer not requiring users to change the size before they set the value. Doing that would bring the list in an invalid state.

With this change, PyList_SetItem() and PyList_SET_ITEM() become inconsistent: PyList_SetItem() uses list->ob_size, whereas PyList_SET_ITEM() uses list->allocated.

I don't see an inconsistency here. They serve different needs, that's why we have two different functions. As you write, PyList_SetItem() decrefs the current value and replaces it with a new one. That can only safely be done within [0:size]. PyList_SET_ITEM() writes a value to an array position that does not yet have a value. That can only safely be done within [0:allocated]. Different needs, different functions, different boundary conditions.

You could rather argue that PyList_SET_ITEM() is only really valid within [size:allocated]. Within [0:size], it would overwrite existing, owned values. However, enforcing that would probably break even more code that might just look slightly brittle but is actually working perfectly fine (because it does what you suggested, it updates the size immediately before setting the value).

Cython uses the _SET_ITEM() functions to initialise freshly created lists and tuples, but it also calls PyList_SET_ITEM() in a fast, inlined PyList_Append() implementation:
https://github.com/cython/cython/blob/master/Cython/Utility/Optimize.c#L29-L77
We use this for list comprehensions, for example, where we control the list internally. As long as the next item fits into the already allocated array, we just put it there and increase the size.

@vstinner
Copy link
Member

vstinner commented Nov 1, 2023

I was referring to the existing whats-new paragraph, which says "if the assertion fails, set the size first". That is no longer true.

Ah oops, I tried to find a What's New entry, but I failed to find it, and so made the assumption that I didn't document the change. It should be updated, you're right.

@vstinner
Copy link
Member

vstinner commented Nov 1, 2023

Ah oops, I tried to find a What's New entry, but I failed to find it, and so made the assumption that I didn't document the change. It should be updated, you're right.

I wrote PR #111618 to update What's New in Python 3.13.

FullteaR pushed a commit to FullteaR/cpython that referenced this pull request Nov 3, 2023
…st_SET_ITEM() (python#111480)

Check the index bound assertions in PyList_SET_ITEM() against [0:allocated] instead of [0:size] to re-allow valid use cases that assign within the allocated area.
aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…st_SET_ITEM() (python#111480)

Check the index bound assertions in PyList_SET_ITEM() against [0:allocated] instead of [0:size] to re-allow valid use cases that assign within the allocated area.
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…st_SET_ITEM() (python#111480)

Check the index bound assertions in PyList_SET_ITEM() against [0:allocated] instead of [0:size] to re-allow valid use cases that assign within the allocated area.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants