Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-95781: More strict format string checking in PyUnicode_FromFormatV() #95784

Conversation

serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Aug 8, 2022

An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.

…ormatV()

An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
@malemburg
Copy link
Member

Hi Serhiy, I believe a ValueError or a RuntimeError would be more appropriate. SystemErrors are reserved for CPython internal errors, but the API can also be used in 3rd party extensions. Other than that +1 on the PR.

@serhiy-storchaka
Copy link
Member Author

SystemErrors are reserved for improper use of the C API (if it can be detected at all, otherwise you can get a crash or corrupted memory). These errors cannot be triggered by Python code. ValueError, etc is used if the error depends on the data.

BTW, I am going to replace some ValueErrors currently raised for invalid format string with SystemErrors. But this is a different issue.

@serhiy-storchaka serhiy-storchaka merged commit 62f0650 into python:main Aug 8, 2022
@serhiy-storchaka serhiy-storchaka deleted the capi-unicode-fromformat-invalid-format branch August 8, 2022 16:21
@malemburg
Copy link
Member

SystemErrors are reserved for improper use of the C API (if it can be detected at all, otherwise you can get a crash or corrupted memory). These errors cannot be triggered by Python code. ValueError, etc is used if the error depends on the data.

BTW, I am going to replace some ValueErrors currently raised for invalid format string with SystemErrors. But this is a different issue.

I believe you have a misunderstanding there, or the docs are out of date. Please see https://docs.python.org/3/library/exceptions.html#SystemError vs. https://docs.python.org/3/library/exceptions.html#RuntimeError

The "SystemError" is explicitly referring to CPython internals and users are requested to raise such issues with the CPython maintainers. "RuntimeError" are meant for things where CPython finds an issue with a runtime use of an API for which we don't have any other specific error.

If this reasoning has changed, the docs would need to be updated.

I do see the SystemError also used in getargs.c, which uses format strings as well, so perhaps the docs need an update to broaden them to include C API internals as well - and include 3rd party developers as contact group too.

@vstinner
Copy link
Member

vstinner commented Aug 8, 2022

The old behavior was really strange and likely a consequence of the implementation rather than a deliberate design choice. Thanks for fixing it!

I don't think that the exact exception type matters, as soon as it raises an exception. But I also prefer SystemError when the C API is misused. PyErr_BadInternalCall() is a good example of that.

@serhiy-storchaka
Copy link
Member Author

Thanks for pointing to the documentation. I think it should be updated, because it clearly does not match the current practice of using SystemError. Many C API functions raise SystemError when called with improper arguments (like NULL, or negative size, or wrong type in the concrete API). It is even often explicitly documented. I cannot recall any API call which raises RuntimeError for bad arguments.

@serhiy-storchaka
Copy link
Member Author

Opened #95799 for updating SystemError documentation.

iritkatriel pushed a commit to iritkatriel/cpython that referenced this pull request Aug 11, 2022
…ormatV() (pythonGH-95784)

An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants