Add HPyUnicode_FromFormat(v) and HPyErr_Format. #405

Merged: 6 commits, Feb 22, 2023

steve-s commented Feb 13, 2023

Git log message:

    Implement API helper functions HPyUnicode_FromFormat(V) and HPyErr_Format
    
    The HPy implementation was taken from CPython, but was adapted to operate
    only with the utf-8 encoding, because that is how HPy exposes Python strings
    (i.e., HPyUnicode_AsUTF8AndSize). CPython starts with UCS1 and widens to
    UCS2 or UCS4 depending on the largest character encountered so far. We keep
    everything in utf-8, or in fact in ascii as long as we do not encounter any of
    the Python-specific formatting units such as %S or %V. The only place where
    we actually need special handling for wider characters and surrogate pairs
    is with the width and precision flags for Python-specific formatting units.
    
    The contract is a bit stricter than CPython's. For example, width/precision
    are not just ignored for %c and %p, but cause an error. NULL and HPy_NULL
    values are not just asserted, but cause a Python-level system error. From a
    quick check of the top 4000 packages, it seems that this should not cause
    any compatibility issues.
    
    We chose not to expose HPyErr_FormatV yet, because it does not seem necessary:
    there are no usages in the top 4000 packages.
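
To illustrate why width and precision need care in a utf-8 buffer: for the Python-specific units they count characters, while the buffer is indexed in bytes. The helper below is a minimal sketch of that conversion, not the code from this PR. The name `utf8_prefix_bytes` is hypothetical, and the function assumes its input is valid, NUL-terminated utf-8.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of HPy).
 * Returns how many bytes of the valid, NUL-terminated utf-8 string `s`
 * are needed to hold its first `max_chars` code points, or the byte
 * length of the whole string if it has fewer code points than that. */
static size_t utf8_prefix_bytes(const char *s, size_t max_chars)
{
    size_t i = 0;      /* byte offset into s */
    size_t chars = 0;  /* code points consumed so far */

    while (s[i] != '\0' && chars < max_chars) {
        i++;  /* consume the lead byte of the next code point */
        /* consume its continuation bytes, which look like 10xxxxxx */
        while (((unsigned char)s[i] & 0xC0) == 0x80)
            i++;
        chars++;
    }
    return i;
}
```

With a precision of 2, the string "héllo" (where "é" takes two bytes) is cut after 3 bytes, not 2. Cutting at byte 2 would split the character in half.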

mattip commented Feb 21, 2023

I wonder what this means for other implementations. Did you already implement this in GraalPython?

fangerer commented Feb 21, 2023

> I wonder what this means for other implementations. Did you already implement this in GraalPython?

There is nothing to implement on GraalPy or on PyPy. The formatting code will live in the user's extension (like HPyArg_Parse and friends). The ABI functions it relies on (e.g. HPy_Str) already exist.
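
As a toy model of that point (a sketch only; `ToyContext`, `toy_format`, and `demo_str_of` are hypothetical names, not the real HPy types or functions): a helper compiled into the extension needs nothing from the runtime beyond the function table it is already handed, so a runtime that fills in the table supports the helper without any new code.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a context object: a table of functions supplied by
 * the Python implementation. These are NOT the real HPy definitions. */
typedef struct ToyContext ToyContext;
struct ToyContext {
    /* plays the role of an already existing ABI entry such as HPy_Str */
    const char *(*str_of)(ToyContext *ctx, int handle);
};

/* Helper that would be compiled into the extension itself. It writes
 * prefix followed by str(handle) into buf, using only the table above.
 * Returns 0 on success, -1 if the result does not fit into buf. */
static int toy_format(ToyContext *ctx, char *buf, size_t size,
                      const char *prefix, int handle)
{
    const char *s = ctx->str_of(ctx, handle);
    int n = snprintf(buf, size, "%s%s", prefix, s);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* One possible runtime-side implementation of the table entry. */
static const char *demo_str_of(ToyContext *ctx, int handle)
{
    (void)ctx;  /* unused in this toy implementation */
    return handle == 1 ? "spam" : "<unknown>";
}
```

Swapping `demo_str_of` for another implementation of the same entry leaves `toy_format` untouched, which is the sense in which alternative runtimes have nothing to implement here.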

@mattip mattip merged commit aa49881 into hpyproject:master Feb 22, 2023
mattip commented Feb 22, 2023

Thanks @steve-s
