"z" format specifier is treated differently in unicode and bytes #104018

Closed
navytux opened this issue Apr 30, 2023 · 5 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@navytux
navytux commented Apr 30, 2023

Hello up there. I've hit a discrepancy in how the z flag is handled by % in unicode and bytes:

kirr@deca:~$ python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> '%zf' % 1                                       <--   unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'z' (0x7a) at index 1

>>> b'%zf' % 1                                      <--   bytes
b'1.000000'

>>> b'%zf' % 0.0                                    <--   +0 -> 0
b'0.000000'

>>> b'%zf' % -0.0                                   <--   -0 -> 0
b'0.000000'

>>> b'%f' % -0.0                                    <--   -0 -> -0 if run without 'z'
b'-0.000000'

In other words, '%' handles 'z' inconsistently between unicode and bytes. There is also an inconsistency with the original design: as discussed on BPO-45995, 'z' was supposed to be handled by .format and not by '%'.
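
The contrast described above can be reproduced with a small probe. This is a minimal sketch, not part of the original report: the helper name `try_format` is hypothetical, and the outcomes depend on the interpreter. The `z` option of `format()` exists only from Python 3.11 (PEP 682), and bytes accept `%z` only on builds without the fix, so no fixed output is claimed for those two cases.

```python
def try_format(operation):
    """Run a zero-argument formatting callable; return its result or the ValueError text."""
    try:
        return operation()
    except ValueError as exc:
        return "ValueError: %s" % exc


# str %-formatting: 'z' is rejected as an unsupported format character.
str_percent = try_format(lambda: '%zf' % -0.0)

# bytes %-formatting: accepted on affected builds, rejected once the fix is applied.
bytes_percent = try_format(lambda: b'%zf' % -0.0)

# format(): 'z' is supported from Python 3.11; older versions raise ValueError.
new_style = try_format(lambda: format(-0.0, 'z.1f'))

for label, outcome in [("str %", str_percent),
                       ("bytes %", bytes_percent),
                       ("format()", new_style)]:
    print(label, "->", outcome)
```

Comparing the three printed lines on a given interpreter shows whether it still has the bytes-only behavior from the report.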

'z' handling was implemented in #30049 and indeed there I see b'%z' being fully handled:

b0b836b20cb5#diff-f6d440aad34e1c4535c0d898c0197a95490766c745991caace6f64b5dd1ece51

but u'%z' being only partly handled internally without corresponding frontend parsing that bytes has:

b0b836b20cb5#diff-34c966e7876d6f8bf801dd51896327e4f68bba02cddb95fbf3963f0b2e39c38a

In my view the fix should be either a) to add '%z' handling to unicode, or b) to remove '%z' handling from bytes.

Thanks beforehand,
Kirill

  • CPython versions tested on: 3.11.2
  • Operating system and architecture: Debian GNU/Linux 12 on AMD64

/cc @belm0, @mdickinson

@navytux navytux added the type-bug An unexpected behavior, bug, or error label Apr 30, 2023
@arhadthedev arhadthedev added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Apr 30, 2023
@belm0
Contributor

belm0 commented May 1, 2023

Good catch. I think having that enabled was left over from an early incarnation of the PR, before it was decided that %-format would not be supported.

For now, I confirmed that the tests still pass after disabling case 'z' in _PyBytes_FormatEx().

The fix should include tests to confirm that the "z" format is not accepted for %-formatting.
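
A test along these lines could look like the sketch below. It is a hypothetical illustration, not the test that was added to CPython: the helper `rejects_z` and the class name `PercentFormatZTest` are made up for this example.

```python
import unittest


def rejects_z(template, value):
    """Return True if %-formatting `value` with `template` raises ValueError."""
    try:
        template % value
    except ValueError:
        return True
    return False


class PercentFormatZTest(unittest.TestCase):
    # Hypothetical test case; names are illustrative only.

    def test_str_rejects_z(self):
        self.assertTrue(rejects_z('%zf', 0.0))

    def test_bytes_rejects_z(self):
        # Expected to fail on interpreters that predate the fix.
        self.assertTrue(rejects_z(b'%zf', 0.0))

    def test_plain_f_still_works(self):
        self.assertFalse(rejects_z('%f', 0.0))
        self.assertFalse(rejects_z(b'%f', 0.0))
```

Saved as a module, this can be run with `python -m unittest`; the bytes case is the one that distinguishes fixed from unfixed builds.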

belm0 added a commit to belm0/cpython that referenced this issue May 1, 2023
…rings

PEP-0682 specified that %-formatting would not support the "z"
specifier, but it was unintentionally allowed for byte strings.

Issue: python#104018
mdickinson pushed a commit that referenced this issue May 1, 2023
…H-104033)

PEP-0682 specified that %-formatting would not support the "z" specifier,
but it was unintentionally allowed for bytes. This PR makes use of the "z"
flag an error for %-formatting in a bytestring.

Issue: #104018

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 1, 2023
…rings (pythonGH-104033)

PEP-0682 specified that %-formatting would not support the "z" specifier,
but it was unintentionally allowed for bytes. This PR makes use of the "z"
flag an error for %-formatting in a bytestring.

Issue: pythonGH-104018

---------

(cherry picked from commit 3ed8c88)

Co-authored-by: John Belmonte <[email protected]>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
@mdickinson
Member

Fixed in #104033 and fix backported to 3.11 in #104058. Thanks @navytux for the report and @belm0 for the quick fix.

mdickinson pushed a commit that referenced this issue May 1, 2023
…trings (GH-104033) (#104058)

gh-104018: disallow "z" format specifier in %-format of byte strings (GH-104033)

PEP-0682 specified that %-formatting would not support the "z" specifier,
but it was unintentionally allowed for bytes. This PR makes use of the "z"
flag an error for %-formatting in a bytestring.

Issue: GH-104018

---------

(cherry picked from commit 3ed8c88)

Co-authored-by: John Belmonte <[email protected]>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
carljm added a commit to carljm/cpython that referenced this issue May 1, 2023
* main: (463 commits)
  pythongh-104057: Fix direct invocation of test_super (python#104064)
  pythongh-87092: Expose assembler to unit tests (python#103988)
  pythongh-97696: asyncio eager tasks factory (python#102853)
  pythongh-84436: Immortalize in _PyStructSequence_InitBuiltinWithFlags() (pythongh-104054)
  pythongh-104057: Fix direct invocation of test_module (pythonGH-104059)
  pythongh-100458: Clarify Enum.__format__() change of mixed-in types in the whatsnew/3.11.rst (pythonGH-100387)
  pythongh-104018: disallow "z" format specifier in %-format of byte strings (pythonGH-104033)
  pythongh-104016: Fixed off by 1 error in f string tokenizer (python#104047)
  pythonGH-103629: Update Unpack's repr in compliance with PEP 692 (python#104048)
  pythongh-102799: replace sys.exc_info by sys.exception in inspect and traceback modules (python#104032)
  Fix typo in "expected" word in few source files (python#104034)
  pythongh-103824: fix use-after-free error in Parser/tokenizer.c (python#103993)
  pythongh-104035: Do not ignore user-defined `__{get,set}state__` in slotted frozen dataclasses (python#104041)
  pythongh-104028: Reduce object creation while calling callback function from gc (pythongh-104030)
  pythongh-104036: Fix direct invocation of test_typing (python#104037)
  pythongh-102213: Optimize the performance of `__getattr__` (pythonGH-103761)
  pythongh-103895: Improve how invalid `Exception.__notes__` are displayed (python#103897)
  Adjust expression from `==` to `!=` in alignment with the meaning of the paragraph. (pythonGH-104021)
  pythongh-88496: Fix IDLE test hang on macOS (python#104025)
  Improve int test coverage (python#104024)
  ...
@navytux
Author

navytux commented May 2, 2023

@belm0, @mdickinson, thanks for the prompt fix.

May I ask why static formatfloat() in bytesobject.c still has F_NO_NEG_0 handling? Offhand it looks like that flag bit can never reach that function, but I might be missing something. The same question applies to unicodeobject.c.

belm0 added a commit to belm0/cpython that referenced this issue May 2, 2023
@belm0
Contributor

belm0 commented May 2, 2023

> May I ask why static formatfloat() in bytesobject.c remains with F_NO_NEG_0 handling? Offhand it looks like that flag bit could never make it into that function, but I might be missing something. The same question applies to unicodeobject.c .

Thank you, please see #104107

belm0 added a commit to belm0/cpython that referenced this issue May 2, 2023
…oat()

This is a cleanup overlooked in PR python#104033.

Please skip NEWS.
@navytux
Author

navytux commented May 3, 2023

Thanks

kumaraditya303 pushed a commit that referenced this issue May 7, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 7, 2023
…oat() (pythonGH-104107)

This is a cleanup overlooked in PR pythonGH-104033.
(cherry picked from commit 69621d1)

Co-authored-by: John Belmonte <[email protected]>
kumaraditya303 pushed a commit that referenced this issue May 7, 2023
…loat() (GH-104107) (#104260)

gh-104018: remove unused format "z" handling in string formatfloat() (GH-104107)

This is a cleanup overlooked in PR GH-104033.
(cherry picked from commit 69621d1)

Co-authored-by: John Belmonte <[email protected]>
jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023