Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-102701: Fix overflow in dictobject.c #102750

Merged
merged 3 commits into from
Mar 17, 2023
Merged

Conversation

methane
Copy link
Member

@methane methane commented Mar 16, 2023

@methane methane added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) needs backport to 3.11 only security fixes labels Mar 16, 2023
@methane methane requested a review from markshannon as a code owner March 16, 2023 08:27
@markshannon
Copy link
Member

The change looks good, but it will need a test and a machine large enough to test it on.

@methane
Copy link
Member Author

methane commented Mar 17, 2023

I used GCP n2d-standard-16 (64GiB RAM):

inada.naoki@instance-1:~/cpython$ /usr/bin/time ./python -m test -vM 64g -m DictTest test_bigmem
== CPython 3.12.0a6+ (heads/main-dirty:1c9f3391b9, Mar 16 2023, 08:21:15) [GCC 12.2.0]
== Linux-5.19.0-1015-gcp-x86_64-with-glibc2.36 little-endian
== Python build: debug
== cwd: /home/inada.naoki/cpython/build/test_python_3512æ
== CPU count: 16
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 0.51 Run tests sequentially
0:00:00 load avg: 0.51 [1/1] test_bigmem
test_dict (test.test_bigmem.DictTest.test_dict) ...
 ... expected peak memory use: 53.3G
 ... process data size: 0.1G
 ... process data size: 0.5G
 ... process data size: 0.9G
 ... process data size: 1.4G
(snip)
 ... process data size: 51.7G
 ... process data size: 51.7G
 ... process data size: 41.0G
(snip)
 ... process data size: 20.0G
ok

----------------------------------------------------------------------
Ran 1 test in 123.835s

OK
test_bigmem passed in 2 min 3 sec

== Tests result: SUCCESS ==

1 test OK.

Total duration: 2 min 3 sec
Tests result: SUCCESS
106.85user 17.31system 2:04.05elapsed 100%CPU (0avgtext+0avgdata 53939612maxresident)k
0inputs+0outputs (0major+16111371minor)pagefaults 0swaps

53939612maxresident)k is 51.44GiB. So expected peak memory use: 53.3G is accurate enough.

@@ -596,7 +596,7 @@ new_keys_object(PyInterpreterState *interp, uint8_t log2_size, bool unicode)

assert(log2_size >= PyDict_LOG_MINSIZE);

usable = USABLE_FRACTION(1<<log2_size);
usable = USABLE_FRACTION((size_t)1<<log2_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am asking for my curiosity(to become familiar with cpython); How does casting the result of 1<<log2_size to size_t fix the overflow in the dict?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 is int. And int is 32bit on most platforms. So 1 << 32 will overflow.
Here is example:

$ cat x.c
#include <stdio.h>
#include <stdlib.h>

#define USABLE_FRACTION(n) (((n) << 1)/3)

int main() {
    size_t x = USABLE_FRACTION(1 << 31);
    size_t y = USABLE_FRACTION((size_t)1 << 31);
    printf("x=%zd y=%zd\n", x, y);
}

$ gcc x.c && ./a.out
x.c:7:16: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
    size_t x = USABLE_FRACTION(1 << 31);
               ^~~~~~~~~~~~~~~~~~~~~~~~
x.c:4:34: note: expanded from macro 'USABLE_FRACTION'
#define USABLE_FRACTION(n) (((n) << 1)/3)
                             ~~~ ^
1 warning generated.
x=0 y=1431655765

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@methane methane merged commit 65fb7c4 into python:main Mar 17, 2023
@methane methane deleted the fix-102701 branch March 17, 2023 13:39
@miss-islington
Copy link
Contributor

Thanks @methane for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-102777 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Mar 17, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 17, 2023
(cherry picked from commit 65fb7c4)

Co-authored-by: Inada Naoki <[email protected]>
miss-islington added a commit that referenced this pull request Mar 17, 2023
(cherry picked from commit 65fb7c4)

Co-authored-by: Inada Naoki <[email protected]>
carljm added a commit to carljm/cpython that referenced this pull request Mar 17, 2023
* main: (34 commits)
  pythongh-102701: Fix overflow in dictobject.c (pythonGH-102750)
  pythonGH-78530: add support for generators in `asyncio.wait` (python#102761)
  Increase stack reserve size for Windows debug builds to avoid test crashes (pythonGH-102764)
  pythongh-102755: Add PyErr_DisplayException(exc) (python#102756)
  Fix outdated note about 'int' rounding or truncating (python#102736)
  pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (python#102760)
  pythongh-99726: Improves correctness of stat results for Windows, and uses faster API when available (pythonGH-102149)
  pythongh-102192: remove redundant exception fields from ssl module socket (python#102466)
  pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (python#102743)
  pythongh-102737: Un-ignore ceval.c in the CI globals check (pythongh-102745)
  pythonGH-102748: remove legacy support for generator based coroutines from `asyncio.iscoroutine` (python#102749)
  pythongh-102721: Improve coverage of `_collections_abc._CallableGenericAlias` (python#102722)
  pythonGH-102653: Make recipe docstring show the correct distribution (python#102742)
  Add comments to `{typing,_collections_abc}._type_repr` about each other (python#102752)
  pythongh-102594: PyErr_SetObject adds note to exception raised on normalization error (python#102675)
  pythongh-94440: Fix issue of ProcessPoolExecutor shutdown hanging (python#94468)
  pythonGH-100112:  avoid using iterable coroutines in asyncio internally (python#100128)
  pythongh-102690: Use Edge as fallback in webbrowser instead of IE (python#102691)
  pythongh-102660: Fix Refleaks in import.c (python#102744)
  pythongh-102738: remove from cases generator the code related to register instructions (python#102739)
  ...
Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this pull request Mar 27, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants