Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-1635741: Enable unicode_release_interned() without insure or valgrind. #21087

Closed

Conversation

shihai1991
Copy link
Member

@shihai1991 shihai1991 commented Jun 23, 2020

@shihai1991
Copy link
Member Author

After this PR, from sys.gettotalrefcount: 17647 to sys.gettotalrefcount: 14775.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please try to measure the important on performance of this change? You can use:

python3 -m pyperf command -- ./python -S -c pass

@@ -1270,6 +1269,7 @@ finalize_interp_types(PyThreadState *tstate, int is_main_interp)
_PyBytes_Fini();
}
_PyUnicode_Fini(tstate);
_PyList_Fini(tstate);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you clear list after unicode?

The rationale of the current code is that a list can contains slice, unicode, etc. So list must be cleared first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh the assert failed: python: Objects/listobject.c:144: PyList_New: Assertion `state->numfree != -1' failed.

the traceback:

#0  0x00007ffff6ce2207 in raise () from /lib64/libc.so.6
#1  0x00007ffff6ce38f8 in abort () from /lib64/libc.so.6
#2  0x00007ffff6cdb026 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff6cdb0d2 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000000446b7f in PyList_New (size=size@entry=1524) at ./Include/internal/pycore_pystate.h:95
#5  0x00000000004560c3 in dict_keys (mp=mp@entry=0x7ffff7fa70b0) at Objects/dictobject.c:2196
#6  0x000000000045e16f in PyDict_Keys (mp=0x7ffff7fa70b0) at Objects/dictobject.c:2758
#7  0x00000000004b1f34 in unicode_release_interned () at Objects/unicodeobject.c:15666
#8  0x00000000004def2a in _PyUnicode_Fini (tstate=tstate@entry=0xa29a60) at Objects/unicodeobject.c:16207
#9  0x0000000000531adb in finalize_interp_types (tstate=tstate@entry=0xa29a60, is_main_interp=is_main_interp@entry=1) at Python/pylifecycle.c:1272
#10 0x0000000000531b62 in finalize_interp_clear (tstate=tstate@entry=0xa29a60) at Python/pylifecycle.c:1305
#11 0x0000000000534591 in Py_FinalizeEx () at Python/pylifecycle.c:1475
#12 0x000000000041d6ef in Py_RunMain () at Modules/main.c:634
#13 0x000000000041d73f in pymain_main (args=args@entry=0x7fffffffe2c0) at Modules/main.c:662
#14 0x000000000041d7bb in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:686
#15 0x000000000041c33d in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy from the failed ci gate:
python: Objects/dictobject.c:610: free_keys_object: Assertion `state->keys_numfree != -1' failed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, maybe unicode_release_interned() must use PyDict_Next() to iterate on the dict, rather than calling PyDict_Keys() which creates a new list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will try it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line can not reverted now, because many assert failed in dictobject.c:

#0  0x00007ffff6ce2207 in raise () from /lib64/libc.so.6
#1  0x00007ffff6ce38f8 in abort () from /lib64/libc.so.6
#2  0x00007ffff6cdb026 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff6cdb0d2 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000045647d in free_keys_object (keys=keys@entry=0x7ffff7e41030) at Objects/dictobject.c:616
#5  0x000000000045c8a4 in dictkeys_decref (dk=0x7ffff7e41030) at Objects/dictobject.c:336
#6  PyDict_Clear (op=0x7ffff7fa80b0) at Objects/dictobject.c:1767
#7  0x00000000004b23c3 in unicode_release_interned () at Objects/unicodeobject.c:15708
#8  0x00000000004deeb9 in _PyUnicode_Fini (tstate=tstate@entry=0xa2a7c0) at Objects/unicodeobject.c:16204
#9  0x0000000000531a7e in finalize_interp_types (tstate=tstate@entry=0xa2a7c0, is_main_interp=is_main_interp@entry=1) at Python/pylifecycle.c:1269
#10 0x0000000000531ae6 in finalize_interp_clear (tstate=tstate@entry=0xa2a7c0) at Python/pylifecycle.c:1298
#0  0x00007ffff6ce2207 in raise () from /lib64/libc.so.6
#1  0x00007ffff6ce38f8 in abort () from /lib64/libc.so.6
#2  0x00007ffff6cdb026 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff6cdb0d2 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000000457e15 in dict_dealloc (mp=0x7ffff7fa80b0) at Objects/dictobject.c:336
#5  0x000000000046a0c9 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2211
#6  0x00000000004b22d6 in _Py_DECREF (op=<optimized out>, lineno=15709, filename=0x6a3552 "Objects/unicodeobject.c") at ./Include/object.h:449
#7  unicode_release_interned () at Objects/unicodeobject.c:15709
#8  0x00000000004ded7b in _PyUnicode_Fini (tstate=tstate@entry=0xa2a7c0) at Objects/unicodeobject.c:16204
#9  0x0000000000531942 in finalize_interp_types (tstate=tstate@entry=0xa2a7c0, is_main_interp=is_main_interp@entry=1) at Python/pylifecycle.c:1269
#10 0x00000000005319aa in finalize_interp_clear (tstate=tstate@entry=0xa2a7c0) at Python/pylifecycle.c:1298

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we remove those asserts? I don't know there are some risks or not after we remove those asserts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above traceback is about PyDict_Fini(), not PyList_Fini().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale of the current code is that a list can contains slice, unicode, etc. So list must be cleared first.

Hi, victor. I have a question about it: if we broke the clear order, there will be some resource leak in here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or _PyUnicode_Fini() before _PyDict_Fini()? _PyUnicode_Fini() calling PyDict_Clear(interned) to relase interned(no other malloc operation).
I am not sure it's ok or not.

@vstinner
Copy link
Member

Can you please try to measure the important on performance of this change? You can use:

Note: Please use a release build (./configure && make).

@shihai1991
Copy link
Member Author

shihai1991 commented Jun 23, 2020

Can you please try to measure the important on performance of this change? You can use:

Note: Please use a release build (./configure && make).

Hi, victor. Got this benchmark result.

$ python3 -m pyperf command -- ./python -S -c pass
.....................
WARNING: the benchmark result may be unstable

  • the standard deviation (1.62 ms) is 11% of the mean (14.3 ms)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python3 -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

command: Mean +- std dev: 14.3 ms +- 1.6 ms

$ python3 -m pyperf command -- ./python -S -c pass --duplicate=100
.....................
WARNING: the benchmark result may be unstable

  • the standard deviation (2.33 ms) is 15% of the mean (15.1 ms)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python3 -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

command: Mean +- std dev: 15.1 ms +- 2.3 ms

@vstinner
Copy link
Member

Hi, victor. Got this benchmark result.

What did you measure? What is the reference (unmodified master branch) and what is the modified code (with your PR)?

You can store the benchmark result in a file (-o bench.json) and then compare two files (python3 -m pyperf compare_to ref.json pr.json).

@vstinner
Copy link
Member

What is the dictionary size of the interned strings at Python exit (in unicode_release_interned())?

@vstinner
Copy link
Member

Please rebase your PR on master, I modified unicodeobject.c to make singletons per interpreter.

@shihai1991
Copy link
Member Author

Hi, victor. Got this benchmark result.

What did you measure? What is the reference (unmodified master branch) and what is the modified code (with your PR)?

You can store the benchmark result in a file (-o bench.json) and then compare two files (python3 -m pyperf compare_to ref.json pr.json).

python3 is master, ./python with tihs PR. I test it again after I fix the assert problem.

@vstinner
Copy link
Member

python3 is master, ./python with tihs PR

You tested ./python is both cases. Usually, to measure the performance of a change, you need a reference (your "python3") and the modified Python (your "./python").

@shihai1991
Copy link
Member Author

python3 is master, ./python with tihs PR

You tested ./python is both cases. Usually, to measure the performance of a change, you need a reference (your "python3") and the modified Python (your "./python").

ok, Look like I confused test python3's performance with use pyperf of python3 to compare other python version's load performance.
you mean this one, right?:

python3 -m pyperf command ./python_ref  xxx -o ref.json
python3 -m pyperf command ./python_pr xxx -o pr.json
python3 -m pyperf compare_to ref.json pr.json

@vstinner
Copy link
Member

python3 -m pyperf command ./python_ref  xxx -o ref.json
python3 -m pyperf command ./python_pr xxx -o pr.json
python3 -m pyperf compare_to ref.json pr.json

Yeah, that would work :-)

Anyway, I expect no significant difference. So the most important data point is the number of items in the dictionary at exit.

@shihai1991
Copy link
Member Author

python3 -m pyperf command ./python_ref  xxx -o ref.json
python3 -m pyperf command ./python_pr xxx -o pr.json
python3 -m pyperf compare_to ref.json pr.json

Yeah, that would work :-)

Anyway, I expect no significant difference. So the most important data point is the number of items in the dictionary at exit.

Got this result ;) :
$ python3.9 -m pyperf compare_to ref.json pr.json
Benchmark hidden because not significant (1): command

@shihai1991 shihai1991 requested review from 1st1, rhettinger, skrah and a team as code owners June 25, 2020 08:48
@@ -1270,6 +1269,7 @@ finalize_interp_types(PyThreadState *tstate, int is_main_interp)
_PyBytes_Fini();
}
_PyUnicode_Fini(tstate);
_PyList_Fini(tstate);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the function now uses PyDict_Next(), can you please revert this change?

Objects/unicodeobject.c Outdated Show resolved Hide resolved
@shihai1991
Copy link
Member Author

What is the dictionary size of the interned strings at Python exit (in unicode_release_interned())?

(gdb) p PyDict_Size(interned)
$1 = 1471

@vstinner
Copy link
Member

vstinner commented Jul 1, 2020

I wrote PR #21269 which clears interned strings in the correct order.

@vstinner
Copy link
Member

vstinner commented Jul 1, 2020

I merged my PR #21269.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants