GH-91432: Implement the FOR_ITER_SET specialization #94104

Closed
wants to merge 5 commits into from

@corona10 corona10 commented Jun 22, 2022

Microbenchmark

result: Mean +- std dev: [base] 662 ns +- 5 ns -> [FOR_ITER_SET] 616 ns +- 2 ns: 1.07x faster

import pyperf

runner = pyperf.Runner()
runner.timeit(name="set iter",
              stmt="""
for e in s:
    pass
""",
              setup = """
s = set(range(100))
"""
              )
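
As a quick sanity check on the figures reported above, the speedup factor can be recomputed from the two means. This is only a sketch: the variable names are illustrative, and the per-element saving assumes the whole timed statement is dominated by the loop over the 100-element set.

```python
# Means reported by pyperf for the microbenchmark above (nanoseconds).
base_ns = 662.0         # [base]
specialized_ns = 616.0  # [FOR_ITER_SET]

# Speedup factor as pyperf reports it: old mean divided by new mean.
speedup = base_ns / specialized_ns

# Rough saving per element, assuming the 100-element loop dominates the timing.
saved_ns_per_element = (base_ns - specialized_ns) / 100

print(f"{speedup:.2f}x faster")
print(f"about {saved_ns_per_element:.2f} ns saved per element")
```

The absolute saving is well under a nanosecond per iteration, which helps explain why the effect is hard to see in larger benchmarks.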

Leak test

➜  cpython git:(gh-91432) ✗ ./python.exe -m test test_set -R 3:3
Raised RLIMIT_NOFILE: 256 -> 1024
0:00:00 load avg: 7.62 Run tests sequentially
0:00:00 load avg: 7.62 [1/1] test_set
beginning 6 repetitions
123456
......

== Tests result: SUCCESS ==

1 test OK.

Total duration: 23.3 sec
Tests result: SUCCESS

I am just following up on @sweeneyde's work.
(Following the Faster CPython project is kind of my daily hobby ;) Sorry if you have already worked on this in a local branch.)
According to his stats, set iteration (3.8%) looks as worth optimizing as tuple iteration (2.8%).
Whether to merge this PR can be decided after #94096 is merged.

@corona10 corona10 changed the title GH-91432: Implement FOR_ITER_SET GH-91432: Implement the FOR_ITER_SET specialization Jun 22, 2022
@corona10 corona10 marked this pull request as ready for review June 22, 2022 08:56
@corona10 corona10 requested review from a team, markshannon and rhettinger as code owners June 22, 2022 08:56
@markshannon
Member

Is iteration over sets common enough to justify this?
The stats on pyperformance suggest it is only ~3% of FOR_ITER executions.

Our main motivation for specializing FOR_ITER is to allow us to iterate over generators without making C calls and without slowing down iterating over sequences.
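
For readers unfamiliar with the instruction under discussion, the following minimal sketch uses the standard `dis` module to show that a plain `for` loop compiles to a `FOR_ITER` instruction regardless of the type of the iterable. The function name `iterate` is made up for illustration; a specialization such as the proposed `FOR_ITER_SET` would replace this generic instruction at run time once the interpreter observes the iterator type.

```python
import dis


def iterate(s):
    # A plain loop; the compiler does not know the type of `s`.
    for e in s:
        pass


# Collect the opcode names of the compiled function.
opnames = [ins.opname for ins in dis.get_instructions(iterate)]

# The generic FOR_ITER instruction drives the loop.
print("FOR_ITER" in opnames)
```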

@corona10
Member Author

corona10 commented Jun 22, 2022

@markshannon
Thanks, yeah, the stats look different from https://gist.github.com/sweeneyde/e1a7e98890e75c9355f99067b03ee37b.
(What caused the change?)
Dict keys and ASCII strings are more worth specializing.
It seems all I can argue is that the numbers for sets are lower than theirs (dict keys, ASCII strings) but similar, lol.

I am okay with not applying this change if this operation is not worth specializing.
Can we run the performance benchmark? (Sorry, I don't have a physical machine to run it on.)
If it does not contribute to a performance improvement, I am fine with not specializing it.

@markshannon
Member

The change in stats is presumably from merging #91713.

I'll run pyperformance on this PR, but it's in the queue behind #94109, so it might take a while...

Comment on lines +825 to 832
static PyObject *setiter_iternext(_PySetIterObject *si) {
    PyObject *stack[1];
    int err = _PySetIter_GetNext(si, stack);
    if (err <= 0) {
        return NULL;
    }
    return stack[0];
}
Member Author

Suggested change
- static PyObject *setiter_iternext(_PySetIterObject *si) {
-     PyObject *stack[1];
-     int err = _PySetIter_GetNext(si, stack);
-     if (err <= 0) {
-         return NULL;
-     }
-     return stack[0];
- }
+ static PyObject *setiter_iternext(_PySetIterObject *si) {
+     PyObject *key;
+     if (_PySetIter_GetNext(si, &key) <= 0) {
+         return NULL;
+     }
+     return key;
+ }

This might be better.

@markshannon
Member

Performance shows a 1% speedup, although I think this just shows that 1% variations can be caused by code layout changes and other build effects.

Projecting from stats, FOR_ITER_SET would represent about 0.03% of executed instructions. (FOR_ITER_ADAPTIVE is 0.8% and of that only 4% are sets.)
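
The projection above is a simple product of the two quoted shares. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Shares quoted in the comment above, expressed as fractions.
for_iter_adaptive_share = 0.008  # FOR_ITER_ADAPTIVE: 0.8% of executed instructions
set_fraction = 0.04              # 4% of those iterate over sets

# Projected share of all executed instructions that FOR_ITER_SET would cover.
for_iter_set_share = for_iter_adaptive_share * set_fraction

print(f"{for_iter_set_share:.3%}")
```

This gives roughly 0.03%, matching the estimate in the comment.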

I doubt that any further specialization of FOR_ITER is worthwhile.

@corona10
Member Author

@markshannon Agreed. Let's close the PR!

@corona10 corona10 closed this Jun 23, 2022