Increase heuristic effort for optimization level 2 #12149
This commit tweaks the heuristic effort in optimization level 2 to make it more of a middle ground between levels 1 and 3, with a better balance between output quality and runtime. This positions it as a better default for the pass manager we use when one isn't specified. The tradeoff is that the vf2layout and vf2postlayout search space is reduced to match level 1. There are diminishing returns on the vf2 layout search, especially when there are a large number of qubit permutations for the mapping found. The number of sabre trials is raised to match optimization level 3, since this can have a significant impact on output and the extra runtime cost is minimal. The larger change is that level 2 now runs the optimization passes from level 3, which ends up mainly being 2q peephole optimization. With the performance improvements from Qiskit#12010 and Qiskit#11946 and all the follow-on PRs, this is now fast enough to rely on in optimization level 2.
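The knob changes described above can be pictured with a small sketch. It is not Qiskit code: `EffortProfile`, its field names, and every number are hypothetical placeholders chosen only to show which settings level 2 borrows from level 1 and which from level 3.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EffortProfile:
    """Hypothetical bundle of heuristic-effort knobs for one optimization level."""

    vf2_call_limit: int       # cap on how long the VF2 search may run
    vf2_max_trials: int       # cap on how many layouts get scored
    sabre_trials: int         # number of randomized Sabre trials
    two_qubit_peephole: bool  # whether 2q peephole optimization runs


# Placeholder numbers (assumptions, not Qiskit's real values).
LEVEL_1 = EffortProfile(vf2_call_limit=1_000, vf2_max_trials=100,
                        sabre_trials=5, two_qubit_peephole=False)
LEVEL_3 = EffortProfile(vf2_call_limit=100_000, vf2_max_trials=10_000,
                        sabre_trials=20, two_qubit_peephole=True)


def middle_ground(low: EffortProfile, high: EffortProfile) -> EffortProfile:
    """Keep the cheap VF2 search from `low`; take Sabre trials and peephole from `high`."""
    return replace(low,
                   sabre_trials=high.sabre_trials,
                   two_qubit_peephole=high.two_qubit_peephole)


LEVEL_2 = middle_ground(LEVEL_1, LEVEL_3)
```

Under these assumptions, `LEVEL_2` shares its VF2 limits with `LEVEL_1` and its Sabre trial count and peephole setting with `LEVEL_3`.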
For the initial VF2Layout call, this commit restores the vf2 call limit to its previous level instead of reducing it to match level 1. The idea is that spending up to 10s to find a perfect layout is a worthwhile tradeoff, as a perfect layout will greatly improve the result from execution. Scoring multiple layouts to find the lowest error rate subgraph, however, has diminishing returns in most cases: there typically aren't thousands of unique subgraphs, and often when we hit the scoring limit we are just permuting the qubits inside a subgraph, which doesn't provide much value. For VF2PostLayout the lower call limits from level 1 are still used. This is because the search for isomorphic subgraphs is typically much shorter with the vf2++ node ordering heuristic, so we don't need to spend as much time looking for alternative subgraphs.
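A toy illustration of why capping the number of scored layouts costs little when most candidates are permutations of the same subgraph. This is not VF2 and not Qiskit's scoring: the line-shaped device, the error rates, the gate counts, and all function names are assumptions made up for the sketch.

```python
from itertools import islice


def candidate_layouts(num_physical, width):
    """Yield every contiguous run of `width` qubits on a line, in both orientations."""
    for start in range(num_physical - width + 1):
        forward = tuple(range(start, start + width))
        yield forward
        yield forward[::-1]  # same subgraph, qubits permuted


def score(layout, gate_counts, qubit_error):
    """Lower is better: each virtual qubit's gate count weighted by its physical error."""
    return sum(gate_counts[v] * qubit_error[p] for v, p in enumerate(layout))


def best_layout(candidates, gate_counts, qubit_error, max_trials=None):
    """Score at most `max_trials` candidates (all of them if None) and keep the best."""
    best, best_score, scored = None, float("inf"), 0
    for layout in islice(candidates, max_trials):
        scored += 1
        current = score(layout, gate_counts, qubit_error)
        if current < best_score:
            best, best_score = layout, current
    return best, scored


qubit_error = [0.05, 0.01, 0.02, 0.01, 0.08]  # made-up per-qubit error rates
gate_counts = [3, 5, 1]                       # made-up gates per virtual qubit

capped, capped_scored = best_layout(
    candidate_layouts(5, 3), gate_counts, qubit_error, max_trials=3)
full, full_scored = best_layout(
    candidate_layouts(5, 3), gate_counts, qubit_error)
```

On this made-up device the capped search scores half as many candidates as the full search and still returns the same layout, because the remaining candidates are either worse subgraphs or permutations that tie.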
Due to potential instability in the 2q peephole optimization, we were using the `MinimumPoint` pass to provide backtracking when we reach a local minimum. However, this pass adds a significant amount of overhead because it deep copies the circuit at every iteration of the optimization loop that improves the output quality. This commit tweaks the O2 pass manager construction to run 2q peephole only once, and then restores the optimization loop to what the previous O2 optimization loop was.
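The difference between the two loop shapes can be sketched with a toy "circuit" represented as a list of gate names. This is a simplified stand-in, not the `MinimumPoint` pass or the real pass manager: `cancel_once`, both loop functions, and the size-based stopping rule are assumptions for illustration.

```python
import copy


def cancel_once(gates):
    """One sweep that drops adjacent identical gates (treated as self-inverse)."""
    out, i = [], 0
    while i < len(gates):
        if i + 1 < len(gates) and gates[i] == gates[i + 1]:
            i += 2  # the pair cancels
        else:
            out.append(gates[i])
            i += 1
    return out


def loop_with_backtracking(gates, step, max_iters=10):
    """Snapshot the circuit with a deep copy every time an iteration improves it."""
    best, copies, current = gates, 0, gates
    for _ in range(max_iters):
        current = step(current)
        if len(current) < len(best):
            best = copy.deepcopy(current)  # the overhead the commit message describes
            copies += 1
        else:
            break
    return best, copies


def peephole_once_then_loop(gates, peephole, cleanup, max_iters=10):
    """Run the expensive pass once, then iterate a cheaper pass until size stops shrinking."""
    current = peephole(gates)
    for _ in range(max_iters):
        candidate = cleanup(current)
        if len(candidate) >= len(current):
            break
        current = candidate
    return current


circuit = ["h", "x", "x", "h", "cx"]
backtracked, copies = loop_with_backtracking(circuit, cancel_once)
single_pass = peephole_once_then_loop(circuit, cancel_once, cancel_once)
```

In this toy both loops reach the same result, but the backtracking variant pays for a deep copy on each improving iteration, which for a large circuit is the cost the commit avoids.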
I ran the "utility scale" asv benchmarks with this PR and got the following results:
It's about what I expected: improvements in quality, since we're ramping up heuristic effort in a bunch of places, but at the cost of runtime. The only other change I really want to make, building off of #12171, is to change the default sabre heuristic we use in level 2 to
I think there'll be more tweaks to O2 as we speed up and streamline other passes, add sabre heuristics, etc, but this is a solid starting point
Details and comments
Related to: #7112