-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fully port Split2QUnitaries to rust #13025
Conversation
One or more of the following people are relevant to this code:
|
Pull Request Test Coverage Report for Build 10772500080Details
💛 - Coveralls |
46b7b8f
to
0689b73
Compare
This commit builds off of Qiskit#13013 and the other data model in Rust infrastructure and migrates the InverseCancellation pass to operate fully in Rust. The full path of the transpiler pass now never leaves Rust until it has finished modifying the DAGCircuit. There is still some python interaction necessary to handle parts of the data model that are still in Python, mainly for creating `UnitaryGate` instances and `ParameterExpression` for global phase. But otherwise the entirety of the pass operates in rust now. This is just a first pass at the migration here, it moves the pass to use loops in rust. The next steps here are to look at operating the pass in parallel. There is no data dependency between the optimizations being done for different gates so we should be able to increase the throughput of the pass by leveraging multithreading to handle each gate in parallel. This commit does not attempt this though, because of the Python dependency and also the data structures around gates and the dag aren't really setup for multithreading yet and there likely will need to be some work to support that. Part of Qiskit#12208
0689b73
to
bee99b1
Compare
Some of the logic inside the Split2QUnitaries pass was updated in a recently merged PR. This commit makes those changes so the rust implementation matches the current state of the previous python version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of minor comments, apart from these, this LGTM. Do you have some benchmark data? :-)
let nodes: Vec<NodeIndex> = dag.topological_op_nodes()?.collect(); | ||
for node in nodes { | ||
if let NodeType::Operation(inst) = &dag.dag[node] { | ||
let qubits = dag.get_qargs(inst.qubits).to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a to_vec
here or would it be faster to omit it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The to_vec()
here is needed to get an owned object here. dag.get_qargs()
returns a slice that refers back to qubits owned by dag
's qubits list. If I didn't have this the compiler would complain that have I have an immutable reference to dag
via qubits
when I try to mutate the dag later in replace_on_incoming_qubits
. I only added this because the compiler errored when I originally kept it as a slice. This is the same reason I had to collect the node indices into a Vec
instead of directly iterating over the nodes.
crates/circuit/src/dag_circuit.rs
Outdated
&mut self, | ||
py: Python, // Unused if cache_pygates isn't enabled | ||
node: NodeIndex, | ||
mut insert: F, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does F
need to be mutable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does because I used FnMut
for the the type here. It doesn't have to be mutable if we accepted an Fn
for F
instead. I just wasn't sure which kind of callables we'd allow longer term and opted to be a bit more conservative in the typing, and having FnMut
means the callable is allowed to mutate it's environment. I can change it to Fn
in this case because the only callback we have so far doesn't need to be FnMut
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, FnMut
is probably better as it allows the user more flexibility, it especially allows users to easily track the changes performed by replace_on_incoming_qubits
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in: 2ff94f5
We don't need the callback to be mutable currently so relax the trait to just be `Fn` instead of `FnMut`. If we have a need for a mutable environment callback in the future we can change this easily enough without any issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
For benchmarks I didn't do anything too extensive. But I ran this script: import statistics
import time
from qiskit.circuit.random import random_circuit
from qiskit.transpiler.passes import ConsolidateBlocks, Split2QUnitaries
split_pass = Split2QUnitaries()
times = []
for _ in range(100):
qc = random_circuit(100, 1000)
blocked = ConsolidateBlocks()(qc)
start = time.perf_counter()
split_pass(blocked)
stop = time.perf_counter()
runtime = stop - start
print(runtime)
times.append(runtime)
mean_runtime = statistics.geometric_mean(times)
print(f"Mean runtime over 100x 100x1000 random circuits: {mean_runtime} sec.") with both
With this PR it yielded:
It's not as much of as speedup as I was hoping for, but still pretty good. Also, this isn't a perfect comparison because it does include the circuit -> dag overhead also I have no idea how many substitutions are actually getting made by this. |
Probably close to zero, which is the expected case for a reasonable circuit. :-) I think a lot of users would be willing to pay a lot of classical compute if they can save one two-qubit gate without introducing further errors. |
Lol, it's actually zero because I forgot to unroll 3q or more and also the collect blocks pass, so there are no collected unitaries in the circuit. I'll fix it and rerun the script. |
Summary
This commit builds off of #13013 and the other data model in Rust
infrastructure and migrates the InverseCancellation pass to
operate fully in Rust. The full path of the transpiler pass now never
leaves Rust until it has finished modifying the DAGCircuit. There is
still some python interaction necessary to handle parts of the data
model that are still in Python, mainly for creating
UnitaryGate
instances and
ParameterExpression
for global phase. But otherwisethe entirety of the pass operates in rust now.
This is just a first pass at the migration here, it moves the pass to
use loops in rust. The next steps here are to look at operating
the pass in parallel. There is no data dependency between the
optimizations being done for different gates so we should be able to
increase the throughput of the pass by leveraging multithreading to
handle each gate in parallel. This commit does not attempt
this though, because of the Python dependency and also the data
structures around gates and the dag aren't really setup for
multithreading yet and there likely will need to be some work to
support that.
Details and comments
This PR is based on top of #13013 and as such github shows the entire contents of #13013 in addition to the contents of this PR. To see the contents of this PR you can look at HEAD on this branch, or just look at:Rebased0689b73
Part of #12208