Speedup Pauli network synthesis code #4
Merged
This PR includes several low-hanging performance improvements for Pauli network synthesis.

Reasoning about PauliDag's front layer

Instead of recalculating the front layer of the DAG every time, we use a vector `in_degree` to store the number of unprocessed predecessors of every node in the DAG. In addition, the nodes with no unprocessed predecessors are kept in a vector `front_nodes`. Note that we no longer remove nodes from the DAG; instead, we mark them as processed and decrease the `in_degree` of each of their successors by 1.

Unfortunately, the output of the new method is not identical to the one before, probably because the order of Pauli operators in the Pauli set passed to `single_synthesis_step` may be different (even after sorting by support size). Empirically, this makes the results a bit better in some cases and a bit worse in others.

Speeding up `PauliSet.commute`

For some test cases, a considerable amount of time is spent in `PauliSet.commute`, because it calls `get_as_vec_bool` and allocates/deallocates vectors. It is significantly faster to just use the `get_entry` method. In addition, we do not need to return or reason about Pauli phases.

Benchmarking

Here are two illustrative benchmarks from `benchpress`:

- 19.4 seconds, new method: 0.3 seconds
- 129.1 seconds, new method: 13.7 seconds

(with run arguments: `metric = &Metric::COUNT`, `preserve_order = true`, `nshuffles = 0`, `skip_sort = false`, `fix_clifford = false`)

Other comments

The `PauliDag`'s function `single_step_synthesis` now gets the synthesized circuit as an argument and adds new gates to it directly. I think this makes the code a bit cleaner.

One additional small fix is for the function `check_circuit` to correctly account for all-identity Paulis. Note that this checking function is still not quite correct, since it only checks that each Pauli is eventually transformed to a single-qubit rotation, but ignores the commutativity between Paulis.

In the future we could extend the synthesis functions to return a full circuit, consisting of both Clifford gates and Pauli operations. This would make Rustiq integration in Qiskit significantly simpler. I actually have the code for this already.
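
For readers who want a concrete picture of the front-layer bookkeeping described in the first section, here is a minimal sketch. The `FrontLayer` type, its `successors` field, and the `mark_processed` method are hypothetical names chosen for illustration and are not taken from the Rustiq code; only `in_degree` and `front_nodes` come from the description above.

```rust
/// Hypothetical stand-in for the DAG bookkeeping; not Rustiq's actual `PauliDag`.
struct FrontLayer {
    /// successors[n] lists the nodes that depend on node n.
    successors: Vec<Vec<usize>>,
    /// in_degree[n] is the number of unprocessed predecessors of node n.
    in_degree: Vec<usize>,
    /// Nodes that currently have no unprocessed predecessors.
    front_nodes: Vec<usize>,
}

impl FrontLayer {
    /// Builds the bookkeeping from a list of (predecessor, successor) edges.
    fn new(n_nodes: usize, edges: &[(usize, usize)]) -> Self {
        let mut successors: Vec<Vec<usize>> = vec![Vec::new(); n_nodes];
        let mut in_degree = vec![0usize; n_nodes];
        for &(from, to) in edges {
            successors[from].push(to);
            in_degree[to] += 1;
        }
        let front_nodes: Vec<usize> =
            (0..n_nodes).filter(|&n| in_degree[n] == 0).collect();
        Self { successors, in_degree, front_nodes }
    }

    /// Marks `node` as processed without removing it from the graph:
    /// it leaves the front layer, and each successor loses one
    /// unprocessed predecessor, joining the front layer at zero.
    fn mark_processed(&mut self, node: usize) {
        self.front_nodes.retain(|&n| n != node);
        for i in 0..self.successors[node].len() {
            let succ = self.successors[node][i];
            self.in_degree[succ] -= 1;
            if self.in_degree[succ] == 0 {
                self.front_nodes.push(succ);
            }
        }
    }
}
```

With this layout the front layer is updated in time proportional to the number of successors of the processed node, rather than being recomputed from the whole graph. The order in which nodes enter `front_nodes` depends on processing order, which is one way the resulting Pauli order could differ from a recomputed front layer.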
This PR is not based on #2 or #3 (so some of the changed code could introduce new clippy warnings).
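
To illustrate the `PauliSet.commute` change, the sketch below checks commutation by reading individual table entries instead of materializing each Pauli as a vector. `MiniPauliSet`, its `Vec<Vec<bool>>` storage, and `from_strings` are assumptions made for this example; Rustiq's real `PauliSet` has a different internal layout, and the `get_entry` signature here is only a guess at the general shape.

```rust
/// Hypothetical, simplified stand-in for a Pauli set; not Rustiq's `PauliSet`.
/// Row `q` holds the X bits of qubit `q`, row `n + q` holds its Z bits,
/// and column `c` is the c-th Pauli operator. Phases are not stored.
struct MiniPauliSet {
    n: usize,
    table: Vec<Vec<bool>>,
}

impl MiniPauliSet {
    /// Builds the table from strings over {I, X, Y, Z}.
    /// Assumes all strings have the same length.
    fn from_strings(paulis: &[&str]) -> Self {
        let n = paulis.first().map_or(0, |p| p.len());
        let mut table = vec![vec![false; paulis.len()]; 2 * n];
        for (col, pauli) in paulis.iter().enumerate() {
            for (q, ch) in pauli.chars().enumerate() {
                table[q][col] = ch == 'X' || ch == 'Y';
                table[n + q][col] = ch == 'Z' || ch == 'Y';
            }
        }
        Self { n, table }
    }

    /// Reads a single bit of the table.
    fn get_entry(&self, row: usize, col: usize) -> bool {
        self.table[row][col]
    }

    /// Returns true if operators `a` and `b` commute.
    /// Two Paulis commute iff they anticommute on an even number of qubits,
    /// so we track only the parity and allocate nothing.
    fn commute(&self, a: usize, b: usize) -> bool {
        let mut parity = false;
        for q in 0..self.n {
            let xa = self.get_entry(q, a);
            let za = self.get_entry(self.n + q, a);
            let xb = self.get_entry(q, b);
            let zb = self.get_entry(self.n + q, b);
            parity ^= (xa & zb) ^ (za & xb);
        }
        !parity
    }
}
```

Because the result depends only on the X and Z bits, phases never enter the computation, which matches the observation above that they do not need to be returned or reasoned about.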