Speedup Pauli network synthesis code #4

alexanderivrii · 2024-10-19T20:16:04Z

This PR includes several low-hanging performance improvements for Pauli network synthesis.

Reasoning about PauliDag's front layer

Instead of recalculating the front layer of the DAG every time, we use a vector `in_degree` to store the number of unprocessed predecessors for every node in the DAG. In addition, the nodes with no unprocessed predecessors are kept in a vector `front_nodes`. Note that we no longer remove nodes from the DAG; instead, we mark them as processed and decrease the `in_degree` of each of their successors by 1.
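The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `FrontTracker` type, its `successors` field, and the `process` method are hypothetical names chosen for the example; only `in_degree` and `front_nodes` come from the description.

```rust
/// Hypothetical stand-in for a DAG with front-layer bookkeeping.
/// Only the names `in_degree` and `front_nodes` come from the PR text.
struct FrontTracker {
    /// successors[u] lists the nodes that depend on u.
    successors: Vec<Vec<usize>>,
    /// Number of unprocessed predecessors of each node.
    in_degree: Vec<usize>,
    /// Nodes whose predecessors have all been processed.
    front_nodes: Vec<usize>,
}

impl FrontTracker {
    /// Builds the tracker for `n` nodes from (predecessor, successor) edges.
    fn new(n: usize, edges: &[(usize, usize)]) -> Self {
        let mut successors = vec![Vec::new(); n];
        let mut in_degree = vec![0usize; n];
        for &(u, v) in edges {
            successors[u].push(v);
            in_degree[v] += 1;
        }
        let front_nodes = (0..n).filter(|&v| in_degree[v] == 0).collect();
        Self { successors, in_degree, front_nodes }
    }

    /// Marks `node` as processed without removing it from the graph.
    /// Returns false if `node` is not currently in the front layer.
    fn process(&mut self, node: usize) -> bool {
        let Some(pos) = self.front_nodes.iter().position(|&v| v == node) else {
            return false;
        };
        self.front_nodes.swap_remove(pos);
        for &succ in &self.successors[node] {
            self.in_degree[succ] -= 1;
            if self.in_degree[succ] == 0 {
                // All predecessors are processed: the node joins the front.
                self.front_nodes.push(succ);
            }
        }
        true
    }
}
```

With this layout, processing a node costs time proportional to its number of successors (plus a scan of the front layer to find it), rather than a pass over the whole graph. Because nodes enter `front_nodes` in the order their last predecessor is processed, the front layer may be enumerated in a different order than a recomputed one would be.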

Unfortunately, the output of the new method is not identical to the one before, probably because the order of Pauli operators in the Pauli set passed to single_synthesis_step may be different (even after sorting by support size). Empirically, this makes the results a bit better in some cases and a bit worse in some others.

Speeding up `PauliSet.commute`

For some test cases a considerable amount of time is spent in `PauliSet.commute`, because it calls `get_as_vec_bool` and therefore allocates and deallocates vectors. It is significantly faster to just use the `get_entry` method. In addition, we do not need to return or reason about Pauli phases.
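As a rough illustration of an allocation-free commutation check, here is a minimal sketch. The `TinyPauliSet` type and its storage layout (rows `0..n` for X bits, rows `n..2n` for Z bits, one column per Pauli) are assumptions made for this example and are not taken from the Rustiq source; the `get_entry(row, col)` signature shown here is likewise only assumed. The check itself is the standard symplectic product: two Paulis commute exactly when they anticommute on an even number of qubits, and phases play no role.

```rust
/// Hypothetical, unpacked stand-in for a Pauli set (not the Rustiq type).
struct TinyPauliSet {
    n: usize,
    /// Assumed layout: rows 0..n hold X bits, rows n..2n hold Z bits,
    /// and each column is one Pauli operator.
    data: Vec<Vec<bool>>,
}

impl TinyPauliSet {
    /// Builds the set from strings over {I, X, Y, Z} of equal length.
    fn from_strings(paulis: &[&str]) -> Self {
        let n = paulis.first().map_or(0, |p| p.len());
        let mut data = vec![vec![false; paulis.len()]; 2 * n];
        for (col, p) in paulis.iter().enumerate() {
            assert_eq!(p.len(), n, "all Paulis must act on the same qubits");
            for (q, c) in p.chars().enumerate() {
                let (x, z) = match c {
                    'I' => (false, false),
                    'X' => (true, false),
                    'Y' => (true, true),
                    'Z' => (false, true),
                    _ => panic!("unexpected Pauli character {c}"),
                };
                data[q][col] = x;
                data[q + n][col] = z;
            }
        }
        Self { n, data }
    }

    /// Reads a single bit; no vector is allocated.
    fn get_entry(&self, row: usize, col: usize) -> bool {
        self.data[row][col]
    }

    /// Returns true if Paulis `a` and `b` commute. Phases are ignored.
    fn commute(&self, a: usize, b: usize) -> bool {
        let mut anticommuting = false;
        for q in 0..self.n {
            let (xa, za) = (self.get_entry(q, a), self.get_entry(q + self.n, a));
            let (xb, zb) = (self.get_entry(q, b), self.get_entry(q + self.n, b));
            // Parity of the symplectic product, accumulated qubit by qubit.
            anticommuting ^= (xa & zb) ^ (za & xb);
        }
        !anticommuting
    }
}
```

For example, with the set `["XX", "ZZ", "ZI"]`, `XX` and `ZZ` anticommute on both qubits and therefore commute overall, while `XX` and `ZI` anticommute on exactly one qubit and therefore do not.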

Benchmarking

Here are two illustrative benchmarks from benchpress:

test 35: old method: 19.4 seconds, new method: 0.3 seconds
test 66: old method: 129.1 seconds, new method: 13.7 seconds
(with run arguments: metric = &Metric::COUNT, preserve_order = true, nshuffles = 0, skip_sort = false, fix_clifford = false)

Other comments

The PauliDag's function single_step_synthesis now gets the synthesized circuit as an argument, and adds new gates to it directly. I think this makes the code a bit cleaner.

One additional small fix is for the function check_circuit to correctly account for all-identity Paulis. Note that this checking function is still not quite correct, since it only checks that each Pauli is eventually transformed to a single-qubit rotation, but ignores the commutativity between Paulis.

In the future we could extend the synthesis functions to return a full circuit, consisting of both Clifford gates and Pauli operations. This would make Rustiq integration in Qiskit significantly simpler. I actually have the code for this already.

This PR is not based on #2 or #3 (so some of the changed code could introduce new clippy warnings).

alexanderivrii · 2024-10-21T13:44:43Z

Here are some preliminary benchmarking results on my laptop for the 100 benchpress Hamiltonians:
rustiq_changes.xlsx

alexanderivrii added 3 commits October 19, 2024 22:47

speedup pauli dag code

e2b9885

more perfromance imrpovements

66da740

git ignore

e29b256

alexanderivrii changed the title ~~speedup pauli dag code~~ Speedup Pauli network synthesis code Oct 20, 2024

fix to remove rotations synthesized before any synthesis steps

7924f69

alexanderivrii mentioned this pull request Oct 31, 2024

Further speedup pauli network synthesis algorithms #5

Merged

alexanderivrii added 4 commits November 6, 2024 13:27

Merge branch 'main' into speedup_pauli_dag

ac189af

fix clippy complaints

f3a734e

Merge branch 'main' into speedup_pauli_dag

b2985a1

fix clippy warnings

2ad60fa

smartiel merged commit 5fc61ce into smartiel:main Nov 6, 2024
3 checks passed
