Commit
Add Rust-based OpenQASM 2 converter (Qiskit#9784)
* Add Rust-based OpenQASM 2 converter

This is a vendored version of qiskit-qasm2 (https://pypi.org/project/qiskit-qasm2), with this initial commit being equivalent (barring some naming / documentation / testing conversions to match Qiskit's style) to version 0.5.3 of that package.

This adds a new translation layer from OpenQASM 2 to Qiskit, which is around an order of magnitude faster than the existing version in Python, while being more type safe (in terms of disallowing invalid OpenQASM 2 programs rather than attempting to construct `QuantumCircuit`s that are not correct) and more extensible.

The core logic is a hand-written lexer and parser combination written in Rust, which emits a bytecode stream across the PyO3 boundary to a small Python interpreter loop. The main bulk of the parsing logic is a simple LL(1) recursive-descent algorithm, which delegates to a more specific recursive Pratt-based algorithm for handling classical expressions.

Many of the design decisions made (including why the lexer is written by hand) are because the project originally started life as a way for me to learn about implementations of the different parts of a parser stack; this is the principal reason there are very few external crates used. There are a few inefficiencies in this implementation, for example:

- the string interner in the lexer allocates twice for each stored string (but zero times for a lookup). It may be possible to completely eliminate allocations when parsing a string (or a file if it's read into memory as a whole), but realistically there's only a fairly small number of different tokens seen in most OpenQASM 2 programs, so it shouldn't be too big a deal.

- the hand-off from Rust to Python transfers small objects frequently. It might be more efficient to have a secondary buffered iterator in Python space, transferring more bytecode instructions at a time and letting Python resolve them. This form could also be made asynchronous, since for the most part, the Rust components only need to acquire the CPython GIL at the API boundary.

- there are too many points within the lexer that can return a failure result that needs unwrapping at every site. Since there are no tokens that can span multiple lines, it should be possible to refactor so that almost all of the byte-getter and -peeker routines cannot return error statuses, at the cost of the main lexer loop becoming responsible for advancing the line buffer, and moving the non-ASCII error handling into each token constructor.

I'll probably keep playing with some of those in the `qiskit-qasm2` package itself when I have free time, but at some point I needed to draw the line and vendor the package. It's still ~10x faster than the existing one:

    In [1]: import qiskit.qasm2
       ...: prog = """
       ...: OPENQASM 2.0;
       ...: include "qelib1.inc";
       ...: qreg q[2];
       ...: """
       ...: prog += "rz(pi * 2) q[0];\ncx q[0], q[1];\n"*100_000
       ...: %timeit qiskit.qasm2.loads(prog)
       ...: %timeit qiskit.QuantumCircuit.from_qasm_str(prog)
    2.26 s ± 39.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    22.5 s ± 106 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

`cx`-heavy programs like this one are actually the ones that the new parser is (comparatively) slowest on, because the construction time of `CXGate` is higher than most gates, and this dominates the execution time for the Rust-based parser.

* Work around docs failure on Sphinx 5.3, Python 3.9

The version of Sphinx that we're constrained to use in the docs build can't handle the `Unpack` operator, so as a temporary measure we can just relax the type hint a little.

* Remove unused import

* Tweak documentation

* More specific PyO3 usage

* Use PathBuf directly for paths

* Format

* Freeze dataclass

* Use type-safe id types

This should have no impact on runtime or on memory usage, since each of the new types has the same bit width and alignment as the `usize` values they replace.

* Documentation tweaks

* Fix comments in lexer

* Fix lexing version number with separating comments

* Add test of pathological formatting

* Fixup release note

* Fix handling of u0 gate

* Credit reviewers

Co-authored-by: Luciano Bello <[email protected]>
Co-authored-by: Kevin Hartman <[email protected]>
Co-authored-by: Eric Arellano <[email protected]>

* Add test of invalid gate-body statements

* Refactor custom built-in gate definitions

The previous system was quite confusing, and required all accesses to the global symbol table to know that the `Gate` symbol could be present but overridable. This led to confusing logic, various bugs and unnecessary constraints, such as it previously being (erroneously) possible to provide re-definitions for any "built-in" gate.

Instead, we keep a separate store of instructions that may be redefined. This allows the logic to be centralised in the one place responsible for performing those overrides, while the store remains accessible for error-message builders to query in order to provide better diagnostics.

* Credit Sasha

Co-authored-by: Alexander Ivrii <[email protected]>

* Credit Matthew

Co-authored-by: Matthew Treinish <[email protected]>

* Remove dependency on `lazy_static`

For a hashset of only 6 elements that is only checked once, there's not really any point in pulling in an extra dependency or using a hash set at all.

* Update PyO3 version

---------

Co-authored-by: Luciano Bello <[email protected]>
Co-authored-by: Kevin Hartman <[email protected]>
Co-authored-by: Eric Arellano <[email protected]>
Co-authored-by: Alexander Ivrii <[email protected]>
Co-authored-by: Matthew Treinish <[email protected]>
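
The message above says classical expressions (such as the `pi * 2` in the benchmark program) are handled by a recursive Pratt-based algorithm. The following is only a minimal Python sketch of that general technique, not the vendored Rust code: the binding-power table, the `tokenize` and `evaluate` helper names, and the choice to evaluate directly to a float are all assumptions made for illustration, and the precedences shown are not claimed to match the real parser.

```python
import math
import operator
import re

# Hypothetical binding powers: (left power, right power, action).
# A higher number binds tighter.  '^' has a right power lower than its
# left power, which is what makes it right-associative.
BINARY = {
    "+": (1, 2, operator.add),
    "-": (1, 2, operator.sub),
    "*": (3, 4, operator.mul),
    "/": (3, 4, operator.truediv),
    "^": (8, 7, operator.pow),
}
UNARY_MINUS_POWER = 5  # assumed: tighter than '*', looser than '^'
TOKEN_RE = re.compile(r"\s*(?:(\d+(?:\.\d*)?)|(pi)|(\S))")


def tokenize(text):
    """Split text into number, 'pi' and single-character tokens."""
    return [num or name or sym for num, name, sym in TOKEN_RE.findall(text)]


def evaluate(text):
    """Evaluate a small arithmetic expression with a Pratt parser."""
    tokens = tokenize(text)
    pos = 0

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse(min_power):
        # Prefix position: an atom, a parenthesised group or unary minus.
        token = advance()
        if token == "(":
            left = parse(0)
            if advance() != ")":
                raise ValueError("expected ')'")
        elif token == "-":
            left = -parse(UNARY_MINUS_POWER)
        elif token == "pi":
            left = math.pi
        else:
            left = float(token)  # raises ValueError on a bad token
        # Infix position: keep consuming operators that bind tightly enough.
        while pos < len(tokens) and tokens[pos] in BINARY:
            left_power, right_power, action = BINARY[tokens[pos]]
            if left_power < min_power:
                break
            advance()
            left = action(left, parse(right_power))
        return left

    result = parse(0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Calling `evaluate("pi * 2")` walks the same kind of expression as the `rz(pi * 2)` argument in the timing example. The single `parse(min_power)` function replaces one grammar rule per precedence level, which is the usual reason for pairing a Pratt parser with an otherwise plain recursive-descent parser.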
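
The commit message describes the Rust parser emitting a bytecode stream that a small Python interpreter loop consumes. The shape of such a loop can be sketched as below, under heavy assumptions: the `OpCode` members, the operand layouts, the `Circuit` stand-in and the `fake_parser` generator are all hypothetical names invented here so the example runs without Qiskit, and none of them is taken from the actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class OpCode(Enum):
    """Hypothetical instruction set, for illustration only."""

    DECLARE_QREG = auto()
    GATE = auto()


@dataclass
class Circuit:
    """Dependency-free stand-in for a circuit object."""

    num_qubits: int = 0
    registers: dict = field(default_factory=dict)
    operations: list = field(default_factory=list)


def fake_parser():
    """Stand-in for the parser side: lazily yields (opcode, operands)."""
    yield OpCode.DECLARE_QREG, ("q", 2)
    yield OpCode.GATE, ("rz", (6.283185307179586,), (0,))
    yield OpCode.GATE, ("cx", (), (0, 1))


def interpret(bytecode):
    """Consume one instruction at a time and build up the circuit."""
    circuit = Circuit()
    for opcode, operands in bytecode:
        if opcode is OpCode.DECLARE_QREG:
            name, size = operands
            # Each register gets the next block of global qubit indices.
            start = circuit.num_qubits
            circuit.registers[name] = range(start, start + size)
            circuit.num_qubits += size
        elif opcode is OpCode.GATE:
            name, params, qubits = operands
            if any(q >= circuit.num_qubits for q in qubits):
                raise ValueError(f"gate {name!r} uses an undeclared qubit")
            circuit.operations.append((name, params, qubits))
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return circuit


circuit = interpret(fake_parser())
```

Because the loop pulls one small instruction per iteration, it also shows where the hand-off cost mentioned in the list of inefficiencies would come from: a buffered variant would yield batches of instructions instead of single tuples.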