LLVM Opaque Pointers feature not compatible with named pointers for Qubit and Result #30

Open
swernli opened this issue Dec 16, 2022 · 6 comments

Comments

@swernli
Contributor

swernli commented Dec 16, 2022

The LLVM project has had an ongoing workstream to define behavior for treating all pointers as opaque by default (see Opaque Pointers in the LLVM docs), and the feature is on by default in LLVM 15 with typed pointers being removed in the upcoming LLVM 16. This means the current mechanism used by QIR to identify qubit and result types as distinct from other values via named pointer types will no longer work. For example, the measurement call in the base profile that today looks like this:

call void @__quantum__qis__mz__body(%Qubit* null, %Result* writeonly null)

would become this:

call void @__quantum__qis__mz__body(ptr null, ptr writeonly null)

As noted in the opaque pointers documentation,

Frontends need to be adjusted to track pointee types independently of LLVM, insofar as they are necessary for lowering.

This means that type information for pointers is explicitly and intentionally no longer included in the LLVM IR and is instead meant to be supplied when the pointer is read from or written to via load and store instructions. This does not map well to the current design for %Qubit* and %Result*, which expects that these values are never loaded or stored and are merely identified as distinct by their named types.
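To make the information loss concrete, here is a minimal Python sketch that rewrites typed pointer spellings in a line of textual IR to `ptr`. The `to_opaque` helper and its regular expression are illustrative assumptions, not part of LLVM or any QIR tool, and they only handle simple named or integer pointee types.

```python
import re

# Matches a named type (e.g. %Qubit) or an integer type (e.g. i8) followed by
# one or more '*'. This is a simplification: function-pointer, array, and
# struct pointee types are not handled.
TYPED_POINTER = re.compile(r"(?:%[A-Za-z_][\w.]*|\bi\d+)\*+")


def to_opaque(ir_line):
    """Rewrite typed pointer spellings in one line of textual IR to `ptr`."""
    return TYPED_POINTER.sub("ptr", ir_line)


typed = "call void @__quantum__qis__mz__body(%Qubit* null, %Result* writeonly null)"
opaque = to_opaque(typed)
print(opaque)
```

After the rewrite, a `%Qubit*` operand and a `%Result*` operand have the same spelling, so the distinction has to be recovered from somewhere other than the type.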

As a possible design solution going forward, the %Qubit and %Result types could be more formally defined as distinct, sized structs, which would allow them to retain their type information in the IR. Correspondingly, the specifications and profiles would need to be updated with this definition, and the signatures of all relevant APIs updated to reflect it. A similar effort would need to be undertaken for other types currently defined as typed pointers, such as %String*, %BigInt*, %Tuple*, and %Array*.

@nikic

nikic commented Dec 16, 2022

The "target extension type" concept being introduced in https://reviews.llvm.org/D135202 may be of interest to you. It addresses similar needs that have been encountered with SPIR-V.

@peter-campora
Contributor

First thoughts on this.

I wouldn't like seeing: call void @__quantum__qis__mz__body(ptr null, ptr writeonly null)

To someone consuming the code, the only quick way to tell these two pointers apart is the writeonly convention. This makes it harder to detect whether some codegen bug put incorrect IDs into the wrong arguments of these function calls (for example, swapping the two ptr args in the call above would only be detectable as a bug because of the use of writeonly).

To the point about replacing %Qubit* with a sized struct, could we imagine something like: %Qubit = type { i64, ptr } where the underlying ptr could still be used to point to some other type defined by a back-end, like %BackendQubit, that a back-end provider knows how to work with when dealing with loads and stores? And similarly for %Result*? It would definitely cause some changes to code-gen, since passing these structs as arguments and identifying the unique ID is not as easy as it was with inttoptr instructions. But I think with the right tooling and conventions it may not be too much of an ergonomic overhead. One downside: there may be some performance overhead for people directly linking definitions for these functions, compared with passing a pointer at call sites.

@swernli
Contributor Author

swernli commented Jan 25, 2023

@nikic Thanks for pointing out the target extension type! If I'm understanding it correctly, a target extension type can optionally include type information, but if it doesn't, then it can't be introspected. Is that correct? If so, that does seem to match the intended behavior of Qubit and Result. These would then become target("Qubit") and target("Result"). After transformation to static Qubit and Result allocation, we'd need a way to identify a particular instance. Today that uses inttoptr for named pointer types. Is there a similar strategy we could employ for target extension types? Would it have to use a function call?

@nikic

nikic commented Jan 25, 2023

After transformation to static Qubit and Result allocation, we'd need a way to identify a particular instance. Today that uses inttoptr for named pointer types. Is there a similar strategy we could employ for target extension types? Would it have to use a function call?

So you're currently doing something like inttoptr i64 42 to %Qubit*? In that case yes, I think converting that to something like call target("Qubit") @llvm.qubit.from.int(i64 42) or so would be the right way to represent that.

@schweitzpgi

It's been 6 months since the last post on this topic. Any thoughts or recommendations?

@swernli
Contributor Author

swernli commented Dec 12, 2024

@bettinaheim Updating this issue with our latest thinking, please feel free to ping the working group as well for their visibility.

We took some time to discuss trade-offs and implementation details on this subject and came to some recommendations that I'll record here. In the working group that has been discussing the opaque pointer transition (which I've been part of), we were leaning toward using Target Extension Types, introduced in LLVM 16, and setting tools to LLVM 16 as a compatibility version, since that is the last version that can be configured to still generate typed pointers. However, after some more thinking, @idavis and I want to recommend a different approach: just embrace opaque pointers. Our reasoning follows below.

Using Opaque Pointers

Upgrading to a newer LLVM version that uses opaque pointers, without replacing the current typed pointers with a different typed alternative, means that LLVM IR from version 14 or before that reads:
[image: IR example using typed pointers]
would instead make use of the “ptr” type and read:
[image: the same IR example using ptr]

Pros

  • Backward compatible: newer LLVM versions can consume both bitcode (.bc) and textual IR (.ll) with typed pointers and convert each instance to “ptr” automatically, so tooling expecting opaque pointers in spots that used to have typed pointers will handle older QIR with typed pointers without any special handling required.
  • Static identifier patterns remain the same: places where a QIR program uses an inline “inttoptr” instruction to convert static IDs into typed pointers have the same pattern with opaque pointers, avoiding the need to correlate integer constants from elsewhere in the code to their usage as arguments.
  • Follows existing upgrade guidance: Any clients wishing to adjust to opaque pointers can use the guidance provided by the LLVM documentation on how to shift code (and expectations) to opaque pointers, so no special or additional guidance is required in the QIR spec (though it may still be helpful to include links to that guidance).
  • Seamless use of compiler tools: Because opaque pointers are the current standard, existing compiler tools (and runtime libraries) work off-the-shelf without need for extra transformation of input QIR or custom compiler updates.
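As a small illustration of the "static identifier patterns remain the same" point, the following Python sketch pulls static IDs out of inline `inttoptr` constant expressions. The `static_ids` helper, the regular expression, and the example ID values are assumptions made for illustration; this is text matching on a single line, not a real IR parser.

```python
import re

# Matches `inttoptr (i64 N to T)` where T is either a typed pointer such as
# %Qubit* or the opaque `ptr`, and captures the integer N.
INTTOPTR = re.compile(r"inttoptr\s*\(\s*i64\s+(\d+)\s+to\s+(?:%[\w.]+\*|ptr)\s*\)")


def static_ids(ir_line):
    """Return the integer IDs of all inline inttoptr expressions, in order."""
    return [int(m.group(1)) for m in INTTOPTR.finditer(ir_line)]


typed = (
    "call void @__quantum__qis__mz__body("
    "%Qubit* inttoptr (i64 2 to %Qubit*), "
    "%Result* writeonly inttoptr (i64 0 to %Result*))"
)
opaque = (
    "call void @__quantum__qis__mz__body("
    "ptr inttoptr (i64 2 to ptr), "
    "ptr writeonly inttoptr (i64 0 to ptr))"
)

print(static_ids(typed), static_ids(opaque))
```

Both spellings yield the same IDs in the same argument positions, so a consumer that only needs the integers does not have to correlate them with constants defined elsewhere.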

Cons

  • No function-agnostic analysis: Since qubits and results will not be identifiable by type, any analysis or transformation must infer the type from how the argument is used in function calls, which requires recognizing those function calls by name and their arguments by position.
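What this name-and-position inference might look like can be sketched in Python. The `SIGNATURES` table, `split_top_level`, and `classify_args` are hypothetical names introduced here for illustration, and the table lists only a couple of functions; a real tool would need an entry for every function it wants to understand.

```python
import re

# Hypothetical table: function name -> kind of each argument, by position.
SIGNATURES = {
    "__quantum__qis__mz__body": ("qubit", "result"),
    "__quantum__qis__h__body": ("qubit",),
}

# Matches `call <ret-type> @name(<args>)` on a single line of textual IR.
CALL = re.compile(r"call\s+\S+\s+@([\w.]+)\((.*)\)")


def split_top_level(args):
    """Split an argument string on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def classify_args(ir_line):
    """Label each operand of a known call as 'qubit' or 'result'.

    Returns None when the line is not a call to a function in SIGNATURES,
    which is exactly the case where type-free analysis has nothing to go on.
    """
    match = CALL.search(ir_line)
    if match is None:
        return None
    kinds = SIGNATURES.get(match.group(1))
    if kinds is None:
        return None
    operands = split_top_level(match.group(2))
    if len(operands) != len(kinds):
        raise ValueError("argument count does not match the signature table")
    return list(zip(kinds, operands))


print(classify_args("call void @__quantum__qis__mz__body(ptr null, ptr writeonly null)"))
```

A call to a custom function that is missing from the table returns `None`, which is the limitation this bullet describes.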

Using a Typed Alternative

As an alternative, the QIR specification and tooling could be changed to express Qubits and/or Results using a different, non-pointer type that allows for identifying them by type. One such approach explored by the workstream on opaque pointers is using LLVM Target Extension Types (introduced in LLVM 16). This would allow programs to indicate qubits in a target-specific type with unknown size with syntax like:
[image: IR example using a target extension type]
While the elements of this design have not been fully proposed, the participants in the ongoing workstream have discussed some of the trade-offs.

Pros

  • Allows for type-based analysis: tools processing the QIR can perform analysis or transformation based on the type information even if the code includes calls to custom functions.
  • Prevents pointer-specific operations on identifiers: since Qubit IDs are meant to be opaque, static values, using a restrictive type can prevent using them in other APIs or built-in instructions.

Cons

  • Incompatible with existing tools: Using a restrictive type also prevents some use of off-the-shelf compiler tools (such as clang).
  • Not backward compatible: The change in type signature is a breaking change that prevents consuming older, typed-pointer QIR or using existing QIR library implementations, forcing updates across the whole ecosystem.
  • Requires use of new runtime functions: To generate a new, restrictive type, the QIR would need to include calls to runtime functions that return that type, which is a major change from the inline identifiers used with pointers.

Proposal: Embrace Opaque Pointers

Based on the trade-offs, we propose embracing opaque pointers as the means of upgrading to newer LLVM versions. This does not preclude a later switch to stricter types, and the QIR Alliance workstream can continue to explore those options. It does allow a reasonable upgrade process, backward compatibility, and continued use of off-the-shelf compiler tools. The trade-off of losing access to function-agnostic analysis seems reasonable in the short term, as no major examples of these kinds of analysis passes have been deployed, so the impact of the breaking change should be minimal.

From there, the specification updates would be fairly straightforward: instances of %Qubit* and %Result* in examples can be changed to ptr, and the paragraphs that explain the motivation behind the typed pointers can be replaced with a short explanation of the use of ptr to represent opaque identifiers.

In addition, a paragraph or dedicated page can be added to explain the backward compatibility story. Rather than expecting users to pick a version of tools like PyQIR where both typed and opaque pointers are supported, they should simply use the tools that produce the output they require. Since LLVM 18 and 19 can still consume IR and bitcode with typed pointers and automatically convert it to opaque in memory, a user who wishes to just consume QIR should use the latest version of the tools. A user who needs to generate QIR should choose either older or newer versions of their tool based on whether they need to produce typed pointer output or opaque pointer output. This table helps explain the compatibility story:

| Input | Output | Tooling |
| --- | --- | --- |
| Typed Pointer QIR | Typed Pointer QIR | Use LLVM 14 QIR tools |
| Typed Pointer QIR | Opaque Pointer QIR | Use latest LLVM QIR tools |
| Opaque Pointer QIR | Opaque Pointer QIR | Use latest LLVM QIR tools |
| Opaque Pointer QIR | Typed Pointer QIR | NOT SUPPORTED |

The only scenario that is not supported is "downgrading" QIR from opaque pointers to typed pointers, which seems like a reasonable compatibility restriction. With this approach, tools like qir-runner that only consume QIR can be updated first with no new dependencies or breaking changes, while tools like PyQIR can have newer versions released where parsing supports both pointer styles but QIR generation is tied to the version of LLVM (fixes and updates would need to be backported to the older PyQIR version as appropriate). As an example of what the PyQIR changes could look like, see the changes in the swernli/llvm19 working branch. A backend that consumes QIR using PyQIR could then update to the latest LLVM version for that and their other tooling while maintaining compatibility with older clients producing typed pointers. Meanwhile, a given client producing QIR would remain on typed pointers (or provide a configurable switch) until all the consumers they submit to are ready for opaque pointers, and then change over to producing only opaque pointer QIR.
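The compatibility table can also be written down as a small lookup, sketched here in Python. The `choose_tooling` function and the `"typed"` / `"opaque"` labels are names invented for this illustration; the tooling strings are taken from the table above.

```python
TYPED, OPAQUE = "typed", "opaque"

# (input pointer style, output pointer style) -> recommended tooling.
# The opaque -> typed "downgrade" is intentionally absent.
TOOLING = {
    (TYPED, TYPED): "LLVM 14 QIR tools",
    (TYPED, OPAQUE): "latest LLVM QIR tools",
    (OPAQUE, OPAQUE): "latest LLVM QIR tools",
}


def choose_tooling(input_style, output_style):
    """Return the recommended tooling, or None if the combination is unsupported."""
    for style in (input_style, output_style):
        if style not in (TYPED, OPAQUE):
            raise ValueError(f"unknown pointer style: {style!r}")
    return TOOLING.get((input_style, output_style))


print(choose_tooling(TYPED, OPAQUE))
print(choose_tooling(OPAQUE, TYPED))
```

The second call returns `None`, reflecting that converting opaque pointer QIR back to typed pointer QIR is the one unsupported path.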

For the ecosystem, embracing opaque pointers should unlock the ability to use newer LLVM tools with the latest features and fixes without requiring the larger set of changes and development required for a typed alternative.
