[Costs] Framework #913

Merged
merged 15 commits into from
May 15, 2024
16 changes: 15 additions & 1 deletion qualtran/_infra/bloq.py
@@ -40,7 +40,7 @@
from qualtran.cirq_interop import CirqQuregT
from qualtran.cirq_interop.t_complexity_protocol import TComplexity
from qualtran.drawing import WireSymbol
-from qualtran.resource_counting import BloqCountT, GeneralizerT, SympySymbolAllocator
+from qualtran.resource_counting import BloqCountT, CostKey, GeneralizerT, SympySymbolAllocator
from qualtran.simulation.classical_sim import ClassicalValT


@@ -296,6 +296,20 @@ def build_call_graph(self, ssa: 'SympySymbolAllocator') -> Set['BloqCountT']:
         """
         return self.decompose_bloq().build_call_graph(ssa)

+    def my_static_costs(self, cost_key: 'CostKey'):
+        """Override this method to provide static costs.
+
+        The system will query a particular cost by asking for a `cost_key`. This method
+        can optionally provide a value, which will be preferred over a computed cost.
+
+        Static costs can be provided if the particular cost cannot be easily computed or
+        as a performance optimization.
Collaborator

One issue here is that the my_static_costs method does not have access to the costs cache. If it needs to do any non-trivial computation, such as expressing its cost in terms of the costs of its callees, it cannot pass the cache on to get_cost_value.

We faced this issue with the TComplexity protocol, which is why we ended up using a global cache. In this framework, maybe we should accept a cache dictionary as another argument to this method, since the call site already has it?

Collaborator Author

Ah, the intent here is for my_static_costs to be used rarely, and only for truly static costs that you can just declare. Consider it the generic analog of the _t_complexity_ override, where you're really only allowed to fill in numbers.

The cache is available to the compute method, which is analogous to every strategy in the t_complexity protocol except the t_complexity._from_explicit_annotation strategy.

I'll update some docstrings to clarify the expected use of my_static_costs.
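Editor's sketch: the lookup order described in this reply (a declared static cost wins, otherwise the cost key computes a value and only that path sees the cache) might look roughly like the toy model below. It does not import Qualtran, and `ToyBloq`, `ToyTCount`, and `get_cost` are hypothetical names chosen for illustration; the real `CostKey` interface may differ.

```python
class ToyBloq:
    """Hypothetical stand-in for a bloq: a name, callee counts, optional static costs."""

    def __init__(self, name, callees=(), static_costs=None):
        self.name = name
        self.callees = tuple(callees)  # (ToyBloq, count) pairs
        self.static_costs = dict(static_costs or {})

    def my_static_costs(self, cost_key):
        # Declared values only; NotImplemented means "compute it instead".
        return self.static_costs.get(cost_key, NotImplemented)


class ToyTCount:
    """Hypothetical cost key that counts T gates."""

    def compute(self, bloq, get_callee_cost):
        if bloq.name == 'T':
            return 1
        return sum(n * get_callee_cost(callee) for callee, n in bloq.callees)


def get_cost(bloq, cost_key, cache):
    """Return the cost of `bloq`, preferring a static cost over a computed one."""
    if bloq in cache:
        return cache[bloq]
    value = bloq.my_static_costs(cost_key)
    if value is NotImplemented:
        # Only the compute path recurses, so only it benefits from the cache.
        value = cost_key.compute(bloq, lambda b: get_cost(b, cost_key, cache))
    cache[bloq] = value
    return value


key = ToyTCount()
t_gate = ToyBloq('T')
toffoli = ToyBloq('Toffoli', static_costs={key: 4})  # declared, never computed
algo = ToyBloq('Algo', callees=[(t_gate, 10), (toffoli, 100)])
total = get_cost(algo, key, cache={})
```

In this model `my_static_costs` receives neither the cache nor a callee-cost callback, which is the limitation the thread goes on to debate.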

Collaborator

@tanujkhattar, May 9, 2024

I understand that, but I think there's value in supporting the use cases where my_static_costs needs to do something more complex than specifying just numbers.

An example would be supporting the cost key for circuit depth. The compute method would correspond to calling bloq.decompose_bloq() and then (one strategy would be to) find the longest path in a weighted DAG where the weights are given by depth of each subbloq. This will take time at least O(N) if the decomposition has N nodes (unlike other costs like T-complexity, which scale as O(M) where M is number of unique callees and often M << N)
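Editor's sketch: the "longest path in a weighted DAG" strategy mentioned above can be written with the standard library alone. The function name `weighted_depth` and the dictionary-based graph encoding are assumptions for illustration, not part of Qualtran.

```python
from graphlib import TopologicalSorter


def weighted_depth(predecessors, weight):
    """Longest weighted path through a DAG.

    predecessors: dict mapping each node to the nodes that must finish before it.
    weight: dict mapping each node to its own depth.
    """
    finish = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(node, ())), default=0)
        finish[node] = start + weight[node]
    return max(finish.values(), default=0)


# Four subbloqs: 'b' and 'c' both wait on 'a', and 'd' waits on both of them.
preds = {'a': [], 'b': ['a'], 'c': ['a'], 'd': ['b', 'c']}
depths = {'a': 1, 'b': 5, 'c': 2, 'd': 1}
critical_path = weighted_depth(preds, depths)
```

Every node and edge of the decomposition is visited once, which is why the cost grows with the size N of the decomposition rather than with the number of unique callees.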

Now, if we stick to the strategy where my_static_costs specifies just numbers and compute works as described above, the depth computation would be prohibitively slow for a wide variety of bloqs where decomposing the bloq would result in a large number of nodes. Some concrete examples are bloqs based on unary iteration, arithmetic circuits like adders etc.

In all these examples, as bloq authors we very well know the circuit structure and can easily annotate my_static_costs. But it's not enough to specify "a number" - we need the my_static_costs to be an expression in terms of costs of its callees. For unary iteration based bloqs, the my_static_costs would be something like c * N + sum(get_bloq_cost(bloq_for_nth_operation(n)) for n in range(self.selection_range)) - i.e. a constant for the unary iteration tree + depth for each controlled operation. For arithmetic circuits, we'd likely want to specify costs as a function of get_cost(Toffoli); which users can choose to specify by providing the cost dictionary since a Toffoli can have multiple different decompositions optimizing depth vs T-counts etc. and users can choose which decomposition should be used by providing a cost for depth of Toffoli.

TL;DR - I think it's restrictive to assume that my_static_costs would only ever specify truly static costs, and I think there's value in making it extensible to support the use case where it needs to recursively use the costs of its callees.

Collaborator Author

the idea is to group any of this logic in the compute method, which can dispatch based on the type of bloq (analogous to how we provide costs for leaf bloqs in the existing t_complexity protocol).

We should also aim to reduce the number of bloqs where doing an O(N) operation is considered prohibitively costly. In your example, you still need to iterate over self.selection_range bloqs. I guess the presumption is that the constant term absorbs a number of bit-twiddling subbloqs that meaningfully exceeds the selection_range; but we should probably drive our efforts towards encoding the "simple" structure of unary iteration in the bloq decomposition hierarchy

Collaborator

@tanujkhattar, May 9, 2024

> the idea is to group any of this logic in the compute method, which can dispatch based on the type of bloq (analogous to how we provide costs for leaf bloqs in the existing t_complexity protocol).

Yeah, I think that would just not be feasible for the cost types that need to look at the decomposed composite bloq instead of just the bloq counts. I've raised this concern before, and I provided a very concrete example above.

> In your example, you still need to iterate over self.selection_range bloqs. I guess the presumption is that the constant term absorbs a number of bit-twiddling subbloqs that meaningfully exceeds the selection_range

For this specific example, the presumption is that building the composite bloq can be much slower than iterating over the callees, even if both of these take O(N) time. #609 (comment) was an example of how much of a difference this can make.

> We should also aim to reduce the number of bloqs where doing an O(N) operation is considered prohibitively costly

Well, that's a very optimistic goal and I don't think it aligns well with the rest of the philosophy of Qualtran. One of the reasons we have the concept of a call graph is because we think there will be cases where constructing a composite bloq in O(N) time would be prohibitively more expensive than just specifying a list of callees and their bloq counts.

The same logic applies to costs that depend upon the structure of the decomposed composite bloq, like depth. There will be cases where computing the composite bloq and then processing it to figure out the cost would be prohibitively more expensive than writing an expression for computing the cost in terms of the callees. For costs like T-complexity, this "expression" is simple and can be generalized and put as part of the compute method but for costs like depth this "expression" needs to live inside the bloq. That place can be my_static_costs (or we can rename it to be more descriptive for this use case). But as the design stands right now, we don't support this flow at all where users can write the expression for computing a cost for a bloq (in terms of its callees) as part of the bloq itself.

For some more concrete numbers, here are the timings to construct a composite bloq for a 1000 and 5000 bit adder. You can see this is already getting very expensive and these are numbers which we encounter often.

[image: timings for constructing the composite bloq of a 1000-bit and a 5000-bit adder]

So now, if a user were to compute the cost of an adder for depth, they would either not override my_static_costs and we'll pay the time penalty of calling adder.decompose_bloq() (which is huge) OR the user would need to encode an expression that hardcodes the depth of subbloqs like AND gates; which isn't ideal and is similar to what we were doing in T-complexity and got rid of by introducing the call graph hierarchy. The ideal solution here would be to allow users to specify things like adder_depth = c1 * cost(And()) + c2 * cost(And().adjoint()) + c3 * cost(CNOT()) where c1, c2 and c3 are constants which are known to the bloq authors because they know the structure of the decomposition. A function similar to my_static_costs can be used for such annotations.
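Editor's sketch: an annotation of the form `c1 * cost(And()) + c2 * cost(And().adjoint()) + c3 * cost(CNOT())` could be modelled as below. `ToyAdder`, `my_depth`, and `depth_lookup` are hypothetical names, and the coefficients are placeholders chosen for illustration; they are not the real depth of any Qualtran adder.

```python
class ToyAdder:
    """Hypothetical n-bit adder whose author knows its layer structure."""

    def __init__(self, bitsize):
        self.bitsize = bitsize

    def my_depth(self, get_callee_depth):
        # Placeholder coefficients (assumed): one And layer and one And-adjoint
        # layer per carry bit, plus a constant number of CNOT layers.
        n = self.bitsize
        return (
            (n - 1) * get_callee_depth('And')
            + (n - 1) * get_callee_depth('And_adjoint')
            + 3 * get_callee_depth('CNOT')
        )


def depth_lookup(leaf_depths):
    """Build a callee-depth function from user-supplied leaf depths."""

    def get_callee_depth(name):
        return leaf_depths[name]

    return get_callee_depth


adder = ToyAdder(1000)
depth = adder.my_depth(depth_lookup({'And': 4, 'And_adjoint': 1, 'CNOT': 1}))
```

The expression is evaluated without building the decomposition, and a user can swap in different leaf depths (for example a depth-optimized `And`) without touching the adder.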

Collaborator Author

You can put that logic in the compute method.

def compute(bloq):
  if isinstance(bloq, Add):
    return c1 * cost(And())

  return sum(depth(b) for b in bloq.decompose())

Remember: the extensibility axis is to support adding more costs. This is particularly acute for circuit depth, which isn't the most important cost for certain architectures.

Collaborator

> You can put that logic in the compute method.

This means every time a bloq author writes a bloq which is slow to decompose for large instances, they would have to update the compute method of the cost key - this does not scale at all (unless we assume all bloqs are authored by us and we maintain a central repository of these if / else overloads in the compute method of the cost key).

> Remember: the extensibility axis is to support adding more costs. This is particularly acute for circuit depth, which isn't the most important cost for certain architectures

The argument you have given above completely ignores the ease of extensibility for more costs. Since the PR is already merged now, I'll open a new issue to track this so that whenever we end up implementing a cost like depth which depends upon the structure of decomposition, we can revisit this discussion.


+        This method must return `NotImplemented` if a value cannot be provided for the specified
+        CostKey.
+        """
+        return NotImplemented
+
     def call_graph(
         self,
         generalizer: Optional[Union['GeneralizerT', Sequence['GeneralizerT']]] = None,
61 changes: 61 additions & 0 deletions qualtran/bloqs/for_testing/costing.py
@@ -0,0 +1,61 @@
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Any, Sequence, Set, Tuple

from attrs import field, frozen

from qualtran import Bloq, Signature
from qualtran.resource_counting import BloqCountT, CostKey, SympySymbolAllocator


def _convert_callees(callees: Sequence[BloqCountT]) -> Tuple[BloqCountT, ...]:
    # Convert to tuples in a type-checked way.
    return tuple(callees)


@frozen
class CostingBloq(Bloq):
    """A bloq that lets you set the costs via attributes."""

    name: str
    num_qubits: int
    callees: Sequence[BloqCountT] = field(converter=_convert_callees, factory=tuple)
    static_costs: Sequence[Tuple[CostKey, Any]] = field(converter=tuple, factory=tuple)

    @property
    def signature(self) -> 'Signature':
        return Signature.build(register=self.num_qubits)

    def build_call_graph(self, ssa: 'SympySymbolAllocator') -> Set['BloqCountT']:
        return set(self.callees)

    def my_static_costs(self, cost_key: 'CostKey'):
        return dict(self.static_costs).get(cost_key, NotImplemented)

    def pretty_name(self):
        return self.name

    def __str__(self):
        return self.name


def make_example_costing_bloqs():
    from qualtran.bloqs.basic_gates import Hadamard, TGate, Toffoli

    func1 = CostingBloq(
        'Func1', num_qubits=10, callees=[(TGate(), 10), (TGate().adjoint(), 10), (Hadamard(), 10)]
    )
    func2 = CostingBloq('Func2', num_qubits=3, callees=[(Toffoli(), 100)])
    algo = CostingBloq('Algo', num_qubits=100, callees=[(func1, 1), (func2, 1)])
    return algo
32 changes: 32 additions & 0 deletions qualtran/bloqs/for_testing/costing_test.py
@@ -0,0 +1,32 @@
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from qualtran.bloqs.for_testing.costing import make_example_costing_bloqs
from qualtran.resource_counting import format_call_graph_debug_text


def test_costing_bloqs():
    algo = make_example_costing_bloqs()
    g, _ = algo.call_graph()
    assert (
        format_call_graph_debug_text(g)
        == """\
Algo -- 1 -> Func1
Algo -- 1 -> Func2
Func1 -- 10 -> Hadamard()
Func1 -- 10 -> TGate()
Func1 -- 10 -> TGate(is_adjoint=True)
Func2 -- 100 -> Toffoli()
Toffoli() -- 4 -> TGate()"""
    )
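Editor's sketch: the call graph in the expected text above is enough to total a count-style cost by recursing over unique callees. The edge table below is copied from that expected text; counting `TGate(is_adjoint=True)` as one T gate is an assumption made for illustration, and `t_count` is a hypothetical helper, not a Qualtran function.

```python
from functools import lru_cache

# Edges copied from the expected debug text: caller -> [(callee, count), ...].
CALL_GRAPH = {
    'Algo': [('Func1', 1), ('Func2', 1)],
    'Func1': [('Hadamard()', 10), ('TGate()', 10), ('TGate(is_adjoint=True)', 10)],
    'Func2': [('Toffoli()', 100)],
    'Toffoli()': [('TGate()', 4)],
}
# Assumption: T and T-dagger each count as one T gate.
T_LIKE = {'TGate()', 'TGate(is_adjoint=True)'}


@lru_cache(maxsize=None)
def t_count(bloq):
    """Total T-like gates reachable from `bloq`; each unique bloq is computed once."""
    if bloq in T_LIKE:
        return 1
    return sum(n * t_count(callee) for callee, n in CALL_GRAPH.get(bloq, ()))


algo_t_count = t_count('Algo')  # 10 + 10 from Func1, 100 * 4 from Func2
```

Because results are memoized per bloq, the work scales with the number of unique callees rather than with the number of nodes in a full decomposition.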
2 changes: 1 addition & 1 deletion qualtran/bloqs/phase_estimation/lp_resource_state.py
@@ -20,7 +20,7 @@
 import cirq
 import numpy as np
 import sympy
-from numpy._typing import NDArray
+from numpy.typing import NDArray

 from qualtran import (
     Bloq,
4 changes: 3 additions & 1 deletion qualtran/resource_counting/__init__.py
@@ -25,8 +25,10 @@
SympySymbolAllocator,
get_bloq_callee_counts,
get_bloq_call_graph,
print_counts_graph,
build_cbloq_call_graph,
format_call_graph_debug_text,
)

from ._costing import GeneralizerT, get_cost_value, get_cost_cache, query_costs, CostKey, CostValT

from . import generalizers
15 changes: 9 additions & 6 deletions qualtran/resource_counting/_call_graph.py
@@ -14,7 +14,7 @@

 """Functionality for the `Bloq.call_graph()` protocol."""

-import collections.abc as abc
+import collections.abc
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple, Union

@@ -231,7 +231,7 @@ def get_bloq_call_graph(
         keep = lambda b: False
     if generalizer is None:
         generalizer = lambda b: b
-    if isinstance(generalizer, abc.Sequence):
+    if isinstance(generalizer, collections.abc.Sequence):
         generalizer = _make_composite_generalizer(*generalizer)

     g = nx.DiGraph()
@@ -243,8 +243,11 @@ def get_bloq_call_graph(
     return g, sigma


-def print_counts_graph(g: nx.DiGraph):
+def format_call_graph_debug_text(g: nx.DiGraph) -> str:
     """Print the graph returned from `get_bloq_counts_graph`."""
-    for b in nx.topological_sort(g):
-        for succ in g.succ[b]:
-            print(b, '--', g.edges[b, succ]['n'], '->', succ)
+    lines = []
+    for gen in nx.topological_generations(g):
+        for b in sorted(gen, key=str):
+            for succ in sorted(g.succ[b], key=str):
+                lines.append(f"{b} -- {g.edges[b, succ]['n']} -> {succ}")
+    return '\n'.join(lines)
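Editor's sketch: the new function walks the graph one topological generation at a time and sorts within each generation, which makes the text independent of edge insertion order. The pure-Python version below mimics that ordering without networkx; `format_edges_by_generation` and its edge-dictionary input are assumptions for illustration, not the library's API.

```python
def format_edges_by_generation(edges):
    """edges: dict mapping (caller, callee) -> count. Returns deterministic debug text."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for caller, callee in edges:
        succ[caller].append(callee)
        indegree[callee] += 1

    lines = []
    # Generation 0 holds the nodes that nothing calls.
    generation = sorted(n for n in nodes if indegree[n] == 0)
    while generation:
        next_generation = []
        for b in generation:
            for s in sorted(succ[b]):
                lines.append(f"{b} -- {edges[b, s]} -> {s}")
                indegree[s] -= 1
                if indegree[s] == 0:
                    # All callers of s are done, so s joins the next generation.
                    next_generation.append(s)
        generation = sorted(next_generation)
    return '\n'.join(lines)


# Edges deliberately listed out of order to show the output does not depend on it.
example_edges = {
    ('Toffoli()', 'TGate()'): 4,
    ('Func2', 'Toffoli()'): 100,
    ('Func1', 'TGate(is_adjoint=True)'): 10,
    ('Algo', 'Func2'): 1,
    ('Func1', 'TGate()'): 10,
    ('Func1', 'Hadamard()'): 10,
    ('Algo', 'Func1'): 1,
}
text = format_edges_by_generation(example_edges)
```

Returning a string instead of printing is what lets a test compare the whole graph against a literal.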