-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid copying unnecessary buffers between simulation iterations #4789
Conversation
cirq-core/cirq/sim/act_on_args.py
Outdated
args._log_of_measurement_results = self.log_of_measurement_results.copy() | ||
return args | ||
|
||
def _on_copy(self: TSelf, args: TSelf): | ||
def _on_copy(self: TSelf, args: TSelf, with_buffer: bool = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is meant to be overridden, I'm not sure we can add a parameter to it without breaking external implementations of ActOnArgs, unless there's some Python magic that allows it. We may have to add a separate _on_copy_reusing_buffer
(which has a default implementation calling _on_copy()
), and have copy
call the appropriate one according to the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think I prefer the arg name reuse_buffer
with default value False.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that adding a parameter will break external implementations of ActOnArgs
, but adding a new method has drawbacks, too. The most challenging part is to connect the new method with copy
. I noticed that you have updated the issue. So do you now support the way of adding a parameter to copy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a tough one. Technically changing OperationTarget.copy
's signature is a breaking change too, though I never intended OperationTarget
to be implemented directly by third parties.
I'd probably go about it the way you've done so far. But in ActOnArgs.copy
, I'd suggest adding an if 'reuse_buffer' in inspect.signature(_on_copy)
check, then either pass or don't pass the argument accordingly. Then also if reuse_buffer
is not in the signature, emit a deprecation warning. We could do the same thing when calling a.copy(reuse_buffer)
in ActOnArgsContainer if we want to be especially cautious. (Sorry about changing the issue mid-flight, I didn't think anybody would be actively working on it).
@95-martin-orion do you have an opinion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daxfohl I am just trying to learn new things, and I have learned a lot so far! Thanks for your patient explanations. I will look into the suggestions you list here and in the issue.
target.target_tensor = self.target_tensor.copy() | ||
if with_buffer: | ||
if reuse_buffer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not reuse_buffer
here and above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is: if the buffer is copied to the new object, then it can be "reused"; otherwise, the object might need to create its own buffer (from scratch). The point is, the context of the parameter is only the copy
method. Its semantics should not be attached to a specific usage such as _run
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In act_on_args.copy in line 118, it does a shallow copy, meaning that if we don't do anything, then the buffer instance itself is not copied. Thus if reuse_buffer
is set here, that means we don't have to do anything else. If reuse_buffer
is false, then that means the calling function wants to forcibly deep-copy the buffer. So this condition needs changed to if not reuse_buffer
.
I'm not quite sure I catch the question about semantics, but whether-or-not-to-copy-the-buffer is context-dependent, not data-structure-dependent. If the calling function is running repetitions serially (as _run
does), then the buffer can be safely reused from one repetition to another. If the user is running them in parallel, then all referencing the same buffer is obviously unsafe. So it depends on the use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, a few more comments
@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'): | |||
protocols.act_on(op, self) | |||
|
|||
@abc.abstractmethod | |||
def copy(self: TSelfTarget) -> TSelfTarget: | |||
def copy(self: TSelfTarget, reuse_buffer: bool = False) -> TSelfTarget: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to add Args to the docstring with an explanation of reuse_buffer
. (Not just "True to reuse the buffer". Something like "If True, any buffers will be reused by the copy. This will save time by avoiding a deep copy of the buffers, but should only be used when there is no chance that the two objects will be writing to the buffers simultaneously.")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the paragraph to the docstring, but I do not fully understand it. It seems that there are no buffers shared between the iterations of the simulation.
Cirq/cirq-core/cirq/sim/simulator_base.py
Lines 268 to 278 in b60b9f8
for i in range(repetitions): | |
all_step_results = self._core_iterator( | |
general_suffix, | |
sim_state=act_on_args.copy() if i < repetitions - 1 else act_on_args, | |
) | |
for step_result in all_step_results: | |
pass | |
for k, v in step_result.measurements.items(): | |
if k not in measurements: | |
measurements[k] = [] | |
measurements[k].append(np.array(v, dtype=np.uint8)) |
I cannot find where the shared buffer is stored. It is not in ActOnDensityMatrixArgs
because not copying the buffer does not affect the simulation. And it is not in DensityMatrixSimulator
either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment old? You have updated the function in simulator_base
to apply the resuse_buffers=True
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not discussed it yet and I still hold the same opinion. I agree that without copying buffers would improve performance. I am just not sure about the name reuse_buffer
and the explanation in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, how would you phrase it? My wording probably wasn't the best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we say skip_buffer
? The docstring can be something like True to skip copying buffers
. I think we can just describe the behavior related to the parameter instead of the purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that sounds to me like it would just set the new object buffer to null. But I think I understand your point in that reuse_buffers could be confusing too. Maybe something more direct like shallow_copy_buffers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I support a more direct way. But, is the buffer shallow copied? I only write
if not reuse_buffer:
target.available_buffer = [b.copy() for b in self.available_buffer]
in act_on_density_matrix_args.py
. Is this what you expected? I did not assign anything to target.available_buffer
if reuse_buffer
is True
, because I think it will simply not be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well by "shallow copied", I mean the reference to available_buffer
is copied, which occurs in ActOnArgs.copy
. So documentation-wise, I guess it depends on whether you're viewing it from the perspective of the user who is calling copy
or the perspective of the user who is implementing a new subclass of ActOnArgs
.
Maybe clearer in both situations would be to call it deep_copy_buffers
, and default that to True. I always prefer naming booleans such that the default is False, but perhaps it's more clear to use deep_copy_buffers=True
as default here. (There are plenty of other places where we default a boolean to True, so it wouldn't be an oddball or anything). WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I will update the code soon.
cirq-core/cirq/sim/act_on_args.py
Outdated
"""Creates a copy of the object.""" | ||
args = copy.copy(self) | ||
self._on_copy(args) | ||
self._on_copy(args, reuse_buffer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to do a if 'reuse_buffer' in inspect.signature(_on_copy)
check here for backwards compatibility, and output a deprecation warning in the else
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the deprecation warning and a test for it, but I am not sure whether I am doing it in the right way. I can continue improving it based on your feedback.
cirq-core/cirq/sim/simulator_base.py
Outdated
@@ -268,7 +268,7 @@ def _run( | |||
for i in range(repetitions): | |||
all_step_results = self._core_iterator( | |||
general_suffix, | |||
sim_state=act_on_args.copy() if i < repetitions - 1 else act_on_args, | |||
sim_state=act_on_args.copy(True) if i < repetitions - 1 else act_on_args, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer named arg style .copy(reuse_buffer=True)
, for boolean args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. The code is changed
Also remove buffer from __repr__
@daxfohl I think I have implemented all the points you mentioned in this PR and in the issue. Please let me know if I missed something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far, just some test changes left I think. BTW you can ignore the part of the issue where it says don't emit the buffer in the repr. I'm still not sure about that one.
@@ -226,6 +228,12 @@ def test_copy_succeeds(): | |||
assert copied.qubits == (q0, q1) | |||
|
|||
|
|||
def test_copy_deprecation_warning(): | |||
args = create_container(qs2, False) | |||
with pytest.warns(DeprecationWarning, match='reuse_buffer'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use with cirq.testing.assert_deprecated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. The code has been updated.
Also add back buffer to __repr__.
Also remove unused import
Thanks. I have changed the way of testing and added back the buffer in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these look good and will be a nice improvement. Just two small things remaining. I'm curious about your earlier comment though https://github.com/quantumlib/Cirq/pull/4789/files#r778747253. Am I missing something?
self.available_buffer = available_buffer | ||
if qid_shape is None: | ||
target_shape = target_tensor.shape | ||
assert len(target_shape) % 2 == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should raise ValueError
. (Typically users wouldn't call this constructor directly so assertion kind of makes sense here, but it's still a public function of a public class, so raising explicit errors is preferred).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. It has been updated in the new commit. A test case is added as well.
@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'): | |||
protocols.act_on(op, self) | |||
|
|||
@abc.abstractmethod | |||
def copy(self: TSelfTarget) -> TSelfTarget: | |||
def copy(self: TSelfTarget, reuse_buffer: bool = False) -> TSelfTarget: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment old? You have updated the function in simulator_base
to apply the resuse_buffers=True
option.
cirq-core/cirq/sim/act_on_args.py
Outdated
warnings.warn( | ||
( | ||
'A new parameter reuse_buffer has been added to ActOnArgs._on_copy(). ' | ||
'The classes that inherit from ActOnArgs should support it before Cirq 0.25.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.15 should be sufficient everywhere :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
@daxfohl I have updated the code. There seems something wrong with the CI system. I hope the errors can automatically go away when merging a new commit from the master branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly happy to defer to Dax's review on this, as he's far more familiar with the innards of Simulator than I am at this point. However, I have a couple of items to check.
@@ -226,6 +227,12 @@ def test_copy_succeeds(): | |||
assert copied.qubits == (q0, q1) | |||
|
|||
|
|||
def test_copy_deprecation_warning(): | |||
args = create_container(qs2, False) | |||
with cirq.testing.assert_deprecated('reuse_buffer', deadline='0.25'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this (and other deadlines) be 0.15
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I missed them.
cirq-core/cirq/sim/act_on_args.py
Outdated
@@ -113,14 +115,34 @@ def _perform_measurement(self, qubits: Sequence['cirq.Qid']) -> List[int]: | |||
"""Child classes that perform measurements should implement this with | |||
the implementation.""" | |||
|
|||
def copy(self: TSelf) -> TSelf: | |||
"""Creates a copy of the object.""" | |||
def copy(self: TSelf, reuse_buffer: bool = False) -> TSelf: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's definitely a breaking change here, in that any external subclass of ActOnArgs
(if one exists...) with its own definition of copy
will break when copy
is called on it by Cirq code that uses both arguments.
In hindsight, the right way to prevent this probably involves private classes (which Python is allergic to), but given how heavily internal this code is I think we can get away with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We managed to make this non breaking with some reflection. Hence the test that still works. Warning messages are appropriately emitted when the new argument does not exist (and there are tests around that as well).
This could be a good reference for doing such things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we check the signature before actually calling the copy
(like the case in SimulatorBase._run
), we can avoid errors.
def test_copy_deprecation_warning(): | ||
args = create_container(qs2, False) | ||
with cirq.testing.assert_deprecated('reuse_buffer', deadline='0.25'): | ||
args.copy(True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Showing my python ignorance here, but how/why does this call work? EmptyActOnArgs.copy
doesn't accept an argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is because args
is a ActOnArgsContainer
and in ActOnArgsContainer.copy
:
if 'reuse_buffer' in inspect.signature(act_on_args.copy).parameters:
copies[act_on_args] = act_on_args.copy(reuse_buffer)
else:
warnings.warn(
(
'A new parameter reuse_buffer has been added to ActOnArgs.copy(). The '
'classes that inherit from ActOnArgs should support it before Cirq 0.15.'
),
DeprecationWarning,
)
copies[act_on_args] = act_on_args.copy()
So, it does not directly call EmptyActOnArgs.copy
.
available_buffer: List[np.ndarray] = None, | ||
qid_shape: Tuple[int, ...] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need their types changes to Optional[...]
. (Not sure why mypy didn't catch this...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was curious about this too. It's explained in the PEP. It works implicitly for now but is no longer recommended, and tooling will soon be changed to catch these. https://www.python.org/dev/peps/pep-0484/#union-types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I will change them...
@@ -40,7 +40,7 @@ class ActOnStateVectorArgs(ActOnArgs): | |||
def __init__( | |||
self, | |||
target_tensor: np.ndarray, | |||
available_buffer: np.ndarray, | |||
available_buffer: np.ndarray = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this needs to be Optional[np.ndarray]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just update docstring and it should be ready to merge.
cirq-core/cirq/sim/act_on_args.py
Outdated
"""Creates a copy of the object. | ||
|
||
Args: | ||
deep_copy_buffers: If True, buffers will also be copied. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just be more thorough -- If True, buffers will also be deep-copied. Otherwise the copy will share a reference to the original object's buffers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally understand the word reference and reuse now! Please check my last commit. I think it is what you intended to see. Sorry that I missed your point all this long time!
@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'): | |||
protocols.act_on(op, self) | |||
|
|||
@abc.abstractmethod | |||
def copy(self: TSelfTarget) -> TSelfTarget: | |||
def copy(self: TSelfTarget, deep_copy_buffers: bool = True) -> TSelfTarget: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the docstring here too since it's the base interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
LGTM, cc @95-martin-orion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks Jintao for the PR and Dax for the in-depth review!
Thanks Jintao, if you're looking for something related to work on, #4825 could be an option. The idea is that all ActOnArgs subclasses should be instantiable from an |
…tumlib#4789) This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised. This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided. close quantumlib#4779
…tumlib#4789) This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised. This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided. close quantumlib#4779
…tumlib#4789) This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised. This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided. close quantumlib#4779
This PR implements the task proposed in #4779. In each iteration of a simulation, the entire
ActOnArgs
is copied, including buffers. It is not necessary and adds additional cost, especially forDensityMatrixSimulator
. Therefore, a parameterwith_buffer
is added to thecopy
method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised.This PR also modifies the
__init__
method ofDensityMatrixSimulator
andActOnStateVectorArgs
to create the buffer and qid_shape parameters when they are not provided.close #4779