Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid copying unnecessary buffers between simulation iterations #4789

Merged
merged 26 commits into from
Jan 14, 2022

Conversation

yjt98765
Copy link
Contributor

@yjt98765 yjt98765 commented Dec 31, 2021

This PR implements the task proposed in #4779. In each iteration of a simulation, the entire ActOnArgs is copied, including buffers. It is not necessary and adds additional cost, especially for DensityMatrixSimulator. Therefore, a parameter with_buffer is added to the copy method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised.

This PR also modifies the __init__ method of DensityMatrixSimulator and ActOnStateVectorArgs to create the buffer and qid_shape parameters when they are not provided.

close #4779

@yjt98765 yjt98765 requested review from cduck, vtomole and a team as code owners December 31, 2021 01:50
@CirqBot CirqBot added the size: S 10< lines changed <50 label Dec 31, 2021
args._log_of_measurement_results = self.log_of_measurement_results.copy()
return args

def _on_copy(self: TSelf, args: TSelf):
def _on_copy(self: TSelf, args: TSelf, with_buffer: bool = True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is meant to be overridden, I'm not sure we can add a parameter to it without breaking external implementations of ActOnArgs, unless there's some Python magic that allows it. We may have to add a separate _on_copy_reusing_buffer (which has a default implementation calling _on_copy()), and have copy call the appropriate one according to the input.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think I prefer the arg name reuse_buffer with default value False.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that adding a parameter will break external implementations of ActOnArgs, but adding a new method has drawbacks, too. The most challenging part is to connect the new method with copy. I noticed that you have updated the issue. So do you now support the way of adding a parameter to copy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a tough one. Technically changing OperationTarget.copy's signature is a breaking change too, though I never intended OperationTarget to be implemented directly by third parties.

I'd probably go about it the way you've done so far. But in ActOnArgs.copy, I'd suggest adding an if 'reuse_buffer' in inspect.signature(_on_copy) check, then either pass or don't pass the argument accordingly. Then also if reuse_buffer is not in the signature, emit a deprecation warning. We could do the same thing when calling a.copy(reuse_buffer) in ActOnArgsContainer if we want to be especially cautious. (Sorry about changing the issue mid-flight, I didn't think anybody would be actively working on it).

@95-martin-orion do you have an opinion here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daxfohl I am just trying to learn new things, and I have learned a lot so far! Thanks for your patient explanations. I will look into the suggestions you list here and in the issue.

target.target_tensor = self.target_tensor.copy()
if with_buffer:
if reuse_buffer:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not reuse_buffer here and above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is: if the buffer is copied to the new object, then it can be "reused"; otherwise, the object might need to create its own buffer (from scratch). The point is, the context of the parameter is only the copy method. Its semantics should not be attached to a specific usage such as _run. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In act_on_args.copy in line 118, it does a shallow copy, meaning that if we don't do anything, then the buffer instance itself is not copied. Thus if reuse_buffer is set here, that means we don't have to do anything else. If reuse_buffer is false, then that means the calling function wants to forcibly deep-copy the buffer. So this condition needs changed to if not reuse_buffer.

I'm not quite sure I catch the question about semantics, but whether-or-not-to-copy-the-buffer is context-dependent, not data-structure-dependent. If the calling function is running repetitions serially (as _run does), then the buffer can be safely reused from one repetition to another. If the user is running them in parallel, then all referencing the same buffer is obviously unsafe. So it depends on the use case.

Copy link
Contributor

@daxfohl daxfohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, a few more comments

@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'):
protocols.act_on(op, self)

@abc.abstractmethod
def copy(self: TSelfTarget) -> TSelfTarget:
def copy(self: TSelfTarget, reuse_buffer: bool = False) -> TSelfTarget:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to add Args to the docstring with an explanation of reuse_buffer. (Not just "True to reuse the buffer". Something like "If True, any buffers will be reused by the copy. This will save time by avoiding a deep copy of the buffers, but should only be used when there is no chance that the two objects will be writing to the buffers simultaneously.")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the paragraph to the docstring, but I do not fully understand it. It seems that there are no buffers shared between the iterations of the simulation.

for i in range(repetitions):
all_step_results = self._core_iterator(
general_suffix,
sim_state=act_on_args.copy() if i < repetitions - 1 else act_on_args,
)
for step_result in all_step_results:
pass
for k, v in step_result.measurements.items():
if k not in measurements:
measurements[k] = []
measurements[k].append(np.array(v, dtype=np.uint8))

I cannot find where the shared buffer is stored. It is not in ActOnDensityMatrixArgs because not copying the buffer does not affect the simulation. And it is not in DensityMatrixSimulator either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment old? You have updated the function in simulator_base to apply the resuse_buffers=True option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have not discussed it yet and I still hold the same opinion. I agree that without copying buffers would improve performance. I am just not sure about the name reuse_buffer and the explanation in the docstring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, how would you phrase it? My wording probably wasn't the best.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we say skip_buffer? The docstring can be something like True to skip copying buffers. I think we can just describe the behavior related to the parameter instead of the purpose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that sounds to me like it would just set the new object buffer to null. But I think I understand your point in that reuse_buffers could be confusing too. Maybe something more direct like shallow_copy_buffers?

Copy link
Contributor Author

@yjt98765 yjt98765 Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I support a more direct way. But, is the buffer shallow copied? I only write

        if not reuse_buffer:
            target.available_buffer = [b.copy() for b in self.available_buffer]

in act_on_density_matrix_args.py. Is this what you expected? I did not assign anything to target.available_buffer if reuse_buffer is True, because I think it will simply not be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well by "shallow copied", I mean the reference to available_buffer is copied, which occurs in ActOnArgs.copy. So documentation-wise, I guess it depends on whether you're viewing it from the perspective of the user who is calling copy or the perspective of the user who is implementing a new subclass of ActOnArgs.

Maybe clearer in both situations would be to call it deep_copy_buffers, and default that to True. I always prefer naming booleans such that the default is False, but perhaps it's more clear to use deep_copy_buffers=True as default here. (There are plenty of other places where we default a boolean to True, so it wouldn't be an oddball or anything). WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I will update the code soon.

"""Creates a copy of the object."""
args = copy.copy(self)
self._on_copy(args)
self._on_copy(args, reuse_buffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to do a if 'reuse_buffer' in inspect.signature(_on_copy) check here for backwards compatibility, and output a deprecation warning in the else branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the deprecation warning and a test for it, but I am not sure whether I am doing it in the right way. I can continue improving it based on your feedback.

@@ -268,7 +268,7 @@ def _run(
for i in range(repetitions):
all_step_results = self._core_iterator(
general_suffix,
sim_state=act_on_args.copy() if i < repetitions - 1 else act_on_args,
sim_state=act_on_args.copy(True) if i < repetitions - 1 else act_on_args,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer named arg style .copy(reuse_buffer=True), for boolean args.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. The code is changed

@yjt98765
Copy link
Contributor Author

yjt98765 commented Jan 5, 2022

@daxfohl I think I have implemented all the points you mentioned in this PR and in the issue. Please let me know if I missed something.

Copy link
Contributor

@daxfohl daxfohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far, just some test changes left I think. BTW you can ignore the part of the issue where it says don't emit the buffer in the repr. I'm still not sure about that one.

@@ -226,6 +228,12 @@ def test_copy_succeeds():
assert copied.qubits == (q0, q1)


def test_copy_deprecation_warning():
args = create_container(qs2, False)
with pytest.warns(DeprecationWarning, match='reuse_buffer'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use with cirq.testing.assert_deprecated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. The code has been updated.

cirq-core/cirq/sim/act_on_args_test.py Outdated Show resolved Hide resolved
@yjt98765
Copy link
Contributor Author

yjt98765 commented Jan 6, 2022

Thanks. I have changed the way of testing and added back the buffer in __prep__.

Copy link
Contributor

@daxfohl daxfohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these look good and will be a nice improvement. Just two small things remaining. I'm curious about your earlier comment though https://github.com/quantumlib/Cirq/pull/4789/files#r778747253. Am I missing something?

self.available_buffer = available_buffer
if qid_shape is None:
target_shape = target_tensor.shape
assert len(target_shape) % 2 == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should raise ValueError. (Typically users wouldn't call this constructor directly so assertion kind of makes sense here, but it's still a public function of a public class, so raising explicit errors is preferred).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. It has been updated in the new commit. A test case is added as well.

@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'):
protocols.act_on(op, self)

@abc.abstractmethod
def copy(self: TSelfTarget) -> TSelfTarget:
def copy(self: TSelfTarget, reuse_buffer: bool = False) -> TSelfTarget:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment old? You have updated the function in simulator_base to apply the resuse_buffers=True option.

warnings.warn(
(
'A new parameter reuse_buffer has been added to ActOnArgs._on_copy(). '
'The classes that inherit from ActOnArgs should support it before Cirq 0.25.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.15 should be sufficient everywhere :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@yjt98765
Copy link
Contributor Author

yjt98765 commented Jan 12, 2022

@daxfohl I have updated the code. There seems something wrong with the CI system. I hope the errors can automatically go away when merging a new commit from the master branch.

Copy link
Collaborator

@95-martin-orion 95-martin-orion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly happy to defer to Dax's review on this, as he's far more familiar with the innards of Simulator than I am at this point. However, I have a couple of items to check.

@@ -226,6 +227,12 @@ def test_copy_succeeds():
assert copied.qubits == (q0, q1)


def test_copy_deprecation_warning():
args = create_container(qs2, False)
with cirq.testing.assert_deprecated('reuse_buffer', deadline='0.25'):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this (and other deadlines) be 0.15 instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I missed them.

@@ -113,14 +115,34 @@ def _perform_measurement(self, qubits: Sequence['cirq.Qid']) -> List[int]:
"""Child classes that perform measurements should implement this with
the implementation."""

def copy(self: TSelf) -> TSelf:
"""Creates a copy of the object."""
def copy(self: TSelf, reuse_buffer: bool = False) -> TSelf:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's definitely a breaking change here, in that any external subclass of ActOnArgs (if one exists...) with its own definition of copy will break when copy is called on it by Cirq code that uses both arguments.

In hindsight, the right way to prevent this probably involves private classes (which Python is allergic to), but given how heavily internal this code is I think we can get away with this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We managed to make this non breaking with some reflection. Hence the test that still works. Warning messages are appropriately emitted when the new argument does not exist (and there are tests around that as well).

This could be a good reference for doing such things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we check the signature before actually calling the copy (like the case in SimulatorBase._run), we can avoid errors.

def test_copy_deprecation_warning():
args = create_container(qs2, False)
with cirq.testing.assert_deprecated('reuse_buffer', deadline='0.25'):
args.copy(True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Showing my python ignorance here, but how/why does this call work? EmptyActOnArgs.copy doesn't accept an argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because args is a ActOnArgsContainer and in ActOnArgsContainer.copy:

            if 'reuse_buffer' in inspect.signature(act_on_args.copy).parameters:
                copies[act_on_args] = act_on_args.copy(reuse_buffer)
            else:
                warnings.warn(
                    (
                        'A new parameter reuse_buffer has been added to ActOnArgs.copy(). The '
                        'classes that inherit from ActOnArgs should support it before Cirq 0.15.'
                    ),
                    DeprecationWarning,
                )
                copies[act_on_args] = act_on_args.copy()

So, it does not directly call EmptyActOnArgs.copy.

Comment on lines 39 to 40
available_buffer: List[np.ndarray] = None,
qid_shape: Tuple[int, ...] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need their types changes to Optional[...]. (Not sure why mypy didn't catch this...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious about this too. It's explained in the PEP. It works implicitly for now but is no longer recommended, and tooling will soon be changed to catch these. https://www.python.org/dev/peps/pep-0484/#union-types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I will change them...

@@ -40,7 +40,7 @@ class ActOnStateVectorArgs(ActOnArgs):
def __init__(
self,
target_tensor: np.ndarray,
available_buffer: np.ndarray,
available_buffer: np.ndarray = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this needs to be Optional[np.ndarray].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@95-martin-orion 95-martin-orion added BREAKING CHANGE For pull requests that are important to mention in release notes. and removed BREAKING CHANGE For pull requests that are important to mention in release notes. labels Jan 12, 2022
Copy link
Contributor

@daxfohl daxfohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just update docstring and it should be ready to merge.

"""Creates a copy of the object.

Args:
deep_copy_buffers: If True, buffers will also be copied.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just be more thorough -- If True, buffers will also be deep-copied. Otherwise the copy will share a reference to the original object's buffers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally understand the word reference and reuse now! Please check my last commit. I think it is what you intended to see. Sorry that I missed your point all this long time!

@@ -68,7 +68,7 @@ def apply_operation(self, op: 'cirq.Operation'):
protocols.act_on(op, self)

@abc.abstractmethod
def copy(self: TSelfTarget) -> TSelfTarget:
def copy(self: TSelfTarget, deep_copy_buffers: bool = True) -> TSelfTarget:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the docstring here too since it's the base interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@daxfohl
Copy link
Contributor

daxfohl commented Jan 14, 2022

LGTM, cc @95-martin-orion

Copy link
Collaborator

@95-martin-orion 95-martin-orion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Jintao for the PR and Dax for the in-depth review!

@95-martin-orion 95-martin-orion added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Jan 14, 2022
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Jan 14, 2022
@CirqBot CirqBot merged commit 20b577c into quantumlib:master Jan 14, 2022
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Jan 14, 2022
@daxfohl
Copy link
Contributor

daxfohl commented Jan 14, 2022

Thanks Jintao, if you're looking for something related to work on, #4825 could be an option. The idea is that all ActOnArgs subclasses should be instantiable from an int as the first argument if qubits is provided. Currently the logic for converting from int to, say, state vector, lies in the corresponding _create_partial_act_on_args and just has to be moved.

@yjt98765
Copy link
Contributor Author

@daxfohl Thank you so much for the help! You gave me a detailed tutorial in reviewing this pull request. I learned a lot from it!

Yes, I would like to work on #4825 and will try to perform better 😄

@yjt98765 yjt98765 deleted the actonarg branch January 15, 2022 01:10
MichaelBroughton pushed a commit to MichaelBroughton/Cirq that referenced this pull request Jan 22, 2022
…tumlib#4789)

This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised.

This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided.

close quantumlib#4779
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
…tumlib#4789)

This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised.

This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided.

close quantumlib#4779
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this pull request Oct 31, 2024
…tumlib#4789)

This PR implements the task proposed in quantumlib#4779. In each iteration of a simulation, the entire `ActOnArgs` is copied, including buffers. It is not necessary and adds additional cost, especially for `DensityMatrixSimulator`. Therefore, a parameter `with_buffer` is added to the `copy` method to indicate whether buffers are also needed to be copied. For third-party simulators that have not added the parameter, a deprecation warning is raised.

This PR also modifies the `__init__` method of `DensityMatrixSimulator` and `ActOnStateVectorArgs` to create the buffer and qid_shape parameters when they are not provided.

close quantumlib#4779
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: S 10< lines changed <50
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Buffers unnecessarily copied between repetitions
4 participants