Fix SPAM errors introducing overhead #529

Merged: 6 commits merged into pasqal-io:develop from spam_overhead on Jun 2, 2023

Conversation


@dakk dakk commented May 30, 2023

@dakk dakk changed the title Handle eps_p = eps = 0 case Fix SPAM errors introducing overhead May 30, 2023

@HGSilveri HGSilveri left a comment

Good start, but I think the improvement from this alone will be quite small. The main goal is to avoid unnecessarily repeating the time-consuming part where the flips are calculated.
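
To give an idea of the kind of restructuring I mean, here is a rough sketch (a hypothetical helper with made-up names, not a diff of simresults.py): draw the flips for all shots of a distinct bitstring in one batched call, instead of once per shot.

    import numpy as np
    from collections import Counter

    def apply_detection_errors(samples: Counter, eps: float, eps_p: float) -> Counter:
        """Apply false positives (eps, 0->1) and false negatives (eps_p, 1->0).

        `samples` maps bitstrings to shot counts; the flips for all shots of
        a given bitstring are drawn in a single vectorized call.
        """
        if eps == 0.0 and eps_p == 0.0:  # no detection errors: nothing to do
            return samples
        rng = np.random.default_rng()
        out: Counter = Counter()
        for bitstring, count in samples.items():
            bits = np.array(list(bitstring), dtype=int)      # shape (n_qubits,)
            p_flip = np.where(bits == 0, eps, eps_p)         # per-bit flip probability
            flips = rng.random((count, bits.size)) < p_flip  # (count, n_qubits)
            flipped = bits ^ flips                           # broadcasted XOR
            for row in flipped:
                out["".join(map(str, row))] += 1
        return out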

(review thread on pulser-simulation/pulser_simulation/simresults.py; outdated, resolved)
@dakk dakk marked this pull request as ready for review May 31, 2023 07:35
@dakk dakk requested a review from HGSilveri May 31, 2023 07:37

@HGSilveri HGSilveri left a comment

This is already quite a nice optimization, but I think we can go further. I'll write up another comment detailing how.

(review thread on pulser-simulation/pulser_simulation/simresults.py; outdated, resolved)
@HGSilveri

A question for you, @dakk: how is the runtime scaling with samples_per_run now?

@dakk dakk commented May 31, 2023

> This is already quite a nice optimization, but I think we can go further. I'll write up another comment detailing how.

Indeed, I made a new optimization in the last commit by using tuples instead of strings for sample_dict indexing. Please take a look.

(I removed the screenshot from the issue, since I noticed there was an error in my profiling script)
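
For context, here is a toy micro-benchmark of mine (not code from the PR) showing why the keying matters: building a string key with "".join on every sample costs noticeably more than hashing a ready-made tuple of ints.

    import timeit

    bits = (0, 1, 1, 0, 1, 0, 0, 1)

    def count_with_str_keys(n=100_000):
        d = {}
        for _ in range(n):
            key = "".join(map(str, bits))   # a string is built on every sample
            d[key] = d.get(key, 0) + 1
        return d

    def count_with_tuple_keys(n=100_000):
        d = {}
        for _ in range(n):
            d[bits] = d.get(bits, 0) + 1    # the tuple hashes directly, no join
        return d

    print("str keys:  ", timeit.timeit(count_with_str_keys, number=10))
    print("tuple keys:", timeit.timeit(count_with_tuple_keys, number=10))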

@dakk dakk commented May 31, 2023

> A question for you, @dakk: how is the runtime scaling with samples_per_run now?

It now scales sub-linearly; here is a test run with different values of samples_per_run:
Samples_per_run: 1 Elapsed: 2.8112339973449707 seconds
Samples_per_run: 10 Elapsed: 3.293823003768921 seconds
Samples_per_run: 100 Elapsed: 5.302839040756226 seconds
Samples_per_run: 500 Elapsed: 11.655674457550049 seconds
Samples_per_run: 1000 Elapsed: 19.241939544677734 seconds
Samples_per_run: 2000 Elapsed: 36.630615234375 seconds
Samples_per_run: 5000 Elapsed: 84.35432386398315 seconds
Samples_per_run: 10000 Elapsed: 156.6097002029419 seconds

import time
import numpy as np
from pulser import Register, Pulse, Sequence, waveforms, devices
from pulser_simulation import Simulation, SimConfig

spam = True  # SPAM noise enabled for these timings

reg = Register.square(side=2)
pulse = Pulse.ConstantDetuning(
    waveforms.BlackmanWaveform(duration=1000, area=np.pi), 0, 0
)
seq = Sequence(reg, devices.MockDevice)
seq.declare_channel('ch', 'rydberg_global')
seq.add(pulse, 'ch')
sim = Simulation(seq)

def run(r):
    noise = ('SPAM', 'doppler', 'amplitude') if spam else ('doppler', 'amplitude')
    noise_config = SimConfig(
        noise=noise,
        eta=0.005, epsilon=0.03, epsilon_prime=0.08,
        temperature=30, laser_waist=148,
        runs=10, samples_per_run=r,
    )
    sim.set_config(noise_config)
    t = time.time()
    sim.run(False)
    tt = time.time() - t
    print(f'Samples_per_run: {r}\tElapsed: {tt} seconds')

for x in [1, 10, 100, 500, 1000, 2000, 5000, 10000]:
    run(x)

@HGSilveri

OK, here is the alternative that I think is worth investigating: when CoherentResults has measurement error information, CoherentResults._calc_pseudo_density() generates a pseudo-density matrix (diagonal only) where the measurement errors are included. This means that if the _weights used in Result.get_samples() came from the diagonal of this pseudo-density matrix instead, there would be no need to post-process the samples to include the bitflips.

The potential downside of this approach is that calculating the density matrix is expensive, as you go from a state of size N to a matrix of size N**2, so I'm not sure it will be beneficial. I have a feeling this is the conclusion I arrived at a while ago, but I can't remember with certainty anymore. On the other hand, I suspect that calculating the full pseudo-density matrix might not even be necessary, as we are only using the N elements on its diagonal.
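
For reference, this is roughly what a diagonal-only computation could look like (my own sketch, with assumed eps/eps_p conventions and basis ordering; not existing Pulser code). The measured distribution is the ideal one multiplied by the Kronecker product of single-qubit confusion matrices, which can be applied qubit by qubit in O(n * 2**n) without ever forming the N**2 matrix:

    import numpy as np

    def measured_probabilities(p: np.ndarray, n_qubits: int,
                               eps: float, eps_p: float) -> np.ndarray:
        """Map ideal basis-state probabilities p (length 2**n_qubits) to the
        distribution measured under independent per-qubit detection errors."""
        # Confusion matrix C[measured, true]: eps = P(1|0), eps_p = P(0|1)
        C = np.array([[1 - eps, eps_p],
                      [eps, 1 - eps_p]])
        q = p.astype(float)
        for k in range(n_qubits):  # apply C to qubit k only, via reshaping
            q = q.reshape(2**k, 2, 2**(n_qubits - k - 1))
            q = np.einsum("mt,atb->amb", C, q).reshape(-1)
        return q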

@HGSilveri HGSilveri commented May 31, 2023

Let's take this step by step:

  1. Isolate the contents of Result.get_samples() in a staticmethod so we can use it:

    def get_samples(self, n_samples: int) -> Counter[str]:
        """Takes multiple samples from the sampling distribution.

        Args:
            n_samples: Number of samples to return.

        Returns:
            Samples of bitstrings corresponding to measured quantum states.
        """
        return self.get_samples_from_weights(n_samples, self._weights(), self._size)

    @staticmethod
    def get_samples_from_weights(
        n_samples: int, weights: np.ndarray, size: int
    ) -> Counter[str]:
        """Samples bitstrings from an arbitrary weight distribution."""
        dist = np.random.multinomial(n_samples, weights)
        return Counter(
            {
                np.binary_repr(i, size): dist[i]
                for i in np.nonzero(dist)[0]
            }
        )

  2. Modify CoherentResults.sample_state() to use the weights from the pseudo-density matrix when there are measurement errors:

        ...
        if eps_p == 0.0 and eps == 0.0:
            return sampled_state
        t_index = self._get_index_from_time(t, t_tol)
        custom_weights = np.abs(self._calc_pseudo_density(t_index).diag())
        return Result.get_samples_from_weights(n_samples, custom_weights, self._size)

  3. Compare the time it takes to run this solution with your current proposal. Try different values of samples_per_run and also different numbers of atoms (by changing the size of the register).

My guess is that there will be regimes where this performs better (i.e., a large number of samples_per_run and/or a low number of atoms) and others where it will perform worse due to the cost of calculating the pseudo-density matrices. I think avoiding the pseudo-density matrix calculation altogether is not straightforward at all, so let's drop that and stick to this.

PS: It's very likely that there are bugs in the code above; I just wanted to give some pointers, so feel free to adjust it as you see fit.

@dakk dakk commented May 31, 2023

> Let's take this step by step: [...]

I integrated your code into Pulser, but it takes way more time to run simulations; by "way more" I mean several minutes instead of seconds, even with the smallest samples_per_run. Maybe I'm missing something.

dakk@8c5f519

@HGSilveri

Your implementation seems correct; it's very likely that _calc_pseudo_density() is simply prohibitively expensive. Let's drop that idea and go with your solution. Could you also post the elapsed times you were getting before your changes (for comparison's sake)?

@dakk dakk commented May 31, 2023

> Your implementation seems correct; it's very likely that _calc_pseudo_density() is simply prohibitively expensive. Let's drop that idea and go with your solution. Could you also post the elapsed times you were getting before your changes (for comparison's sake)?

Sure; this is before:

Samples_per_run: 1	Elapsed: 2.9893434047698975 seconds
Samples_per_run: 10	Elapsed: 3.8784408569335938 seconds
Samples_per_run: 100	Elapsed: 8.337844133377075 seconds
Samples_per_run: 500	Elapsed: 23.297382354736328 seconds
Samples_per_run: 1000	Elapsed: 44.66521739959717 seconds
Samples_per_run: 2000	Elapsed: 80.20746278762817 seconds
Samples_per_run: 5000	Elapsed: 177.2137794494629 seconds
Samples_per_run: 10000	Elapsed: 367.4108419418335 seconds

and after:

Samples_per_run: 1	Elapsed: 3.863002061843872 seconds
Samples_per_run: 10	Elapsed: 4.330973863601685 seconds
Samples_per_run: 100	Elapsed: 6.834812164306641 seconds
Samples_per_run: 500	Elapsed: 14.479618549346924 seconds
Samples_per_run: 1000	Elapsed: 23.99378752708435 seconds
Samples_per_run: 2000	Elapsed: 42.217456579208374 seconds
Samples_per_run: 5000	Elapsed: 100.99924063682556 seconds
Samples_per_run: 10000	Elapsed: 203.1134431362152 seconds


@a-corni a-corni left a comment

Hey @dakk!
This is already a nice acceleration!
I have another idea: vectorizing the code.
I'll let you have a look at it; I hope it will be understandable 😄
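
For concreteness, a rough sketch of what the vectorized version could look like (a hypothetical helper, not the code from my review comment): expand the Counter into a single 0/1 matrix of shots, flip all bits in one batched draw, then re-count with np.unique.

    import numpy as np
    from collections import Counter

    def apply_detection_errors_vectorized(samples: Counter, eps: float,
                                          eps_p: float) -> Counter:
        """Flip the bits of every shot in one batched NumPy operation."""
        rng = np.random.default_rng()
        keys = list(samples)
        counts = np.array([samples[k] for k in keys])
        bits = np.array([list(map(int, k)) for k in keys])  # (n_keys, n_qubits)
        shots = np.repeat(bits, counts, axis=0)             # (total_shots, n_qubits)
        p_flip = np.where(shots == 0, eps, eps_p)           # per-bit flip probability
        shots ^= (rng.random(shots.shape) < p_flip).astype(shots.dtype)
        uniq, cnt = np.unique(shots, axis=0, return_counts=True)
        return Counter({"".join(map(str, row)): int(c) for row, c in zip(uniq, cnt)})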

(review thread on pulser-simulation/pulser_simulation/simresults.py; outdated, resolved)

@HGSilveri HGSilveri left a comment

Very nice, thank you for your work!

@dakk dakk commented Jun 1, 2023

> Very nice, thank you for your work!

It was a pleasure!

@HGSilveri HGSilveri merged commit 742c534 into pasqal-io:develop Jun 2, 2023
@dakk dakk deleted the spam_overhead branch June 2, 2023 11:45
Successfully merging this pull request may close these issues.

SPAM errors introduce large simulation overhead
3 participants