Dispatch/user distribution calculation using Kullback-Leibler divergence. Allow float weights. #2686

Merged (12 commits) on May 2, 2024


tdadela (Contributor) commented Apr 21, 2024

  • simplified the code
  • added/fixed support for float weights
  • yielding the next value is now O(log n), instead of Θ(n) with roundrobin.smooth
  • one fewer dependency
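
As a rough illustration of the idea in the list above, here is a minimal sketch of a weighted generator driven by Kullback-Leibler divergence. It is a sketch under assumptions, not necessarily the code merged in this PR: the name kl_weighted_cycle, the tie-breaker index, and the heap layout are hypothetical. Each heap key is the item-dependent part of the change in divergence from picking that item once more, so the heap root is always the best next pick and each draw costs one heapreplace, i.e. O(log n).

```python
import heapq
from collections import Counter
from itertools import islice
from math import log
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def kl_weighted_cycle(items: Iterable[Tuple[T, float]]) -> Iterator[T]:
    """Yield items forever, in proportion to their (possibly float) weights."""
    # Heap entry: (key, tie_breaker, x, weight, item), where x starts at the
    # weight and grows by 1 on every pick. key = weight * log(x / (x + 1)) is
    # the item-dependent part of the divergence change from picking it again.
    heap = [
        (w * log(w / (w + 1.0)), i, float(w), float(w), item)
        for i, (item, w) in enumerate(items)
        if w > 0
    ]
    if not heap:
        return  # nothing to yield when no item has a positive weight
    heapq.heapify(heap)
    while True:
        _, i, x, w, item = heap[0]  # the smallest key is the best next pick
        yield item
        x += 1.0
        # Re-key the picked item and restore the heap in O(log n).
        heapq.heapreplace(heap, (w * log(x / (x + 1.0)), i, x, w, item))


if __name__ == "__main__":
    draws = Counter(islice(kl_weighted_cycle([("A", 0.5), ("B", 1.5)]), 400))
    print(draws)  # counts should be close to a 1:3 ratio
```

The unique index in each tuple means ties never fall through to comparing the items themselves, so the items do not need to be orderable.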

cyberw (Collaborator) commented Apr 23, 2024

Btw, did you run the dispatch performance tests?

tdadela (Contributor, Author) commented Apr 26, 2024

I reduced the test matrix to:

    worker_count_cases = [10, 100, 1000, 5_000]
    user_count_cases = [100, 1000, 10_000, 20_000]
    number_of_user_classes_cases = [1, 10, 50, 100]
    spawn_rate_cases = [100, 1000, 5000, 20_000]

and removed the I/O operations (print, file.write).
I ran the benchmark with: time python3.12 benchmarks/dispatch.py
Results:

  • old implementation:
    12.38s user 3.18s system 99% cpu 15.594 total
    12.40s user 2.99s system 99% cpu 15.420 total
    12.58s user 3.27s system 99% cpu 15.886 total
    12.11s user 3.11s system 99% cpu 15.253 total
    12.04s user 3.01s system 99% cpu 15.086 total
  • new implementation:
    11.07s user 2.61s system 99% cpu 13.713 total
    11.35s user 3.04s system 99% cpu 14.414 total
    11.17s user 3.05s system 99% cpu 14.261 total
    10.94s user 3.07s system 99% cpu 14.043 total
    11.29s user 2.64s system 99% cpu 13.964 total
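
To put the two sets of runs side by side, the "total" wall-clock figures listed above can be averaged. The snippet below is only a convenience sketch over those ten numbers; the variable names are made up for illustration. It works out to a reduction of roughly 9% in mean total time.

```python
from statistics import mean

# "total" wall-clock seconds copied from the five runs of each implementation
old_totals = [15.594, 15.420, 15.886, 15.253, 15.086]
new_totals = [13.713, 14.414, 14.261, 14.043, 13.964]

old_mean = mean(old_totals)
new_mean = mean(new_totals)
reduction = 1 - new_mean / old_mean  # fraction of wall time saved

print(f"old mean: {old_mean:.2f} s, new mean: {new_mean:.2f} s")
print(f"reduction in total wall time: {reduction:.1%}")
```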

cyberw (Collaborator) commented Apr 26, 2024

Nice. Not a huge improvement, but an improvement nonetheless.

tdadela (Contributor, Author) commented Apr 27, 2024

I reran the same benchmark with this dummy implementation:

cycle_fixed_gen = itertools.cycle([u.__name__ for u in fixed_users.values()])
cycle_weighted_gen = itertools.cycle([u.__name__ for u in self._user_classes if not u.fixed_count])

All results were above 11 s, so most of the time is spent in other parts of the code.
Mind you, for certain inputs the old implementation may be faster, mostly due to the "caching" on line 412:

return itertools.cycle(gen() for _ in range(generation_length_to_get_proper_distribution))

I ran the following test a few times:

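import time

# _kl_generator and USER_CLASSES are assumed to be in scope already.

if __name__ == "__main__":

    input_data = [(u, u.weight) for u in USER_CLASSES]
    ts = time.perf_counter()
    gen = _kl_generator(input_data)
    # Draw one million values; the timing covers generator creation too.
    for _ in range(1_000_000):
        next(gen)
    elapsed_s = time.perf_counter() - ts
    print(elapsed_s * 1000)  # elapsed time in milliseconds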

(USER_CLASSES is exactly the same list of users as in benchmarks/dispatch.py)
On my hardware, it consistently takes less than 600 ms.
The equivalent version using roundrobin.smooth takes more than 9000 ms.
So I think we can be confident that this change will not cause a performance regression.
Furthermore, I believe the performance gains matter less than the code simplification, the removal of the weird normalization that led to issues (e.g. #2662), and float weight support.
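
For contrast, the Θ(n)-per-value cost attributed above to roundrobin.smooth can be pictured with a generic smooth weighted round-robin, sketched below. This is the textbook scheme (as popularized by nginx), written from scratch. It is an assumption that the roundrobin library works this way internally, and smooth_weighted_round_robin is a hypothetical name. Every draw touches all n entries, which is where the linear cost comes from.

```python
from itertools import islice
from typing import Dict, Hashable, Iterable, Iterator, Tuple


def smooth_weighted_round_robin(
    items: Iterable[Tuple[Hashable, int]],
) -> Iterator[Hashable]:
    """Generic smooth weighted round-robin; each draw is linear in len(items)."""
    weights: Dict[Hashable, int] = dict(items)
    if not weights:
        return  # nothing to yield for an empty input
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        # Step 1: raise every running score by its weight (touches all n items).
        for name, weight in weights.items():
            current[name] += weight
        # Step 2: pick the highest score (another pass over all n items).
        chosen = max(current, key=current.get)
        # Step 3: penalize the winner so the others catch up.
        current[chosen] -= total
        yield chosen


if __name__ == "__main__":
    gen = smooth_weighted_round_robin([("A", 5), ("B", 1), ("C", 1)])
    print(list(islice(gen, 7)))
```

With integer weights the scores return to zero after sum(weights) draws, so the output is periodic, which is what makes precomputing one period and cycling over it possible.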

tdadela marked this pull request as ready for review April 27, 2024 20:46

cyberw (Collaborator) commented Apr 27, 2024

Great stuff! I’ll use it myself a couple of times next week, and if I don’t find any issues I’ll merge.

cyberw (Collaborator) commented Apr 28, 2024

I noticed your branch is based on a pretty old commit. Can you rebase on latest master?

cyberw merged commit 34b0b81 into locustio:master May 2, 2024
14 checks passed

cyberw (Collaborator) commented May 2, 2024

Thanks!

cyberw changed the title from "refactor: Kullback-Leibler user generator" to "Dispatch/user distribution calculation using Kullback-Leibler divergence" on May 3, 2024
cyberw changed the title from "Dispatch/user distribution calculation using Kullback-Leibler divergence" to "Dispatch/user distribution calculation using Kullback-Leibler divergence. Allow float weights." on May 13, 2024
2 participants