
before_scenario and parallel options #437

Open
cheerfulstoic opened this issue Nov 19, 2024 · 5 comments

@cheerfulstoic commented Nov 19, 2024

Is it by design that before_scenario runs multiple times if parallel is greater than 1? I'm trying to set up a single process which can be accessed by the scenarios running in parallel, so that I can test how processes scale under load. If this is by design I'll find a way to work around it, but it seems like maybe before_scenario should just run once for each scenario, no matter the value of parallel 🤔
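
For reference, a minimal sketch of the kind of setup in question (job and option values purely illustrative) - with parallel: 2, the hook below appears to run once per parallel process rather than once per scenario:

Benchee.run(
  %{"noop" => fn _input -> :ok end},
  parallel: 2,
  time: 1,
  before_scenario: fn input ->
    # prints twice per scenario when parallel: 2 - once per parallel process
    IO.puts("before_scenario running in #{inspect(self())}")
    input
  end
)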

@PragTob (Member) commented Nov 20, 2024

👋

Hello there - thanks for opening an issue! 💚

I think I gotta think on that a bit more. My first reaction was "yeah", then I was like "I'm not sure", and then, reading through the use cases in the docs, I found this one:

Recording the PID of self() for use in your benchmark (each scenario is executed in its own process, so scenario PIDs aren't available in functions running before the suite)

And at least that one would need to execute once for each parallel process.
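
Roughly what that use case looks like - a sketch, assuming the documented behaviour that the hook runs in the same process as the scenario it precedes:

Benchee.run(
  %{
    # each parallel copy runs in its own process, so the PID captured in
    # before_scenario has to be captured once per copy to be the right one
    "send to self" => fn {input, scenario_pid} -> send(scenario_pid, {:tick, input}) end
  },
  inputs: %{"one" => 1},
  parallel: 2,
  before_scenario: fn input -> {input, self()} end
)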

However, quite frankly, parallel is something I sometimes forget about in design as it is rarely used 😅 I'd agree that intuitively it'd make sense to just run once per scenario. Reading through the docs, the interplay with parallel isn't even mentioned once, soooo chances are this isn't intended / I might have forgotten.

Smells like we might need another hook type, but boi would I not be looking forward to that (I wrote the hooks many years ago during a longer vacation, and the complexity they bring is big compared to how often I think they're used).

Can you tell me what your use case is (broadly speaking)? I'd like to think through it - plus it's a feature I think is rarely used so reinforcement for its usefulness is appreciated 😁

@cheerfulstoic (Author)

🎉 Thanks!

So, I'm testing out different modules that all implement the same behaviour. Each module starts up a process or a supervision tree of processes to do basically the same job. I want to compare the performance of individual calls, but also see what happens as the processes' queues get more and more loaded.

Here's the sanitized version of my benchee script:

definitions = [ ... ]
inputs =
  Map.new(definitions, fn definition ->
    {
      "#{definition[:module_under_test]}: #{inspect(definition[:module_under_test_opts])}",
      definition
    }
  end)

Benchee.run(
  %{
    "foo" => fn {%{module_under_test: module_under_test}, %{pid: _pid, users: users}} ->
      user = Enum.random(users)

      module_under_test.foo(user.id)
    end,
    "update" => fn {%{module_under_test: module_under_test}, %{pid: _pid, users: users}} ->
      user = Enum.random(users)

      module_under_test.bar(user.id, %{attr1: "Biz Buzz #{:rand.uniform(5000)}"})
    end
  },
  warmup: 2,
  time: 5,
  inputs: inputs,
  parallel: 2,
  before_scenario: fn %{
                        module_under_test: module_under_test,
                        module_under_test_opts: module_under_test_opts
                      } = input ->
    {:ok, pid} = module_under_test.start_link(module_under_test_opts)

    Process.unlink(pid)

    users =
      Enum.map(0..20, fn i ->
        {:ok, user} =
          module_under_test.create(%{name: "User #{i}", email: "user#{i}@example.com"})

        user
      end)

    {input, %{users: users, pid: pid}}
  end,
  after_scenario: fn {_input, %{pid: pid}} ->
    Process.exit(pid, :kill)
    Process.sleep(500)
  end,
  formatters: [
    {Benchee.Formatters.HTML, file: "benchee_output.html"},
    {Benchee.Formatters.CSV, file: "benchee_output.csv"},
    Benchee.Formatters.Console
  ]
)

The Process.unlink is there because I was too lazy to write a start function instead of start_link, so that the script process doesn't die when the processes for the modules under test are killed.
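
For reference, the alternative alluded to would be a start function that doesn't link - a minimal sketch, assuming the modules under test are GenServer-based:

defmodule ModuleUnderTest do
  use GenServer

  # Hypothetical non-linking start/1: the benchmark script is not linked to
  # the process under test, so Process.exit(pid, :kill) in after_scenario
  # won't take the script down with it, and Process.unlink becomes unnecessary.
  def start(opts), do: GenServer.start(__MODULE__, opts)

  @impl true
  def init(opts), do: {:ok, opts}
end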

I could use something like Task.async_stream(1..1_000, ...) to do the parallel bits 🤷 I guess I would need to experiment with the number of iterations so that I get a good comparison of IPS for each scenario, but not make it so small that the winding down of the parallel jobs skews the results (e.g. if I'm running async_stream with max_concurrency: 4, then at the end of each run there's a short stretch where only the last three tasks are still being performed, so it's not running at full concurrency... 🤷)
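
A possible shape of that alternative - just a sketch; module_under_test and users are assumed to come from the setup above, and the iteration count and max_concurrency are placeholders to tune:

run_under_load = fn module_under_test, users ->
  1..1_000
  |> Task.async_stream(
    fn _i -> module_under_test.foo(Enum.random(users).id) end,
    max_concurrency: 4,
    ordered: false,
    timeout: :infinity
  )
  |> Stream.run()
end

# time one full run in microseconds
{time_us, :ok} = :timer.tc(fn -> run_under_load.(module_under_test, users) end)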

All of that is also making me wonder: maybe this isn't a job for benchee? benchee is the tool I reach for whenever I'm benchmarking, but maybe I should just be doing different runs, seeing how long they take in total at different levels of parallelism, and maybe adding some telemetry to record things like the queue length to see if things are getting backed up 🤔

@PragTob (Member) commented Nov 20, 2024

Thanks for sharing!

I mean, the reason parallel exists in the first place is so that we can stress-test certain code - so I think this should be fine to do via benchee.

A workaround that might work for you right now would be to give each module_under_test its own Benchee.run invocation, and then save the results and load them for comparison. The before/after would then just live around the Benchee.run invocations. Kinda like:

Enum.each(modules, fn module_under_test ->
  former_before_scenario()

  Benchee.run(
    %{
      # jobs using module_under_test
    },
    save: [path: "folder/cool_bench#{module_under_test}.benchee"]
  )
end)

Benchee.report(load: "folder/cool_bench*.benchee")

(written from memory, untested)
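
For concreteness, a fuller (equally untested) sketch of that workaround, reusing the setup/teardown from the script above together with Benchee's save/load options:

Enum.each(definitions, fn %{module_under_test: module_under_test} = definition ->
  # former before_scenario: start the process tree under test and seed users
  {:ok, pid} = module_under_test.start_link(definition[:module_under_test_opts])
  Process.unlink(pid)

  users =
    Enum.map(0..20, fn i ->
      {:ok, user} = module_under_test.create(%{name: "User #{i}", email: "user#{i}@example.com"})
      user
    end)

  Benchee.run(
    %{"foo" => fn -> module_under_test.foo(Enum.random(users).id) end},
    warmup: 2,
    time: 5,
    parallel: 2,
    save: [path: "folder/cool_bench#{module_under_test}.benchee"]
  )

  # former after_scenario
  Process.exit(pid, :kill)
  Process.sleep(500)
end)

# compare all saved runs in a single report
Benchee.report(load: "folder/cool_bench*.benchee")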


I asked @pablocostass quickly, and with some time (this is a longer thing) I'll review the wording. But most likely I'd change it so that before_scenario does what you thought it'd do, and introduce a new thing... that I have yet to name... which runs before every parallel execution of the benchmark (what before_scenario does right now). But that's more involved and needs some time; not sure when I'd get to it.

That said, the initial changes to "fix" before_scenario should be rather easy and should (probably) all be contained in the following code - moving before_scenario/after_scenario around in here:

https://github.com/bencheeorg/benchee/blob/main/lib/benchee/benchmark/runner.ex#L81-L119

@cheerfulstoic (Author)

Thanks very much! I've been using your workaround and I've got a version that's been working fine. Now I have the harder task of interpreting my results 😅

Thanks again!

@PragTob (Member) commented Nov 29, 2024

Good luck interpreting those results, always fun. When in doubt, MOAR BENCHMARKS!

