-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project benchmarks #312
Project benchmarks #312
Conversation
Hey @arikai, thanks for putting this together! We've had discussions about this kind of thing before, like here: #236 One of the things that I think Tobi and I both really love is that Benchee's API is "just a function." This both makes it really easy for folks to use, and makes interop with Erlang much easier (which is something we both care about as well). We do the same kind of thing you're talking about in this library already, but we do it with a bash script instead of a mix task. If what you're looking for is a way to run all your benchmarks with one command, I'd recommend going with a bash script (or a Makefile!). Let's see what @PragTob says about this, though, before making any decision. I'm ok with having this if he is, but my initial feeling is that it's a lot of rather tricky code to maintain and it's only purpose is to replace the need for a bash script. Of course, if you want to build something on top of benchee that offers this sort of DSL, you're welcome to do so! And I love that you would want this in Elixir core, but they've made it really clear that the only thing going in Elixir core is what's needed to develop the language itself, and this doesn't fall into that category, so I kind of agree with them on keeping this as a separate package. |
For note, I already have a |
From what I've read on the issue linked above, you do approve this idea in general, but have concerns on the maintenance and usecases of this addition. So here are my two cents. 1. Why not shell scripts?Of course you can do this with shell scripts. Hell, what can't you do with them? However, there are cons. Bash is basically restricted to commands provided by mix, its plugin tasks, and some one-liners for iex. Something more complex - write .exs. But why do we need to use 2 languages to just use convenient devtools? Why do we need to reimplement it over and over again (or copy) for this functionality? I believe, this is exactly the reason dev tools exist and mix in particular, and just prettty much libraries and frameworks in general: to encapsulate a pattern used over and over again. And to use bash in projects like this seems veeery hacky and not convenient at all. Bash is more like "use available tools for your particular case", not "add functionality to tools". And project-wide benchmarks is a very common pattern. 2. Maintaining second API seems like another burden.But please look closely at what this DSL and task do. But what run* functions actually do? They simply call Benchee.run with options defined with the DSL and options passed with the task (although there is none for now). And options defined with the DSL are exactly the same you would usually pass to Benchee.run! So when we talk about maintaining this "second API" we talk about just adding some switches for OptionParser in task module. That's it. 3. What are the pros of the DSL to simple API of one function?Let's talk about tests. It's surely possible to make them as simple .exs scripts. It's surely possible to run them via bash script. But why? Is it easier to use? No. Is it cleaner to use? No. Is it easier to write them? Heck no, DSL looks way cleaner and easier to read. Tests are easier to change and maintain. DSL is usually closer to the natural language (although, more domain specific) than tests written in two programming languages, one of them being some quirky shell script. I do not deny that single-function-API is way too easier to use for one-shot benchmarks. But when we're talking about libraries changes in which may degrade performance, and, what's by orders of magnitude worse, may cascade into projects that use them, causing to degrade them as well. Still waiting for the @PragTob 's thoughts about this. |
@OvermindDL1 Sure, especially if you have some interesting ideas and features that I missed to implement. |
I still can't get around this strange error from build with 1.6.6 + OTP21 (second build) that I can't even reproduce on my machine, so help needed. |
Mine is very specific to my sandbox project, but it's an easy start for something bigger. First my addons to defp aliases do
[
bench: &bench/1
]
end
defp bench(arglist) do
Mix.Tasks.Run.run(["bench/bench.exs" | arglist])
end Then System.argv()
|> Enum.each(fn bench ->
try do
Code.require_file("bench/#{bench}_bench.exs")
mod = String.to_atom("Elixir.#{Macro.camelize(bench)}Bench")
if :erlang.function_exported(mod, :module_info, 0) do
if(:erlang.function_exported(mod, :classifiers, 0), do: mod.classifiers(), else: [nil])
|> case do
nil -> [nil]
[] -> [nil]
clas when is_list(clas) -> clas
end
|> Enum.each(fn cla ->
if cla do
title = "Benchmarking Classifier: #{cla}"
sep = String.duplicate("=", String.length(title))
IO.puts("\n#{title}\n#{sep}\n")
end
setup = if(:erlang.function_exported(mod, :setup, 1), do: mod.setup(cla), else: nil)
m =
cond do
:erlang.function_exported(mod, :time, 2) -> mod.time(cla, setup)
:erlang.function_exported(mod, :time, 1) -> mod.time(cla)
true -> 5
end
inputs =
cond do
:erlang.function_exported(mod, :inputs, 2) -> mod.inputs(cla, setup)
:erlang.function_exported(mod, :inputs, 1) -> mod.inputs(cla)
true -> nil
end
actions =
cond do
:erlang.function_exported(mod, :actions, 2) -> mod.actions(cla, setup)
true -> mod.actions(cla)
end
Benchee.run(
actions,
[
time: m,
warmup: m,
memory_time: m,
print: %{fast_warning: false}
] ++ if(inputs, do: [inputs: inputs], else: [])
)
if(:erlang.function_exported(mod, :teardown, 2), do: mod.teardown(cla, setup))
end)
end
rescue
r -> IO.puts("Unknown exception: #{Exception.format(:error, r, __STACKTRACE__)}")
catch
{type, reason} when type in [:throw, :exit] -> IO.puts("Unknown error: #{Exception.format(type, reason, __STACKTRACE__)}")
e -> IO.puts("Unknown error: #{Exception.format(:throw, e, __STACKTRACE__)}")
end
end) And basically it can run bench benchmark modules (the naming pattern should probably also be changed, as stated this is my naming convention and is not at all generic). It can also run with a variety of input types as well with each being it's own benchee run, among other features. But for example here is my most recent benchmark module, it is not using all of the features even remotely, and I could leave out some of the no-op functions, but I like to keep them in for documentation: defmodule MapLookupOrThrowBench do
def classifiers(), do: nil
def time(_), do: 2
def inputs(_cla),
do: %{
"small-2-existing" => {2, 1..2 |> Map.new(&{&1, to_string(&1)})},
"medium-10-existing" => {10, 1..10 |> Map.new(&{&1, to_string(&1)})},
"large-10000-existing" => {10000, 1..10000 |> Map.new(&{&1, to_string(&1)})},
"small-1-not-existing" => {2, 1..1 |> Map.new(&{&1, to_string(&1)})},
"medium-9-not-existing" => {10, 1..9 |> Map.new(&{&1, to_string(&1)})},
"large-9999-not-existing" => {10000, 1..9999 |> Map.new(&{&1, to_string(&1)})},
}
def setup(_cla), do: nil
def teardown(_cla, _setup), do: nil
def actions(_cla, _setup),
do: %{
"Map.fetch" => fn {k, m} ->
Map.fetch(m, k)
end,
"catch-Map.fetch!" => fn {k, m} ->
try do Map.fetch!(m, k)
rescue _ -> :error
catch _ -> :error
end
end,
"Access" => fn {k, m} ->
m[k]
end,
"catch-direct-always-fail" => fn {_k, m} ->
try do m.a
rescue _ -> :error
catch _ -> :error
end
end,
"case" => fn {k, m} ->
case m do
%{^k => v} -> v
_ -> :error
end
end,
}
end Honestly, for a benchmark scaffolding thing I'd look more into how Rust's Criterion API does it, it's very clean and nice, especially with a few macro's tossed in. |
👋 Hi everyone and thanks for your contributions first and foremost 🎉 🙏 I mostly agree with @devonestes - I love it that benchee uses simple functions to do its job. I don't want to use macros where functions can do the job. That there's no default way to run all benchmarks I can see, however for that I'd rather suggest or include a mix task or an easy bash script. I also don't see that the requirement to run all of them nicely is enough to introduce a DSL. However, I see and know that people want a DSL (although I still don't really understand why - I don't believe that taking benchmarks more seriously is bound to there being a DSL, if you care you run them and if not you don't), and we build tools and I believe that they should also cater to the people who use them. And if a DSL makes people happier/somewhat likelier to benchmark I'm happy to provide one (at a certain cost). My opinion/solution for this is to not include it in "core" but as a separate package (initial name idea: In that project I'd have a few remarks though (likely in the README):
The last one is important to me, it's sometimes already not so great to upgrade all plugins. It's also another reason why I don't want this in "core" because in "core" everything should always be up to date and fully working. The DSL itself is remarkably similar to what I had in mind looking at it 👌 (mostly the benchmark <--> scenario relationship and how hooks are incorporated) I have some minor points but we can discuss those another time. I haven't looked at the implementation yet, but it'd be important to me that it's easy to add new options etc (highest goal: if we add a new option to benchee I don't even need to touch it, it just works). How does that sound for everyone? (gotta run now, sorry, here have a bunny) |
@PragTob Then I am really looking forward for you to check the implementation. And, as I already said, it's not some all-new custom API that changes the way Benchee works, the Benchee as it exists (single-function-API) is barely touched (on option is divided in two, you can barely call this a change). It is simply a wrapper around Benchee.run/2 with custom interface for passing options and benchmarks itself. As for the
It's easy to implement really. Even though I'd advocate for just a subset of them (for starters, I can imagine NO case where one might want to pass before_scenario callback as a COMMAND-LINE OPTION) |
@arikai I didn't mean command line options (I haven't checked the PR yet) but more about the DSL in the example file. Like how you specify I'll see when I'll have time to review it that week, due to my opinion that macros should only be used when functions absolutely don't work my macro isn't as fluent as it should be 😅 |
@PragTob
And basically that's it. Surely it wraps some blocks into functions and what else, but the options are exactly the same you are to support for Benchee.run options and per-scenario options. About your assumption
It's true for languages with rather simple macro preprocessing system (think C family, Erlang), but Elixir has more of a Lisp-like macro. It means that macro return somewhat of Elixir Intermidiate Representation that it works with during the actual compilation of the files. Please check the implementation as soon as you can. |
@arikai my main point wasn't that it should be the same options (that's a given) but that when I add new one to benchee I don't need to add it manually. So kinda that it builds the available DSL methods automatically from the available keys in So that when I introduce a new key in benchee like |
Please refer to my previous answer
The only thing is that it's just a parameter to a macro, not a function: exactly the same way Benchee.run/2 accepts options as an optional second argument. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey,
thanks for all your work. I'd have preferred it though if we talked about the idea and the DSL design before we got in here.
Not sure where to best continue this conversation as I still don't want to have it in core but rather a benchee_dsl
project that still needs to be setup :) That also has the advantage that not everything has to work/be ready for release from the get go.
There are lots of linline comments, but the biggest one is probably the options in the DSL. I'd rather have them all in the same place:
benchmark "...", option1: value1, option2: value2 do
end
# or
benchmark "..." do
option1 value1
option2 value2
# ...
end
Right now input breaks from the all in the option thing.
Implementation wise I'd like to split it up a bit more to help with readability and maintainability.
Thanks!
Tobi
# | ||
# If defined, must return {:ok, state} for benchmarks to run | ||
setup do | ||
{:ok, nil} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need {:ok, state}
? I know ExUnit does it, but why can't we just return state
? If something goes wrong it can just raise
or what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's rather a matter of convention. raise
means some serious runtime error that should have never happened if the code were write.
Surely, we can, just throw
things, but it will be more implicit I think: if you just look on the result type of the setup, you already know what it should return to work correctly. To tell programmers to just throw their error would be less obvious IMO than that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just never quite understood it. In what scenario will my setup not work? Usually the code I call will already raise at some point. It might just be me, I've never seen setup
implementation that returns anything but {:ok, state}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the code that returns ok/error pair usually should never throw (although might raise), at least with any other reason then MatchError, but that's on the caller side, so not function's problem, but programmer that's calling it :)
Surely it might be easier to write happy-path-code all the way, just a matter of taste I guess.
Share your final thoughts on this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with it for ExUnits sake 🤷♂️
|> Path.join() | ||
|> Path.wildcard() | ||
else | ||
[file] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just file
should be enough here shouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function must return list of files found for the further use in iterator functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure? It's being flat_maped, so the wrapped list should be removed anyhow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, not sure if it was a question, but here's a piece of iex session:
iex(1)> [:a, :b, :c] |> Enum.flat_map(&[&1])
[:a, :b, :c]
iex(2)> [:a, :b, :c] |> Enum.flat_map(& &1)
** (Protocol.UndefinedError) protocol Enumerable not implemented for :a of type Atom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I meant more because we return [file]
here and alter flat_map
why can't we just return file
and then map
instead of flat_map
?
@arikai where do we want to go from here, shall I set up a new benchee_dsl repo for us to put this in and then PR against this one? PR the printing of system against this one? Thanks! |
@PragTob About separate repo: I do believe that these two rather small modules belong to Benchee, since there are no drawback for such decision. I would've considered separate repo if the project were to be included into e.g. final Please do inform me if there are any drawbacks. |
@arikai right back at you :) Sorry, I was busy with different stuff as well. Thanks for all the work 💚 There are some errors here, but I believe they can be fixed/figured out. Some of them seem to be formatter incompatibilities between versions which we could fix, along with some dialyzer warnings that also seem questionable. There might be no observable code/artifact drawbacks for such a decision, but there are others:
Anyhow, so long 👋 |
Just to chime in here - I agree with all of the points @PragTob mentions above, but the primary reason I don’t want to have macros in Benchee is that they can’t be used in Erlang/Gleam/LFE/etc. We’ve tried hard to make sure this library is useful for the entire BEAM community, and unfortunately macros - since they’re specific to Elixir - don’t play nicely with those other languages. I’m all for folks building language-specific libraries on top of the core we provide in Benchee if there are things (such as this DSL) that are possible in one language but not in another, but I would really like to see Benchee itself remain useful to any language on the BEAM and not to have sections of this library that are specific to one language or another. If one day Elixir macros become usable in those other languages, I’d be all for revisiting this, actually, but until that happens I would strongly prefer to see this functionality in a separate package. |
@devonestes ah I actually forgot about that, great point thank you 💚 I'm fine having the extra package in the benchee org and maintaining it. I just won't promise that it'll always support all the newest options/things in benchee "core" :) |
I recon our friends from @hrzndhrn have a dsl package out there for folks. I'll review it and then maybe start a convo with them if they'd want to move it to the main benchee org. |
Sometimes it's necessary for a project to measure it's performance, and however easy to use, Benchee.run is usually a one-shot solution.
I propose ExUnit-like task 'mix benchmark' and DSL to create benchmarks for a project.
These benchmarks are plain .exs Elixir scripts placed in the ./benchmarks directory of a project, and whenever 'mix benchmark' is run, it compiles all the modules within these scripts and runs benchmarks specified within them. It's also possible to pass individual files and directories as arguments to the task to run contained benchmarks, e.g. 'mix benchmark dir1 file1 file2 dir2'.
Changes:
Syntax for DSL is better described with an example. One is contained in the new './example' folder and can be found here: https://github.com/bencheeorg/benchee/blob/94979ee88a784664c2ee3c716e32c0f4fadb06be/example/project_benchmark.exs
Notes:
P.S. Kind of a wish: I would strongly recommend to propose this library as a part of standard Elixir distribution, along with ExUnit, since this functionality might be useful for projects of any kind.