Scale units #26

wasnotrice · 2016-09-15T06:16:17Z

Concept for addressing scaling units, see #2

Add scale_duration (hours, minutes, seconds, milliseconds, microseconds)
Add function to find "best fit" scale for a list of numbers
Integrate into console formatter (moved to Integrate auto-scaled units into console formatter #27)
Add option for long/short labels (M/million) (moved to Add option for displaying long labels for scaled units #28)

@PragTob Not done obviously, but I thought I'd open things up for your initial feedback 😄

PragTob

Really like the changes, basically only one question about unit_label. Otherwise looks great, also like the name Units (first my brain read it as Util and I was gonna complain but read again :D) Thanks a lot 🎉 !

PragTob · 2016-09-15T19:27:47Z

lib/benchee/formatters/units.ex

+  def scale_count(count) when count >= @one_billion,  do: {count / @one_billion, :billion}
+  def scale_count(count) when count >= @one_million,  do: {count / @one_million, :million}
+  def scale_count(count) when count >= @one_thousand, do: {count / @one_thousand, :thousand}
+  def scale_count(count), do: {count, :one}


PragTob · 2016-09-15T19:29:08Z

lib/benchee/formatters/units.ex

+  def scale_count(count) when count >= @one_thousand, do: {count / @one_thousand, :thousand}
+  def scale_count(count), do: {count, :one}
+
+  @spec format_count(number) :: String.t


Ha, was thinking about adding type specs (actually there should be a ticket somewhere...). Wasn't too happy about my first experience with dialyxir and co. If you wanna do some work there, doing it in another PR sounds cool :)

PragTob · 2016-09-15T19:30:57Z

lib/benchee/formatters/units.ex

+  def unit_label(:billion), do: "B"
+  def unit_label(:million), do: "M"
+  def unit_label(:thousand), do: "K"
+  def unit_label(_), do: ""


Hm, my thought would be that this sort of data would better be stored in a map and then maybe retrieved through a function (or inline). What do you think? An upside I'm missing here? :)

@PragTob yeah that makes sense. This implementation is a TDD relic, where I haven't refactored yet :)
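
As a rough sketch of the map-based approach suggested above: the labels live in one map and a single function looks them up. The module name `UnitLabelSketch` and the map layout are assumptions for illustration, not the code from this PR.

```elixir
defmodule UnitLabelSketch do
  # Hypothetical module: names and layout are illustrative only.
  @units %{
    billion:  %{magnitude: 1_000_000_000, short: "B"},
    million:  %{magnitude: 1_000_000,     short: "M"},
    thousand: %{magnitude: 1_000,         short: "K"},
    one:      %{magnitude: 1,             short: ""}
  }

  # Look up the short label for a unit. Unknown units fall back to "",
  # mirroring the catch-all `unit_label(_)` clause in the diff above.
  def unit_label(unit) do
    case Map.fetch(@units, unit) do
      {:ok, %{short: short}} -> short
      :error -> ""
    end
  end
end
```

Keeping the data in one map means adding a unit is a one-line change, and the same map can carry other per-unit fields such as the magnitude.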

PragTob · 2016-09-15T19:32:21Z

test/benchee/formatters/units_test.exs

+
+  test ".format_count(1.234)" do
+    assert format_count(1.234) == "1.23"
+  end


Good tests!

wasnotrice · 2016-09-15T19:45:54Z

@PragTob glad you like the changes—I'll flesh it out a bit

PragTob · 2016-09-15T19:46:18Z

Ah, btw, the function for the option between M/Million can be another ticket and could also wait until someone complains :D IMO it might be best to integrate it into the console formatter now and scale every value to the same unit (best fit). That'd be one pass through the relevant stack and would achieve a user-observable feature. Then scaling time can also be done in a separate PR to keep this one as small as possible. What do you think?

wasnotrice · 2016-09-15T19:49:50Z

Sure, that makes sense. We can wait on the M/Million.

I was thinking about "best fit". I think there are probably 3 strategies for finding the best fit, and my guess is that the "right" one will depend on the data and user preference. Let's play with it when we get that far and see how it feels with the samples.

wasnotrice · 2016-09-16T17:10:23Z

This last batch of commits includes a big refactor to improve the public interface and to reduce internal duplication. Now Benchee.Unit defines a behaviour that is implemented in Benchee.Unit.Count and Benchee.Unit.Duration.

This lets us have two modules with the same interface:

# new
Benchee.Unit.Count.scale(1_000)
Benchee.Unit.Duration.scale(1_000)

instead of one module with two sets of related functions:

# old
Benchee.Unit.scale_count(1_000)
Benchee.Unit.scale_duration(1_000)

I like the way the behaviour turned out, but I'm not totally satisfied with the module naming. Maybe it would be better to just have Benchee.Count and Benchee.Duration.
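
For readers unfamiliar with behaviours, here is a minimal sketch of the shape described above: one module declares the callback, and two modules implement it with the same function name. The `UnitSketch` names, the reduced set of units, and the assumption that durations arrive in microseconds are all illustrative and not taken from the PR.

```elixir
defmodule UnitSketch do
  # The shared interface: every unit module must export scale/1.
  @callback scale(number) :: {number, atom}
end

defmodule UnitSketch.Count do
  @behaviour UnitSketch

  def scale(count) when count >= 1_000_000, do: {count / 1_000_000, :million}
  def scale(count) when count >= 1_000, do: {count / 1_000, :thousand}
  def scale(count), do: {count, :one}
end

defmodule UnitSketch.Duration do
  @behaviour UnitSketch

  # Assumption: the input is a duration in microseconds.
  def scale(duration) when duration >= 1_000_000, do: {duration / 1_000_000, :second}
  def scale(duration) when duration >= 1_000, do: {duration / 1_000, :millisecond}
  def scale(duration), do: {duration, :microsecond}
end
```

With this layout a caller can take the module as an argument and call `module.scale(value)` without caring which unit family it is dealing with.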

Next step will be to integrate the units into the Console formatter 🎈

wasnotrice · 2016-09-16T18:17:22Z

I think this most recent failing check (commit 362d17b) is spurious (or at least unrelated to these code changes). It happened only on Elixir 1.2.6, in one of the console output regexes.

PragTob · 2016-09-17T08:45:11Z

@wasnotrice yeah that one... need to get rid of that one and substitute another example, retry or something. Integration testing a benchmarking library on varying environments is an.. interesting topic :D I'll get to a review a bit later. Thanks a lot!

PragTob

Great stuff! Love the tests and usage of behaviour (I should really do that to formatters...) some questions and little formatting things that need some fixes or some explanation :)

PragTob · 2016-09-17T15:18:57Z

lib/benchee/formatters/unit.ex

+
+  # In 1.3, this could be declared as `keyword`, but use a custom type so it
+  # will also compile in 1.2
+  @type options ::[{atom, atom}]


👍 thanks for the comment, not sure about the support/upgrade policy yet (I really wanna have describe :D) but I guess once there is 1.4 dropping 1.2 becomes an option.

Yeah it was kind of a bummer to realize that I had to undo the describes (I forgot those wouldn't work on 1.2), but there were only 3, and the tests aren't actually that much harder to read.

PragTob · 2016-09-17T15:21:12Z

lib/benchee/formatters/unit.ex

+  def float_precision(float) when float < 0.01, do: 5
+  def float_precision(float) when float < 0.1, do: 4
+  def float_precision(float) when float < 0.2, do: 3
+  def float_precision(_float), do: 2


hm, any reason float_precision is on Unit and not in Common ? :)

Not really...it's imported by the console formatter for now, until I actually use the units code in the formatter, at which point it won't be necessary. But yes, it should go to Common, good catch!
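
To make the role of `float_precision/1` concrete, here is a small sketch of how its result could drive number formatting. The clauses are copied from the diff above; the `FormatSketch` module and its `format/1` function are hypothetical and only handle floats.

```elixir
defmodule FormatSketch do
  # Copied from the diff above: smaller values get more decimal places.
  def float_precision(float) when float < 0.01, do: 5
  def float_precision(float) when float < 0.1, do: 4
  def float_precision(float) when float < 0.2, do: 3
  def float_precision(_float), do: 2

  # Hypothetical helper: render a float with the chosen number of decimals.
  def format(float) when is_float(float) do
    :erlang.float_to_binary(float, decimals: float_precision(float))
  end
end
```

For example, `FormatSketch.format(1.234)` yields `"1.23"`, which matches the `format_count(1.234)` test shown earlier in the thread.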

PragTob · 2016-09-17T15:26:12Z

lib/benchee/formatters/unit.ex

+      |> Enum.reduce(%{}, &totals_by_unit/2)
+      |> Enum.into([])
+      |> Enum.sort(&(sort_by_total_and_magnitude(&1, &2, module)))
+      |> hd


we are sorting just to get the first element, couldn't we use Enum.min_by instead?

I didn't fix this one yet because the sort function with tiebreaker isn't trivial to express in terms of one element, like Enum.min_by wants. It might be worth doing though, as long as the resulting code isn't even more opaque than this :)

If it's too hard, we can always leave it for another day and another PR :)

I refactored this a little bit but still couldn't come up with anything more elegant than that final sort function. At least it's got a better name now 😄
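
One possible way to express "most frequent unit, with a tiebreaker" without a full sort is to give `Enum.max_by/2` a tuple key, since Elixir compares tuples of equal size element by element. This is only a sketch: the `BestUnitSketch` module, the magnitude table, and the choice to break ties in favour of the larger unit are assumptions, not the code that ended up in the PR.

```elixir
defmodule BestUnitSketch do
  # Hypothetical magnitude table, used only for tie-breaking.
  @magnitudes %{billion: 1_000_000_000, million: 1_000_000, thousand: 1_000, one: 1}

  # Takes a list of unit atoms (one per already-scaled value) and returns
  # the unit that occurs most often.
  def best_unit(units) do
    units
    |> Enum.group_by(fn unit -> unit end)
    |> Enum.map(fn {unit, occurrences} -> {unit, length(occurrences)} end)
    # Tuple key: compare by frequency first, then by magnitude on a tie.
    # Assumption: ties go to the larger unit.
    |> Enum.max_by(fn {unit, frequency} -> {frequency, Map.fetch!(@magnitudes, unit)} end)
    |> elem(0)
  end
end
```

Whether this reads better than a named sort function is a matter of taste; the tuple key keeps the tiebreak rule in one place but hides it inside the comparison.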

PragTob · 2016-09-17T15:26:13Z

lib/benchee/formatters/unit.ex

+      case Keyword.get(opts, :strategy, :best) do
+        :best -> best_unit(list, module)
+        :largest -> largest_unit(list, module)
+        :smallest -> smallest_unit(list, module)


code style wise I like to align the arrows (just like in hashes), not mandatory though :)

fixed by 05aa9df

PragTob · 2016-09-17T15:28:28Z

lib/benchee/formatters/unit.ex

+      |> Enum.map(&(scale_unit(&1, module)))
+      |> Enum.sort(&(sort_by_magnitude(&1, &2, module)))
+      |> Enum.reverse
+      |> hd


Fixed by 05aa9df

PragTob · 2016-09-17T15:43:56Z

lib/benchee/formatters/unit/count.ex

+    million:  %{ magnitude: @one_million, short: "M", long: "Million"},
+    thousand: %{ magnitude: @one_thousand, short: "K", long: "Thousand"},
+    one:      %{ magnitude: 1, short: "", long: ""},
+  }


{} placement is a bit inconsistent here, the beginning has a space while the end doesn't. Either way is fine with me :)

Fixed by 05aa9df

PragTob · 2016-09-17T15:45:24Z

lib/benchee/formatters/unit/count.ex

+      {4.32109, :thousand}
+
+      iex> Benchee.Unit.Count.scale(0.0045)
+      {0.0045, :one}


❤️ doctests!

Me too! I was thinking it might make sense to move some other test cases to doctests at some point. What do you think?

I love doctests; however, that's why I also sometimes fear I might be overdoing them. I have entire modules that are just doctested, as I think the edge cases are also interesting for users, and examples are more precise than text.

So, I'm in favor :)

Agree! But maybe in another PR

PragTob · 2016-09-17T15:47:30Z

test/benchee/formatters/unit/count_test.exs

+
+  test ".best when list is mostly thousands, strategy: :largest" do
+    assert best(@list_with_mostly_thousands, strategy: :largest) == :thousand
+  end


Great understandable tests 👍

PragTob · 2016-09-17T15:48:46Z

test/benchee/formatters/unit/count_test.exs

+
+  test ".format 0.001234567 scales to :one" do
+    assert scale(0.001234567) == {0.001234567, :one}
+  end


The test descriptions from here on up are wrong (they say .format, but .scale is what is actually being tested).

Fixed by 05aa9df

PragTob · 2016-09-17T15:52:15Z

lib/benchee/formatters/unit.ex

+      |> Enum.map(&(scale_unit(&1, module)))
+      |> Enum.sort(&(sort_by_magnitude(&1, &2, module)))
+      |> hd
+    end


docs with a general short description would be nice (ok smallest and largest are easy, best is the one that occurs the most, right?)

Fixed by 05aa9df

PragTob · 2016-09-17T21:28:47Z

Thanks for all the fixups! Left a couple of smaller comments now, otherwise looks great.

I was thinking, I'd be fine with already merging this although it's not a complete user facing feature yet. All the additions don't interfere with the rest of the code base, this PR is already a bit longer and console integration could be a great next PR. What do you think?

wasnotrice · 2016-09-18T04:05:09Z

I agree, let's merge this in if you are satisfied with the most recent changes. I'll open issues for integrating with console formatter and for supporting short/long labels

PragTob · 2016-09-18T07:52:03Z

💚 🎉 🎉 💚

wasnotrice added 6 commits September 15, 2016 00:19

Add scale_count for one to 100M

cbb4157

Add scale_count for billion

1e23d65

Add test for not scaling down below ones

0c1a26d

Add format_count

1fb0e42

Move float_precision/1 to Benchee.Units

Add more basic formatting tests

7019c82

Add moduledoc

bd8849a

PragTob reviewed Sep 15, 2016

wasnotrice added 8 commits September 15, 2016 16:22

Add .format_duration

a01f06d

Add .best_for_counts

6dda0c9

Add .best_for_duration

9739b5d

Major refactor, put units into separate modules

f173f61

More refactoring to eliminate redundancy

96a185a

Refactor to a Unit behaviour

f890b67

Count and Duration implement the Unit behaviour

Add docs for scale

30ce1cb

Remove keyword typespec for 1.2 compatibility

a3623e2

Remove describe blocks for 1.2 compatibility

362d17b

PragTob requested changes Sep 17, 2016

Fix code review issues

05aa9df

- spacing - wrong function name in test descriptions - use Enum.min_by/2 and Enum.max_by/2 :) - document best unit functions

Refactor for better naming

e11d8ed

- rename "total" to "frequency" - use a `group_by`/`map` to replace an awkward custom `reduce` function - use `fn` notation instead of `&` shorthand to improve readability

wasnotrice mentioned this pull request Sep 18, 2016

Integrate auto-scaled units into console formatter #27

Closed

PragTob approved these changes Sep 18, 2016

PragTob merged commit ee44a64 into bencheeorg:master Sep 18, 2016

wasnotrice mentioned this pull request Sep 19, 2016

Auto Scale units #2

Closed

wasnotrice deleted the scale_units branch September 26, 2016 15:03
