Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale units #26

Merged
merged 17 commits into from
Sep 18, 2016
Merged

Scale units #26

merged 17 commits into from
Sep 18, 2016

Conversation

wasnotrice
Copy link
Collaborator

@wasnotrice wasnotrice commented Sep 15, 2016

Concept for addressing scaling units, see #2

@PragTob Not done obviously, but I thought I'd open things up for your initial feedback 😄

Copy link
Member

@PragTob PragTob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really like the changes, basically only one question about unit_label. Otherwise looks great, also like the name Units (first my brain read it as Util and I was gonna complain but read again :D) Thanks a lot 🎉 !

def scale_count(count) when count >= @one_billion, do: {count / @one_billion, :billion}
def scale_count(count) when count >= @one_million, do: {count / @one_million, :million}
def scale_count(count) when count >= @one_thousand, do: {count / @one_thousand, :thousand}
def scale_count(count), do: {count, :one}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

def scale_count(count) when count >= @one_thousand, do: {count / @one_thousand, :thousand}
def scale_count(count), do: {count, :one}

@spec format_count(number) :: String.t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, was thinking about adding type specs (actually there should be a ticket somewhere...). Wasn't too happy about my first experience with dialyxir and co. if you wanna do some work there, doing it in another PR sounds cool :)

def unit_label(:billion), do: "B"
def unit_label(:million), do: "M"
def unit_label(:thousand), do: "K"
def unit_label(_), do: ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, My thought would be that this sort of data would better be stored in a map and then maybe retrieved through a function (or online). What do you think? An upside I'm missing here? :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PragTob yeah that makes sense. This implementation is a TDD relic, where I haven't refactored yet :)


test ".format_count(1.234)" do
assert format_count(1.234) == "1.23"
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good tests!

@wasnotrice
Copy link
Collaborator Author

@PragTob glad you like the changes—I'll flesh it out a bit

@PragTob
Copy link
Member

PragTob commented Sep 15, 2016

Ah, btw. the function for the option between M/Million can be another ticket and could also wait until someone complains :D imo it might be best to integrate it into the console formatter now and scale every value to the same unit (best fit) - that'd be one time through the relevant stack and achieve a user observable feature. Then scaling time can also be done in a separate PR to keep this one as small as possible. What do you think?

@wasnotrice
Copy link
Collaborator Author

Sure, that makes sense. We can wait on the M/Million.

I was thinking about "best fit". I think there are probably 3 strategies for finding the best fit, and my guess is that the "right" one will depend on the data and user preference. Let's play with it when we get that far and see how it feels with the samples.

@wasnotrice
Copy link
Collaborator Author

This last batch of commits includes a big refactor to improve the public interface, and also to reduce internal duplication. Now Benchee.Unit defines a behaviour that is implemented in Benchee.Unit.Count and Benchee.Unit.Duration`.

This lets us have two modules with the same interface

# new
Benchee.Unit.Count.scale(1_000)
Benchee.Unit.Duration.scale(1_000)

instead of one module with two sets of related functions:

# old
Benchee.Unit.scale_count(1_000)
Benchee.Unit.scale_duration(1_000)

I like the way the behaviour turned out, but I'm not totally satisfied with the module naming. Maybe it would be better to just have Benchee.Count and Benchee.Duration.

Next step will be to integrate the units into the Console formatter 🎈

@wasnotrice
Copy link
Collaborator Author

wasnotrice commented Sep 16, 2016

I think this most recent failing check (commit 362d17b) is spurious (or at least unrelated to these code changes). Happened only on 1.2.6, in one of the console output regexes

@PragTob
Copy link
Member

PragTob commented Sep 17, 2016

@wasnotrice yeah that one... need to get rid of that one and substitute another example, retry or something. Integration testing a benchmarking library on varying environments is an.. interesting topic :D I'll get to a review a bit later. Thanks a lot!

Copy link
Member

@PragTob PragTob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff! Love the tests and usage of behaviour (I should really do that to formatters...) some questions and little formatting things that need some fixes or some explanation :)


# In 1.3, this could be declared as `keyword`, but use a custom type so it
# will also compile in 1.2
@type options ::[{atom, atom}]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks for the comment, not sure about the support/upgrade policy yet (I really wanna have describe :D) but I guess once there is 1.4 dropping 1.2 becomes an option.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it was kind of a bummer to realize that I had to undo the describes (I forgot those wouldn't work on 1.2), but there were only 3, and the tests aren't actually that much harder to read.

def float_precision(float) when float < 0.01, do: 5
def float_precision(float) when float < 0.1, do: 4
def float_precision(float) when float < 0.2, do: 3
def float_precision(_float), do: 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, any reason float_precision is on Unit and not in Common ? :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really...it's imported by the console formatter for now, until I actually use the units code in the formatter, at which point it won't be necessary. But yes, it should go to Common, good catch!

|> Enum.reduce(%{}, &totals_by_unit/2)
|> Enum.into([])
|> Enum.sort(&(sort_by_total_and_magnitude(&1, &2, module)))
|> hd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are sorting just to get the first element, couldn't we use Enum.min_by instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fix this one yet because the sort function with tiebreaker isn't trivial to express in terms of one element, like Enum.min_by wants. It might be worth doing though, as long as the resulting code isn't even more opaque than this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's too hard, we can always leave it for another day and another PR :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored this a little bit but still couldn't come up with anything more elegant than that final sort function. At least it's got a better name now 😄

case Keyword.get(opts, :strategy, :best) do
:best -> best_unit(list, module)
:largest -> largest_unit(list, module)
:smallest -> smallest_unit(list, module)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style wise I like to align the arrows (just like in hashes), not mandatory though :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 05aa9df

|> Enum.map(&(scale_unit(&1, module)))
|> Enum.sort(&(sort_by_magnitude(&1, &2, module)))
|> Enum.reverse
|> hd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_by?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by 05aa9df

million: %{ magnitude: @one_million, short: "M", long: "Million"},
thousand: %{ magnitude: @one_thousand, short: "K", long: "Thousand"},
one: %{ magnitude: 1, short: "", long: ""},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{} placement is a bit inconsistent here, the beginning has a space while the end doesn't. Either way is fine with me :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by 05aa9df

{4.32109, :thousand}

iex> Benchee.Unit.Count.scale(0.0045)
{0.0045, :one}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ doctests!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too! I was thinking it might make sense to move some other test cases to doctests at some point. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love doctests, however, that's why I also sometimes fear I might be overdoing them. I have entire modules that are just doctested, as I think that for users also the edge cases are interesting and more precise than text.

So, I'm in favor :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree! But maybe in another PR


test ".best when list is mostly thousands, strategy: :largest" do
assert best(@list_with_mostly_thousands, strategy: :largest) == :thousand
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great understandable tests 👍


test ".format 0.001234567 scales to :one" do
assert scale(0.001234567) == {0.001234567, :one}
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

descriptions from here and up are wrong (say .format but mean .scale is done)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by 05aa9df

|> Enum.map(&(scale_unit(&1, module)))
|> Enum.sort(&(sort_by_magnitude(&1, &2, module)))
|> hd
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs with a general short description would be nice (ok smallest and largest are easy, best is the one that occurs the most, right?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by 05aa9df

- spacing
- wrong function name in test descriptions
- use Enum.min_by/2 and Enum.max_by/2 :)
- document best unit functions
@PragTob
Copy link
Member

PragTob commented Sep 17, 2016

Thanks for all the fixups! Left a couple of smaller comments now, otherwise looks great.

I was thinking, I'd be fine with already merging this although it's not a complete user facing feature yet. All the additions don't interfere with the rest of the code base, this PR is already a bit longer and console integration could be a great next PR. What do you think?

- rename "total" to "frequency"
- use a `group_by`/`map` to replace an awkward custom `reduce` function
- use `fn` notation instead of `&` shorthand to improve readability
@wasnotrice
Copy link
Collaborator Author

I agree, let's merge this in if you are satisfied with the most recent changes. I'll open issues for integrating with console formatter and for supporting short/long labels

@PragTob PragTob merged commit ee44a64 into bencheeorg:master Sep 18, 2016
@PragTob
Copy link
Member

PragTob commented Sep 18, 2016

💚 🎉 :shipit: 🎉 :shipit: 💚

@wasnotrice wasnotrice mentioned this pull request Sep 19, 2016
@wasnotrice wasnotrice deleted the scale_units branch September 26, 2016 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants