-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a more performant Hyperloglog. #190
Conversation
Hey @martinpinto! We saw that new library as well and were looking forward to trying this out. You beat us to it! Due to to it being a holiday week we might be a bit slow to evaluate this, but I will try and do so soon as I'm trying to burn down our PR/issue list this week. :) Thanks! |
Gerald Rule: Copy Observability on Veneur and Unilog pull requestscc @stripe/observability |
sure :) no stress! happy to be able to contribute :) |
Great! I'm winding up to do some profiling on my end for this and #189. |
I did some benchmarking this morning for this and #189 since they are related. The micro benchmark made a single func BenchmarkSet(b *testing.B) {
s := NewSet("ab.b.c", nil)
for n := 0; n < b.N; n++ {
s.Sample(strconv.Itoa(n), 1.0)
}
_ = s.Flush()
} Here are the results. I ran each 3 times so I could average them. That's probably not necessary considering how go handles benchmarks, but whatevs: ClarkDuvall, 18 (current implementation, precision):
ClarkDuvall, 14 (current implementation, lower precision suggested by @seiflotfy):
Axiom, 18 (new implementation, same precision)
Axiom, 14 (new implementation, new precision)
Assuming that the results of each of these implementation is Good Enough™ it's pretty clear that the new implementation + new precision are a net win showing a 47% reduction in B/op and 57% reduction in ns/op. I will touch base with @aditya-stripe in the next few days and get his feels on this, but it seems like a win to me! |
It's also worth nothing that this PR removes the use of the |
02a8bce
to
3b3dac4
Compare
Any updates? |
Nope! Pretty sure we'll merge this soon but we're a bit busy this week with some travel and team stuff. Sorry for the delay! I don't think we'll need anything extra except maybe a rebase in the future? :) |
Yeah no problem! We can do that. :) |
I'm doing a bunch of merges this morning, so I'll ping this when I'm done. Also, you can give me permission to rebase this myself! |
4e3d7e5
to
ab58c51
Compare
Allow edits from maintainers is checked. Feel free :) |
Perfect, thanks! |
585fd77
to
267a653
Compare
Update! Last week we decided that this PR would be held out while we merged the release we were already working on. That release — something we'd been working on for internal use — consumed most of our time. This week part of the team is OoO so this is most likely something we'll target for next week. |
any update? |
Hey @seiflotfy - thanks for following up! We're back fully in the office this week, so just having a chance to take a look now. This looks good to us, but it'd be nice for us to have some solid benchmarks (before/after) based on the actual runtime performance. We tried adding that a few months ago, but it turned out that we couldn't do this in 1.8; 1.9 includes a patch that fixes the problem. So we have a PR that gives us the GC metrics, and it's just waiting on 1.9 to hit stable. We also have a number of other changes regarding memory usage that are also waiting on that. Since this isn't urgent, and 1.9 is only a week or two away, we'd prefer to hold off until after 1.9 is released and #144 is merged. The upside is that, by waiting a few weeks, we'll actually be able to do a proper blue/green test of this change and measure its effects on our production data, with a nice before-and-after graph to accompany it! :) |
Love it ... Anything I can help with?
…On Fri, Jul 28, 2017 at 6:27 PM Aditya Mukerjee ***@***.***> wrote:
Hey @seiflotfy <https://github.com/seiflotfy> - thanks for following up!
We're back fully in the office this week, so just having a chance to take a
look now.
This looks good to us, but it'd be nice for us to have some solid
benchmarks (before/after) based on the actual runtime performance. We tried
adding that a few months ago, but it turned out that we couldn't do this in
1.8; 1.9 includes a patch that fixes the problem. So we have a PR that
gives us the GC metrics
<#144 (comment)>, and
it's just waiting on 1.9 to hit stable. We also have a number of other
changes regarding memory usage that are also waiting on that.
Since this isn't urgent, and 1.9 is only a week or two away, we'd prefer
to hold off until after 1.9 is released and #144
<#144> is merged. The upside is
that, by waiting a few weeks, we'll actually be able to do a proper
blue/green test of this change and measure its effects on our production
data! :)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#190 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAHMTZbPRu111Ftafp9Qj0c2PpBjOkDRks5sSgwPgaJpZM4OMWoD>
.
|
Same here, would love to help out :) |
Short of anything that would get the 1.9 stable release out faster, not really! ;) We also have some August OoOs on the team, so it may not be immediate, but we've got this bookmarked and are excited to try it out. Stay tuned! |
Hey folks! Just a note that we're now rolling out 1.9 in our local tests, and will be able to start looking at the stats + this merge in the near term. Thanks for your patience! |
Note that our GC metrics have been merged, so we should be able to get to this soon. |
@asf-stripe is the release captain for v1.7: Can you make sure we include this? Keep in mind that we'll need to rebase, and since this is before we migrated to |
267a653
to
1c6e773
Compare
I just rebased using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
Thank you so much for your contribution, @martinpinto!! This will be in 1.7 - I'll copy you on a pull request to update the CHANGELOG (: |
Summary
Changed the current implementation of the Hyperloglog by Clark Duvall to the Axiom one.
Motivation
Based on the intent on https://news.ycombinator.com/item?id=14636699 I took a stab at using the axiomhq implementation to reduce resource usage.
Test plan
I adapted and used the existing samplers_test.go test case.
Rollout/monitoring/revert plan
A rollout should be fairly easy as changing the import and changing a few method calls.