-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of generatings IDs from tags in M3Coordinator #1000
Conversation
richardartoul
commented
Oct 1, 2018
•
edited
Loading
edited
src/query/models/tag.go
Outdated
// IDLen returns the length of the ID that would be | ||
// generated from the tags. | ||
func (t Tags) IDLen() int { | ||
idLen := 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might be a little better to have idLen := len(t) * 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think thats wrong though, because I want it to be the length of the generated ID, not twice the number of tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry I mean initialize it to be twice the length of t
to account for all of the eq
and sep
rather than adding 2 each time through the loop
src/query/models/tag.go
Outdated
|
||
// WriteBytesID writes out the ID representation | ||
// of the tags into the provided buffer. | ||
func (t Tags) WriteBytesID(b []byte) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why does this one need a buffer, rather than creating one like the stringID function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that the caller controls the buffer which enables us to add pooling later without making an API change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I like the caller providing a buffer
Codecov Report
@@ Coverage Diff @@
## master #1000 +/- ##
==========================================
+ Coverage 77.55% 77.84% +0.29%
==========================================
Files 412 411 -1
Lines 34653 34539 -114
==========================================
+ Hits 26875 26887 +12
+ Misses 5903 5784 -119
+ Partials 1875 1868 -7
Continue to review full report at Codecov.
|
src/query/models/tag.go
Outdated
func (t Tags) ID() string { | ||
b := make([]byte, 0, len(t)) | ||
// StringID returns a string representation of the tags | ||
func (t Tags) StringID() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is used at a few more places than just write so you would have to change all of them
func (t Tags) StringID() string { | ||
var ( | ||
idLen = t.IDLen() | ||
strBuilder = strings.Builder{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this 1.10+ only feature ? If yes, maybe we should figure out if we are still planning to support 1.9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.9 is over a year old at this point, 1.10 is over 6 months old. 1.11 just came out.
Any reason you want to support 1.9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably ok to drop support, since we're only supporting built binaries really. We would like most people to use all this software as docker images or as binaries built from the tagged releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, i agree that we should drop 1.9 support. Even prom only supports 1.10+ prometheus/prometheus#3856. We should call it out in our README though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yup, @richardartoul want to update the README.md (in a minimal way, we can update formatting/etc of it later) that we only support 1.10 at this time?
src/query/models/tag.go
Outdated
|
||
// WriteBytesID writes out the ID representation | ||
// of the tags into the provided buffer. | ||
func (t Tags) WriteBytesID(b []byte) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I like the caller providing a buffer
src/query/models/tag.go
Outdated
func (t Tags) ID() string { | ||
b := make([]byte, 0, len(t)) | ||
// StringID returns a string representation of the tags | ||
func (t Tags) StringID() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(unrelated to the pr)
Tags keys/values are allowed to have '=' and ','; this encoding isn't invertible - i.e. given a string representation of some tags, you can't guarantee that it comes from a single value.
How is this function used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This determines what the ID of the metric will be in M3DB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should talk about this offline.
src/query/models/tag.go
Outdated
func (t Tags) ID() string { | ||
b := make([]byte, 0, len(t)) | ||
// StringID returns a string representation of the tags | ||
func (t Tags) StringID() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to change the method name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I added a BytesID(), so seems good to distinguish the two methods so the caller is forced to think about what they are doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair. The goish (go-y?) way of doing this is to call WritesBytesID something like MarshalTo()/MarshalInto()
maybe call the two methods ID()
and IDMarshalTo()
?
src/query/storage/m3/storage.go
Outdated
identID := ident.StringID(id) | ||
// TODO: Consider caching id -> identID (requires setting NoFinalize()). | ||
var ( | ||
// TODO: Pool this once an ident pool is setup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd probably want to instead read this directly as bytes once we eventually have the .proto
changes for them to come directly as bytes from the wire.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would we even use t Tags
at that point? might as well implement the interface over the grpc generated type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm that would only be for tags though I guess, you still need a buffer to write out the ID concat'd together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that will work because prometheus will send over the tags as []byte but the ID is a concatenation of all the tags so you would still need to copy all the tags into a new buffer (which would be the ID)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update this comment to say we'd have to remove NoFinalize()
if we started pooling this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah although if we do that its gonna reintroduce the issue of copying which is annoying...Should we make the session API assume ownership of the ID? then it won't have to do a copy internally
@@ -287,7 +291,7 @@ func (s *m3storage) Write( | |||
datapoint := datapoint | |||
wg.Add(1) | |||
s.writeWorkerPool.Go(func() { | |||
if err := s.writeSingle(ctx, query, datapoint, identID, tagIter); err != nil { | |||
if err := s.writeSingle(ctx, query, datapoint, id, tagIter); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a special case here for when len(query.Datapoints) == 1
to just return s.writeSingle(...)
as discussed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done