-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of generatings IDs from tags in M3Coordinator #1000
Changes from 7 commits
e00630c
01ef7ae
b77243c
2fde678
a24b31e
fd32544
8a85448
f823c5d
ff13c86
d528c01
317b87c
753c68f
a9919a6
3851b65
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,6 +25,7 @@ import ( | |
"hash/fnv" | ||
"regexp" | ||
"sort" | ||
"strings" | ||
) | ||
|
||
const ( | ||
|
@@ -145,17 +146,46 @@ func (m Matchers) ToTags() (Tags, error) { | |
return Normalize(tags), nil | ||
} | ||
|
||
// ID returns a string representation of the tags | ||
func (t Tags) ID() string { | ||
b := make([]byte, 0, len(t)) | ||
// StringID returns a string representation of the tags | ||
func (t Tags) StringID() string { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (unrelated to the pr) Tags keys/values are allowed to have '=' and ','; this encoding isn't invertible - i.e. given a string representation of some tags, you can't guarantee that it comes from a single value. How is this function used? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This determines what the ID of the metric will be in M3DB There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we should talk about this offline. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why do you need to change the method name? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. because I added a BytesID(), so seems good to distinguish the two methods so the caller is forced to think about what they are doing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fair. The goish (go-y?) way of doing this is to call WritesBytesID something like maybe call the two methods |
||
var ( | ||
idLen = t.IDLen() | ||
strBuilder = strings.Builder{} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this 1.10+ only feature ? If yes, maybe we should figure out if we are still planning to support 1.9 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 1.9 is over a year old at this point, 1.10 is over 6 months old. 1.11 just came out. Any reason you want to support 1.9? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably ok to drop support, since we're only supporting built binaries really. We would like most people to use all this software as docker images or as binaries built from the tagged releases. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nah, i agree that we should drop 1.9 support. Even prom only supports 1.10+ prometheus/prometheus#3856. We should call it out in our README though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah yup, @richardartoul want to update the README.md (in a minimal way, we can update formatting/etc of it later) that we only support 1.10 at this time? |
||
) | ||
|
||
strBuilder.Grow(idLen) | ||
for _, tag := range t { | ||
strBuilder.WriteString(tag.Name) | ||
strBuilder.WriteByte(eq) | ||
strBuilder.WriteString(tag.Value) | ||
strBuilder.WriteByte(sep) | ||
} | ||
|
||
return strBuilder.String() | ||
} | ||
|
||
// WriteBytesID writes out the ID representation | ||
// of the tags into the provided buffer. | ||
func (t Tags) WriteBytesID(b []byte) []byte { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: why does this one need a buffer, rather than creating one like the stringID function? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea is that the caller controls the buffer which enables us to add pooling later without making an API change There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah I like the caller providing a buffer |
||
for _, tag := range t { | ||
b = append(b, tag.Name...) | ||
b = append(b, eq) | ||
b = append(b, tag.Value...) | ||
b = append(b, sep) | ||
} | ||
|
||
return string(b) | ||
return b | ||
} | ||
|
||
// IDLen returns the length of the ID that would be | ||
// generated from the tags. | ||
func (t Tags) IDLen() int { | ||
idLen := 2 * len(t) // account for eq and sep | ||
for _, tag := range t { | ||
idLen += len(tag.Name) | ||
idLen += len(tag.Value) | ||
} | ||
return idLen | ||
} | ||
|
||
// IDWithExcludes returns a string representation of the tags excluding some tag keys | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -269,11 +269,15 @@ func (s *m3storage) Write( | |
return errors.ErrNilWriteQuery | ||
} | ||
|
||
id := query.Tags.ID() | ||
// TODO: Consider caching id -> identID | ||
identID := ident.StringID(id) | ||
// TODO: Consider caching id -> identID (requires setting NoFinalize()). | ||
var ( | ||
// TODO: Pool this once an ident pool is setup. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We'd probably want to instead read this directly as bytes once we eventually have the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would we even use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm that would only be for tags though I guess, you still need a buffer to write out the ID concat'd together. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think that will work because prometheus will send over the tags as []byte but the ID is a concatenation of all the tags so you would still need to copy all the tags into a new buffer (which would be the ID) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you update this comment to say we'd have to remove There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah although if we do that its gonna reintroduce the issue of copying which is annoying...Should we make the session API assume ownership of the ID? then it won't have to do a copy internally |
||
buf = make([]byte, 0, query.Tags.IDLen()) | ||
idBuf = query.Tags.WriteBytesID(buf) | ||
id = ident.BytesID(idBuf) | ||
) | ||
// Set id to NoFinalize to avoid cloning it in write operations | ||
identID.NoFinalize() | ||
id.NoFinalize() | ||
tagIterator := storage.TagsToIdentTagIterator(query.Tags) | ||
|
||
var ( | ||
|
@@ -287,7 +291,7 @@ func (s *m3storage) Write( | |
datapoint := datapoint | ||
wg.Add(1) | ||
s.writeWorkerPool.Go(func() { | ||
if err := s.writeSingle(ctx, query, datapoint, identID, tagIter); err != nil { | ||
if err := s.writeSingle(ctx, query, datapoint, id, tagIter); err != nil { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we add a special case here for when There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
multiErr.add(err) | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is used at a few more places than just write so you would have to change all of them