Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of generatings IDs from tags in M3Coordinator #1000

Merged
merged 14 commits into from
Oct 2, 2018

Conversation

richardartoul
Copy link
Contributor

@richardartoul richardartoul commented Oct 1, 2018

screen shot 2018-10-01 at 1 44 03 pm

screen shot 2018-10-01 at 1 44 09 pm

// IDLen returns the length of the ID that would be
// generated from the tags.
func (t Tags) IDLen() int {
idLen := 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be a little better to have idLen := len(t) * 2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think thats wrong though, because I want it to be the length of the generated ID, not twice the number of tags

Copy link
Collaborator

@arnikola arnikola Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry I mean initialize it to be twice the length of t to account for all of the eq and sep rather than adding 2 each time through the loop


// WriteBytesID writes out the ID representation
// of the tags into the provided buffer.
func (t Tags) WriteBytesID(b []byte) []byte {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why does this one need a buffer, rather than creating one like the stringID function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that the caller controls the buffer which enables us to add pooling later without making an API change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I like the caller providing a buffer

@codecov
Copy link

codecov bot commented Oct 1, 2018

Codecov Report

Merging #1000 into master will increase coverage by 0.29%.
The diff coverage is 89.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1000      +/-   ##
==========================================
+ Coverage   77.55%   77.84%   +0.29%     
==========================================
  Files         412      411       -1     
  Lines       34653    34539     -114     
==========================================
+ Hits        26875    26887      +12     
+ Misses       5903     5784     -119     
+ Partials     1875     1868       -7
Flag Coverage Δ
#dbnode 81.39% <ø> (-0.01%) ⬇️
#m3ninx 75.33% <ø> (+0.07%) ⬆️
#query 64.45% <89.65%> (+1.31%) ⬆️
#x 84.72% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b31831...3851b65. Read the comment docs.

func (t Tags) ID() string {
b := make([]byte, 0, len(t))
// StringID returns a string representation of the tags
func (t Tags) StringID() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is used at a few more places than just write so you would have to change all of them

func (t Tags) StringID() string {
var (
idLen = t.IDLen()
strBuilder = strings.Builder{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this 1.10+ only feature ? If yes, maybe we should figure out if we are still planning to support 1.9

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.9 is over a year old at this point, 1.10 is over 6 months old. 1.11 just came out.

Any reason you want to support 1.9?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably ok to drop support, since we're only supporting built binaries really. We would like most people to use all this software as docker images or as binaries built from the tagged releases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, i agree that we should drop 1.9 support. Even prom only supports 1.10+ prometheus/prometheus#3856. We should call it out in our README though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yup, @richardartoul want to update the README.md (in a minimal way, we can update formatting/etc of it later) that we only support 1.10 at this time?


// WriteBytesID writes out the ID representation
// of the tags into the provided buffer.
func (t Tags) WriteBytesID(b []byte) []byte {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I like the caller providing a buffer

func (t Tags) ID() string {
b := make([]byte, 0, len(t))
// StringID returns a string representation of the tags
func (t Tags) StringID() string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unrelated to the pr)

Tags keys/values are allowed to have '=' and ','; this encoding isn't invertible - i.e. given a string representation of some tags, you can't guarantee that it comes from a single value.

How is this function used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This determines what the ID of the metric will be in M3DB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should talk about this offline.

func (t Tags) ID() string {
b := make([]byte, 0, len(t))
// StringID returns a string representation of the tags
func (t Tags) StringID() string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to change the method name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because I added a BytesID(), so seems good to distinguish the two methods so the caller is forced to think about what they are doing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair. The goish (go-y?) way of doing this is to call WritesBytesID something like MarshalTo()/MarshalInto()

maybe call the two methods ID() and IDMarshalTo()?

identID := ident.StringID(id)
// TODO: Consider caching id -> identID (requires setting NoFinalize()).
var (
// TODO: Pool this once an ident pool is setup.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd probably want to instead read this directly as bytes once we eventually have the .proto changes for them to come directly as bytes from the wire.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would we even use t Tags at that point? might as well implement the interface over the grpc generated type

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm that would only be for tags though I guess, you still need a buffer to write out the ID concat'd together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that will work because prometheus will send over the tags as []byte but the ID is a concatenation of all the tags so you would still need to copy all the tags into a new buffer (which would be the ID)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update this comment to say we'd have to remove NoFinalize() if we started pooling this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah although if we do that its gonna reintroduce the issue of copying which is annoying...Should we make the session API assume ownership of the ID? then it won't have to do a copy internally

@@ -287,7 +291,7 @@ func (s *m3storage) Write(
datapoint := datapoint
wg.Add(1)
s.writeWorkerPool.Go(func() {
if err := s.writeSingle(ctx, query, datapoint, identID, tagIter); err != nil {
if err := s.writeSingle(ctx, query, datapoint, id, tagIter); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a special case here for when len(query.Datapoints) == 1 to just return s.writeSingle(...) as discussed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@richardartoul richardartoul merged commit d36bf78 into master Oct 2, 2018
@prateek prateek deleted the ra/m3-coord-perf-id branch October 13, 2018 06:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants