Improve performance of generating IDs from tags in M3Coordinator #1000

richardartoul · 2018-10-01T17:42:07Z

arnikola · 2018-10-01T17:48:42Z

src/query/models/tag.go

+// IDLen returns the length of the ID that would be
+// generated from the tags.
+func (t Tags) IDLen() int {
+	idLen := 0


nit: might be a little better to have idLen := len(t) * 2

I think that's wrong though, because I want it to be the length of the generated ID, not twice the number of tags.

Yeah, sorry, I meant initialize it to twice the length of t to account for all of the eq and sep bytes, rather than adding 2 each time through the loop.
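
To make the suggestion concrete, here is a minimal sketch of the length calculation with the separator bytes counted up front. The `Tag`/`Tags` shapes and the one-byte `=` and `,` delimiters are assumptions for illustration, not the actual definitions in src/query/models.

```go
package main

import "fmt"

// Tag and Tags are simplified stand-ins (assumed shape) for the query models.
type Tag struct {
	Name  string
	Value string
}

type Tags []Tag

// IDLen returns the length of the ID that would be generated from the tags,
// assuming each tag is written as name, '=', value, ','.
func (t Tags) IDLen() int {
	// One eq byte and one sep byte per tag, counted once up front
	// instead of adding 2 on every loop iteration.
	idLen := len(t) * 2
	for _, tag := range t {
		idLen += len(tag.Name) + len(tag.Value)
	}
	return idLen
}

func main() {
	tags := Tags{{Name: "host", Value: "a"}, {Name: "dc", Value: "sjc"}}
	fmt.Println(tags.IDLen()) // len("host=a,dc=sjc,") == 14
}
```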

arnikola · 2018-10-01T17:51:21Z

src/query/models/tag.go

+
+// WriteBytesID writes out the ID representation
+// of the tags into the provided buffer.
+func (t Tags) WriteBytesID(b []byte) []byte {


nit: why does this one need a buffer, rather than creating one like the StringID function?

The idea is that the caller controls the buffer, which enables us to add pooling later without making an API change.

yeah I like the caller providing a buffer
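
A rough sketch of the caller-provided buffer pattern discussed here, assuming the method appends to the slice it is given and returns the result. The `Tag`/`Tags` types and the `eq`/`sep` constants are illustrative stand-ins, and whether the real method appends or overwrites is an assumption.

```go
package main

import "fmt"

// Tag and Tags are simplified stand-ins (assumed shape) for the query models.
type Tag struct {
	Name  string
	Value string
}

type Tags []Tag

// Assumed delimiters between a name and its value, and between tags.
const (
	eq  = '='
	sep = ','
)

// WriteBytesID appends the ID representation of the tags to b and returns
// the extended slice. The caller owns b, so it can be reused or pooled.
func (t Tags) WriteBytesID(b []byte) []byte {
	for _, tag := range t {
		b = append(b, tag.Name...)
		b = append(b, eq)
		b = append(b, tag.Value...)
		b = append(b, sep)
	}
	return b
}

func main() {
	tags := Tags{{Name: "host", Value: "a"}, {Name: "dc", Value: "sjc"}}

	// The buffer could later come from a pool without changing this API.
	buf := make([]byte, 0, 64)
	buf = tags.WriteBytesID(buf[:0])
	fmt.Println(string(buf)) // host=a,dc=sjc,
}
```

Because the buffer has enough capacity, a second call with `buf[:0]` reuses the same backing array rather than allocating.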

codecov · 2018-10-01T17:58:54Z

Codecov Report

Merging #1000 into master will increase coverage by 0.29%.
The diff coverage is 89.65%.

@@            Coverage Diff             @@
##           master    #1000      +/-   ##
==========================================
+ Coverage   77.55%   77.84%   +0.29%     
==========================================
  Files         412      411       -1     
  Lines       34653    34539     -114     
==========================================
+ Hits        26875    26887      +12     
+ Misses       5903     5784     -119     
+ Partials     1875     1868       -7

Flag	Coverage Δ
#dbnode	`81.39% <ø> (-0.01%)`	⬇️
#m3ninx	`75.33% <ø> (+0.07%)`	⬆️
#query	`64.45% <89.65%> (+1.31%)`	⬆️
#x	`84.72% <ø> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5b31831...3851b65.

nikunjgit · 2018-10-01T17:56:49Z

src/query/models/tag.go

-func (t Tags) ID() string {
-	b := make([]byte, 0, len(t))
+// StringID returns a string representation of the tags
+func (t Tags) StringID() string {


I think this is used in a few more places than just write, so you would have to change all of them.

nikunjgit · 2018-10-01T17:59:53Z

src/query/models/tag.go

+func (t Tags) StringID() string {
+	var (
+		idLen      = t.IDLen()
+		strBuilder = strings.Builder{}


Is this a Go 1.10+ only feature? If yes, maybe we should figure out if we are still planning to support 1.9.

1.9 is over a year old at this point, 1.10 is over 6 months old. 1.11 just came out.

Any reason you want to support 1.9?

Probably ok to drop support, since we're only supporting built binaries really. We would like most people to use all this software as docker images or as binaries built from the tagged releases.

Nah, I agree that we should drop 1.9 support. Even prom only supports 1.10+ (prometheus/prometheus#3856). We should call it out in our README though.

Ah yup, @richardartoul want to update the README.md (in a minimal way, we can update formatting/etc of it later) that we only support 1.10 at this time?
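
For context on the version question: `strings.Builder` was added in Go 1.10. Below is a minimal sketch of building the string ID with a pre-sized builder. The `Tag`/`Tags` types, the `idLen` helper, and the `=`/`,` delimiters are assumptions for illustration rather than the code in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag and Tags are simplified stand-ins (assumed shape) for the query models.
type Tag struct {
	Name  string
	Value string
}

type Tags []Tag

// idLen is a hypothetical helper returning the exact length of the ID.
func (t Tags) idLen() int {
	n := len(t) * 2 // one '=' and one ',' per tag
	for _, tag := range t {
		n += len(tag.Name) + len(tag.Value)
	}
	return n
}

// StringID builds the ID with strings.Builder (requires Go 1.10+).
func (t Tags) StringID() string {
	var sb strings.Builder
	// Grow once so the writes below do not need to reallocate.
	sb.Grow(t.idLen())
	for _, tag := range t {
		sb.WriteString(tag.Name)
		sb.WriteByte('=')
		sb.WriteString(tag.Value)
		sb.WriteByte(',')
	}
	return sb.String()
}

func main() {
	tags := Tags{{Name: "host", Value: "a"}, {Name: "dc", Value: "sjc"}}
	fmt.Println(tags.StringID()) // host=a,dc=sjc,
}
```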

nikunjgit · 2018-10-01T18:00:09Z

src/query/models/tag.go

+
+// WriteBytesID writes out the ID representation
+// of the tags into the provided buffer.
+func (t Tags) WriteBytesID(b []byte) []byte {


yeah I like the caller providing a buffer

prateek · 2018-10-01T18:11:27Z

src/query/models/tag.go

-func (t Tags) ID() string {
-	b := make([]byte, 0, len(t))
+// StringID returns a string representation of the tags
+func (t Tags) StringID() string {


(unrelated to the pr)

Tag keys/values are allowed to contain '=' and ','; this encoding isn't invertible, i.e. given a string representation of some tags, you can't guarantee that it came from a single set of tags.

How is this function used?

This determines what the ID of the metric will be in M3DB

we should talk about this offline.
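
To illustrate the concern about invertibility, here is a small sketch showing two different tag sets that encode to the same ID. The `Tag` type and the `encode` function are hypothetical and assume the name, `=`, value, `,` layout described in this review.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag is a simplified stand-in (assumed shape) for a query model tag.
type Tag struct {
	Name  string
	Value string
}

// encode is a hypothetical encoder that joins tags as name=value, pairs
// without escaping the delimiters.
func encode(tags []Tag) string {
	var sb strings.Builder
	for _, tag := range tags {
		sb.WriteString(tag.Name)
		sb.WriteByte('=')
		sb.WriteString(tag.Value)
		sb.WriteByte(',')
	}
	return sb.String()
}

func main() {
	// One tag whose value happens to contain ',' and '='.
	one := []Tag{{Name: "a", Value: "b,c=d"}}
	// Two ordinary tags.
	two := []Tag{{Name: "a", Value: "b"}, {Name: "c", Value: "d"}}

	fmt.Println(encode(one))                // a=b,c=d,
	fmt.Println(encode(two))                // a=b,c=d,
	fmt.Println(encode(one) == encode(two)) // true
}
```

Escaping the delimiters or length-prefixing each field would avoid the collision, though neither is proposed in this thread.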

prateek · 2018-10-01T18:11:36Z

src/query/models/tag.go

-func (t Tags) ID() string {
-	b := make([]byte, 0, len(t))
+// StringID returns a string representation of the tags
+func (t Tags) StringID() string {


why do you need to change the method name?

because I added a BytesID(), so seems good to distinguish the two methods so the caller is forced to think about what they are doing

Fair. The goish (go-y?) way of doing this is to call WriteBytesID something like MarshalTo()/MarshalInto().

maybe call the two methods ID() and IDMarshalTo()?

robskillington · 2018-10-01T18:23:34Z

src/query/storage/m3/storage.go

-	identID := ident.StringID(id)
+	// TODO: Consider caching id -> identID (requires setting NoFinalize()).
+	var (
+		// TODO: Pool this once an ident pool is setup.


We'd probably want to instead read this directly as bytes once we eventually have the .proto changes for them to come directly as bytes from the wire.

would we even use t Tags at that point? might as well implement the interface over the grpc generated type

Hm that would only be for tags though I guess, you still need a buffer to write out the ID concat'd together.

I don't think that will work, because Prometheus will send over the tags as []byte, but the ID is a concatenation of all the tags, so you would still need to copy all the tags into a new buffer (which would be the ID).

Can you update this comment to say we'd have to remove NoFinalize() if we started pooling this?

Yeah, although if we do that it's going to reintroduce the issue of copying, which is annoying... Should we make the session API assume ownership of the ID? Then it won't have to do a copy internally.

robskillington · 2018-10-01T19:48:21Z

src/query/storage/m3/storage.go

@@ -287,7 +291,7 @@ func (s *m3storage) Write(
 		datapoint := datapoint
 		wg.Add(1)
 		s.writeWorkerPool.Go(func() {
-			if err := s.writeSingle(ctx, query, datapoint, identID, tagIter); err != nil {
+			if err := s.writeSingle(ctx, query, datapoint, id, tagIter); err != nil {


Could we add a special case here for when len(query.Datapoints) == 1 to just return s.writeSingle(...) as discussed?
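
A minimal sketch of the special case being requested, under simplifying assumptions: `writeAll` and its `writeSingle` callback are hypothetical, datapoints are plain `float64` values, and plain goroutines stand in for the write worker pool used in the real code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errWrite is a hypothetical error used by the example writer.
var errWrite = errors.New("write failed")

// writeAll writes every datapoint using writeSingle. With exactly one
// datapoint it calls writeSingle directly and skips the fan-out machinery.
func writeAll(datapoints []float64, writeSingle func(float64) error) error {
	if len(datapoints) == 1 {
		return writeSingle(datapoints[0])
	}

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, dp := range datapoints {
		dp := dp // capture the loop variable for the goroutine
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := writeSingle(dp); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return firstErr
}

func main() {
	failOnTwo := func(v float64) error {
		if v == 2 {
			return errWrite
		}
		return nil
	}
	fmt.Println(writeAll([]float64{1}, failOnTwo))       // <nil>
	fmt.Println(writeAll([]float64{1, 2, 3}, failOnTwo)) // write failed
}
```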

Richard Artoul added 3 commits October 1, 2018 13:39

Improve performance of generating IDs from tags

e00630c

Fix

01ef7ae

update comments

b77243c

richardartoul requested review from nikunjgit, benraskin92, arnikola, prateek and robskillington and removed request for nikunjgit and benraskin92 October 1, 2018 17:43

arnikola reviewed Oct 1, 2018

arnikola approved these changes Oct 1, 2018

Fix error

2fde678

nikunjgit reviewed Oct 1, 2018

Richard Artoul added 2 commits October 1, 2018 14:02

Refactor IDLen function

a24b31e

Fix stuff

fd32544

prateek reviewed Oct 1, 2018

fix compulation isuse

8a85448

robskillington reviewed Oct 1, 2018

Richard Artoul added 5 commits October 1, 2018 14:39

Fix compilation issue

f823c5d

Add minimum go version

ff13c86

Rename

d528c01

Rename value in test

317b87c

Improve test

753c68f

robskillington reviewed Oct 1, 2018

Richard Artoul added 2 commits October 1, 2018 15:54

Special case single datapoint write

a9919a6

fix compilation issue;

3851b65

nikunjgit approved these changes Oct 1, 2018

richardartoul merged commit d36bf78 into master Oct 2, 2018

prateek deleted the ra/m3-coord-perf-id branch October 13, 2018 06:27
