Add jaeger/opentracing tracing to m3query #1321

andrewmains12 · 2019-01-28T17:58:54Z

This diff adds opentracing support to m3query. Spans are started for every HTTP request (courtesy middleware from github.com/opentracing-contrib/go-stdlib). I then have child spans started for compilation + execution, with drilldowns into each node of the query. Here's a sample trace:

Query is sum(increase(coordinator_engine_datapoints{type="fetched"}[10s]))

One implementation note: node instrumentation is a bit less clean than I would have liked; since all nodes call their children before exiting, i.e.:

nextBlock := doStuff()
n.controller.Process(nextBlock)

I instrumented them like:

startSpan()
nextBlock := doStuff()
stopSpan()

n.controller.Process(nextBlock)

The alternative would be to do:

startSpan()
nextBlock := doStuff()

n.controller.Process(nextBlock)
stopSpan()

but that gives unnecessary nesting imo:

codecov · 2019-01-28T21:32:40Z

Codecov Report

❗ No coverage uploaded for pull request base (andrewmains12/cost_accounting@35d47f1). Click here to learn what that means.
The diff coverage is 47.8%.

@@                       Coverage Diff                       @@
##             andrewmains12/cost_accounting   #1321   +/-   ##
===============================================================
  Coverage                                 ?   64.3%           
===============================================================
  Files                                    ?     771           
  Lines                                    ?   64218           
  Branches                                 ?       0           
===============================================================
  Hits                                     ?   41333           
  Misses                                   ?   20009           
  Partials                                 ?    2876

Flag	Coverage Δ
#aggregator	`73.6% <ø> (?)`
#cluster	`74.9% <ø> (?)`
#collector	`78.4% <ø> (?)`
#dbnode	`75.5% <ø> (?)`
#m3em	`68.3% <ø> (?)`
#m3ninx	`73.5% <ø> (?)`
#m3nsch	`28.4% <ø> (?)`
#metrics	`17.8% <ø> (?)`
#msg	`74.9% <ø> (?)`
#query	`47.5% <43.9%> (?)`
#x	`75.3% <77%> (?)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35d47f1...207d695. Read the comment docs.

benraskin92 · 2019-01-29T15:08:01Z

src/query/DATAMODEL.md

@@ -0,0 +1,5 @@
+


Whoops this was pretty much just my own notes--removing.

src/query/api/v1/handler/prometheus/native/read_common.go

src/query/config/m3query-dev-etcd.yml

benraskin92 · 2019-01-29T15:59:25Z

src/query/executor/request.go

-func (r *Request) execute(ctx context.Context, pp plan.PhysicalPlan) (*ExecutionState, error) {
-	sp := startSpan(r.engine.metrics.executingHist, r.engine.metrics.executing)
+func (r *Request) generateExecutionState(ctx context.Context, pp plan.PhysicalPlan) (*ExecutionState, error) {
+	sp, ctx := opentracing.StartSpanFromContext(ctx, "generateExecutionState")


Not sure about the naming semantics for opentracing, but do they recommend camel case or using _ or something else?

Yuri says _--I'll change it. Good catch!

src/query/executor/transform/controller.go

benraskin92 · 2019-01-29T16:21:39Z

src/query/executor/transform/exec.go

+	ProcessBlock(queryCtx *models.QueryContext, ID parser.NodeID, b block.Block) (block.Block, error)
+}
+
+// ProcessSimpleBlock is a utility for OpNode instances which simply propagate their data after doing their own


Nice comment!

src/query/executor/request.go

benraskin92 · 2019-01-29T16:25:41Z

src/query/executor/transform/exec_test.go

+
+func TestProcessSimpleBlock(t *testing.T) {
+	t.Run("closes next block", func(t *testing.T) {
+


remove extra line?

Sorry I should have filled these tests in before pushing. Doing that now.

src/query/executor/transform/exec_test.go

benraskin92 · 2019-01-29T16:28:49Z

src/query/functions/fetch.go

+// Execute runs the fetch node operation
+func (n *FetchNode) Execute(queryCtx *models.QueryContext) error {
+	ctx := queryCtx.Ctx
+	blockResult, err := n.fetch(queryCtx.Ctx, queryCtx)


use ctx instead of queryCtx.Ctx?

src/query/functions/utils/params.go

docs/operational_guide/monitoring.md

src/query/util/opentracing/context_test.go

arnikola · 2019-02-12T20:06:53Z

src/query/api/v1/handler/prometheus/native/read.go

+				Error: respErr.Err.Error(),
+			},
+			RqID: logging.ReadContextID(ctx),
+		}, logging.WithContext(ctx))


Do we need to log id in the error returned to the client? The failing request ID is still logged since the logger has the context id on it

Good question. So the idea here is to help aid in debugging; by returning the request id to the client, we make it easy to tie an error to logs + traces. Basically, we can just ask the user for the RqID from grafana's query inspector, and go directly from that to traces + logs.

Does that seem reasonable?

Sure, doesn't seem like it would cause issues. We've only got tracing on read right? Might want to add the same thing to read_instantaneous, too... Would it make sense to take this block and put it into xhttp as a method like WriteJSONResponseWithRequestID?

Yeah good call, I shouldn't have been lazy about it here :). I'll factor it out.

arnikola · 2019-02-12T20:41:36Z

src/query/functions/aggregation/count_values.go

@@ -88,6 +88,10 @@ type countValuesNode struct {
 	controller *transform.Controller
 }

+func (n *countValuesNode) Params() parser.Params {


Shouldn't this function (and other snowflakes existing outside of various base.go files) be refactored to the ProcessBlock(queryCtx *models.QueryContext, ID parser.NodeID, b block.Block) model?

Ah yeah good call, missed these. This is one reason it's too bad we can't put tracing etc into Controller (at least, not without having weird traces); it makes the fanout of this change (and potentially other similar ones) much wider. That said I don't have a good alternative model immediately handy so I'll go through and do the switch.

Yeah it's a bit annoying unfortunately; I guess it's a downside that we have to deal with in return for the nice properties of functions calling their own downstream processing

Yeah for a brief moment I was like "we should just explicitly walk the graph instead" and then I realized that would be a large rewrite 😄

Yeah, was looking at making a few nicety fixes to DAG generation/execution/mapping, and it's a pretty hefty beast unfortunately to make it more usable for things like this (although it would definitely not go astray!)

arnikola · 2019-02-12T20:46:53Z

src/query/executor/transform/exec.go

+	ProcessBlock(queryCtx *models.QueryContext, ID parser.NodeID, b block.Block) (block.Block, error)
+}
+
+// ProcessSimpleBlock is a utility for OpNode instances which simply propagate their data after doing their own


Doesn't this describe every block type?

Haha yeah true. The "exception" is nodes which need to wait for multiple blocks before actually executing, e.g. binary nodes. I'll update the comment to clarify that.

arnikola · 2019-02-12T20:48:24Z

src/query/executor/transform/exec.go

+	"github.com/m3db/m3/src/query/util/opentracing"
+)
+
+type simpleOpNode interface {


It's a bit weird having an interface that a bunch of nodes satisfy be private

So this is definitely a golang quirk, but I'd argue that having it be private makes reasonable sense. I don't necessarily want other packages using the interface directly; its only reason for existence is ProcessSimpleBlock. I would even make it anonymous if that were a thing go supported.

We can also always make it public later :)

arnikola · 2019-02-12T20:49:35Z

src/query/executor/transform/exec_test.go

+		require.NoError(t, doCall(tctx))
+	})
+
+	t.Run("errors on process error", func(t *testing.T) {


nit: same comment about the t.Run, seems to swallow up useful information for failing tests

Now that I think about it, mostly had problems with it in tabled tests, when iterating over them... when one of twenty fails it's sometimes a pain to find exactly which test failed exactly where.

Ah yes, totally agreed on that! I always try to use searchable names in those cases, which helps, but it can still be annoying.

src/query/functions/temporal/base.go

docs/operational_guide/monitoring.md

arnikola · 2019-02-15T19:14:42Z

src/query/executor/transform/lazy.go

@@ -32,6 +33,12 @@ type sinkNode struct {
 	block block.Block
 }

+// Params for sinkNode returns an params object for lazy sinks. It doesn't appear in queries (since sink nodes are


Think this one may have fallen through the cracks (thanks github for collapsing comments for some reason 🙄 )

Ah yeah good catch--lint was complaining at me earlier about not having a comment on this; turns out that's because I don't actually need it to implement the interface. Removed.

arnikola · 2019-02-15T19:18:17Z

src/query/functions/aggregation/take.go

+	return transform.ProcessSimpleBlock(n, n.controller, queryCtx, ID, b)
+}
+
+func (n *takeNode) ProcessBlock(queryCtx *models.QueryContext, ID parser.NodeID, b block.Block) (block.Block, error) {


arnikola · 2019-02-15T19:20:33Z

src/query/functions/binary/base.go

@@ -142,6 +147,14 @@ func (n *baseNode) Process(queryCtx *models.QueryContext, ID parser.NodeID, b bl
 	return n.controller.Process(queryCtx, nextBlock)
 }

+func (n *baseNode) processWithTracing(queryCtx *models.QueryContext, lhs block.Block, rhs block.Block) (block.Block, error) {


Correct me if I'm wrong, but this will show up on the span as two function calls, one of which will be pretty much instantaneous, right? Should be fine, and probably not worth overengineering here to avoid that case (like starting the span only when both LHS and RHS satisfied), but just trying to get an idea of how it will look :)

That being said, it might be necessary to add a parent span to NewScalarOp, since there is a potential (highly unlikely, but still) scenario where you have a binary function between a series and a scalar, e.g. up + 1, and somehow the up block resolves first, which may cut off your trace early if I'm understanding the flow correctly?

Correct me if I'm wrong, but this will show up on the span as two function calls, one of which will be pretty much instantaneous, right?

Ah yeah good question. It actually shows up as a single trace, since I am in fact starting the span only when both LHS and RHS are satisfied.

Here's a sample:

arnikola · 2019-02-15T19:30:05Z

src/query/functions/temporal/base.go

+	blocks, err := c.processCompletedBlocks(queryCtx, processRequests, maxBlocks)
+	defer closeBlocks(blocks)
+
+	if err != nil {


So probably(definitely!) out of scope for this PR, but would be good to revisit temporal functions at some point and add in some logic that if any incoming block errored, it should drop any incoming blocks with the same error instead of trying to add them and starting processing before hitting it... May add an issue regarding this?

Sure! What do you mean by "any incoming block errored"? As in, there was an error processing it or an error upstream? I'm definitely not opposed to exiting earlier when possible though.

The general function of temporal functions is that blocks come in, and are added to a cache. When enough blocks come in to fully calculate a downstream block, it is processed (which can error). At the moment, if the processing fails, that particular request will fail but there will still be in-flight blocks coming into the temporal function, which do not necessarily need to be computed since the entire query will error anyway

Ah ok cool, thanks for the clarification. That seems like a kind of generic problem--other parts of the query should be cancelled once one part fails. Do we have any general cancellation mechanism already? I suppose we could check the context periodically during each node's Process; that would probably be the idiomatic way to do it. No idea how often "periodically" should be though haha.

src/query/functions/temporal/base.go

Adds a `queryContext` argument to `OpNode.Process` to hold any per query state. I use this in both my [cost accounting](#1207) and [tracing](#1321) PR's. At the time, I based my tracing branch off of the cost accounting branch. Tracing is closer to landing though, so I've now factored out the common changes, and rebased them both against this branch.

arnikola · 2019-02-21T20:04:59Z

src/query/api/v1/handler/prometheus/native/read.go

@@ -167,6 +170,9 @@ func (h *PromReadHandler) ServeHTTPWithEngine(

 	result, err := read(ctx, engine, h.tagOpts, w, params)
 	if err != nil {
+		sp := opentracingutil.SpanFromContextOrRoot(ctx)
+		sp.LogFields(opentracinglog.Error(err))
+		opentracingext.Error.Set(sp, true)


Should you have sp.Finish() here?

In this case, no; that function is intended to extract the span from context if it exists, and return a noop/dummy if not. In the first case, the calling function is already in charge of calling Finish(); in the second, it shouldn't be needed. Does that make sense?

I updated the function to be opentracingutil.SpanFromContextOrNoop, making it return a noop span, which should ensure that it doesn't need to be closed even if called with a context without a span.

src/query/api/v1/handler/prometheus/native/read_common.go

src/query/api/v1/httpd/handler.go

arnikola · 2019-02-21T20:10:17Z

docs/operational_guide/monitoring.md

+insight into query performance and errors.
+
+### Configuration
+Currently, only https://www.jaegertracing.io/ is supported as a backend.


nit: maybe only [Jaeger](https://www.jaegertracing.io/) is supported as a tracing backend.

docs/operational_guide/monitoring.md

arnikola · 2019-02-21T20:23:34Z

src/query/util/opentracing/context.go

+		return sp
+	}
+
+	return opentracing.StartSpan("SpanFromContextOrRoot - dummy")


Where does this string appear re: trace? Could be good to make it a constant if nothing else

Ah yeah good question. So in properly configured production usage, this branch will never be hit; the general expectation is that functions calling this aren't at the root level, and will already have a span passed in. Having the dummy span case instead of panicking is basically a convenience for tests and callers who don't really care about tracing. Does that make sense?

One other thing I could do is have this function take in the root name as an argument instead--that might help folks track down any misconfigured call paths if they're hooking up tracing.

src/query/util/logging/log.go

arnikola · 2019-02-21T20:27:35Z

src/query/util/httperrors/errors.go

+	RqID string `json:"rqID"`
+}
+
+// ErrorWithReqID writes an xhttp.ErrorResponse with an added request id (RqId) field read from the request


nit1: could be better to call it ErrorWithReq rather than specifying it must be a reqID?

nit2: Split the NB out

// context. // // NB: ...

r.e. 1--sure, changed it to ErrorWithReqInfo (more generic + still accurate)

nit2: Split the NB out

Sure, done.

arnikola · 2019-02-21T20:33:43Z

docs/operational_guide/monitoring.md

+
+## Logs
+
+TODO


Probably better to remove these?

Eh I don't see much harm in having stubs. I switched it to "TODO: document how to retrieve metrics for M3DB components." so that people don't think the metrics themselves are TODO.

benraskin92 · 2019-02-21T22:27:34Z

src/query/api/v1/httpd/handler.go

@@ -139,6 +136,23 @@ func NewHandler(
 	return h, nil
 }

+func applyMiddleware(base http.Handler) http.Handler {
+	rtn := http.Handler(&cors.Handler{


Better var name?

src/query/api/v1/httpd/handler.go

src/query/util/opentracing/context.go

src/query/util/opentracing/context_test.go

src/query/util/httperrors/errors.go

andrewmains12 requested review from nikunjgit, benraskin92 and arnikola January 28, 2019 17:58

andrewmains12 force-pushed the andrewmains12/tracing branch 2 times, most recently from d17e4a7 to 4ca756b Compare January 28, 2019 21:21