Skip to content
This repository has been archived by the owner on Aug 30, 2019. It is now read-only.

internal/api: submit service default rates in /v0.4 endpoint #546

Merged
merged 7 commits into from
Dec 7, 2018

Conversation

gbbr
Copy link
Contributor

@gbbr gbbr commented Dec 5, 2018

This change fixes a problem where integrations have trouble matching
against the map of rates sent by the agent. For example, when a trace
has no "env" tag set, the agent will use the default env (which is
"none" or whatever is in "datadog.yaml"). Tracers have no knowledge of
this default so they will not be able to match against the map of rates.

To further explain, if the agent sends the map:

service:mongo,env:none => 0.5

A tracer will try to use the key service:mongo,env: because it will
have no env information, failing to match against the map.

Using this change, tracers can use correct defaults.

@gbbr gbbr added this to the 6.9.0 milestone Dec 5, 2018
@gbbr gbbr requested review from palazzem and LotharSee December 5, 2018 17:18
@gbbr
Copy link
Contributor Author

gbbr commented Dec 5, 2018

@palazzem this requires changes in tracers to fix the problem. Tracers need to be updated such that when no env. setting is found in tags, they will use this value to find the rate key.

Copy link
Member

@rochdev rochdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pushes complexity in all tracers for a simple fix in the agent: send the default rate in service:my-service,env:.

@gbbr
Copy link
Contributor Author

gbbr commented Dec 5, 2018

That could be one solution but would involve sending duplicates for each service. Not sure if better. Seems a bit hacky to me. I’m not against other solutions but maybe we can find a better one.

@rochdev
Copy link
Member

rochdev commented Dec 5, 2018

I tend to consider these entries "yet another env entry" instead of seeing them as a duplicate. If the default would not be configurable, we would have to send this value anyway. This is an implementation detail of the backend.

@gbbr
Copy link
Contributor Author

gbbr commented Dec 5, 2018

I’m fine with either. I like yours because it requires no changes to the tracers but the down side is that the duplication is confusing, it adds network overhead (not sure if this is a real concern) and the implementation in the agent would not be very clean.

I’d like to hear from the other reviewers too.

@willgittoes-dd
Copy link

I like this, although updating all the tracers (that have priority sampling implemented) will be a minor hassle. I feel like it is more explicit, but I have another reason too. Currently, there is no uniform way of setting the default env for each tracer. Sometimes it can be set by tag on each trace, by environment variable or config file.

It would be nice if the user just puts it in their datadog.yaml. Then we can move something that previously would have to be in N number of configuration files/locations, and instead it just goes in one.

@rochdev
Copy link
Member

rochdev commented Dec 5, 2018

@willgittoes-dd Why do you want to know the default env at all? Not sending an env and letting the agent/backend figure it out is way safer.

@willgittoes-dd
Copy link

@rochdev (one day I hope Github adds threaded messaging here)

Thanks for responding, I think your comment makes it easier to understand for me :) It pushes me to care less which of these two solutions we pick.

What happens when a tracer creates traces with no/default env, and also receives propagated traces that also have no env? It seems like the local-root and propagated traces should both have the same env, but maybe not because the originating tracer for the propagated traces might be in a different env. Would it be better to have explicit envs everywhere, and "default" is just what is explicitly assigned by default?

@rochdev
Copy link
Member

rochdev commented Dec 5, 2018

@willgittoes-dd Not sure I understand. Your concern is basically cross-env requests? I'm not sure this would work well either way, unless the env is only used from the top level span.

@willgittoes-dd
Copy link

@rochdev is cross-env requests aren't possible, then why is env even in the sample-rate map? If a tracer/agent pair can assume the env is always the same, then perhaps we could remove 'env' from the map entirely?

@LotharSee
Copy link

LotharSee commented Dec 5, 2018

@willgittoes-dd A single Agent can receive traces from different env (where the env is set as a span tag).

I agree with @rochdev 's idea to simply duplicate the entries: also return service:A,env: with the same value as service:A,env:%DEFAULT_ENV. It is both legitimate (fair to cover both cases) and nicer as it requires no Agent tracer changes.

@gbbr gbbr force-pushed the gbbr/default-env branch from e60012d to 1de6189 Compare December 6, 2018 09:54
@gbbr
Copy link
Contributor Author

gbbr commented Dec 6, 2018

I've implemented it as suggested. It needed a small refactor to avoid hacky assumption making string matching. PTAL.

@gbbr gbbr changed the title internal/api: submit default environment in /v0.4 endpoint internal/api: submit service default rates in /v0.4 endpoint Dec 6, 2018
@gbbr
Copy link
Contributor Author

gbbr commented Dec 6, 2018

Benchmark results for GetAll (pretty much no change):

name                      old time/op    new time/op    delta
RateByService/GetAll/1-4     225ns ± 0%     269ns ± 0%  +19.56%
RateByService/GetAll/2-4     273ns ± 0%     294ns ± 0%   +7.69%
RateByService/GetAll/3-4     288ns ± 0%     321ns ± 0%  +11.46%
RateByService/GetAll/4-4     338ns ± 0%     371ns ± 0%   +9.76%
RateByService/GetAll/5-4     351ns ± 0%     417ns ± 0%  +18.80%
RateByService/GetAll/6-4     370ns ± 0%     631ns ± 0%  +70.54%

name                      old alloc/op   new alloc/op   delta
RateByService/GetAll/1-4      256B ± 0%      256B ± 0%    0.00%
RateByService/GetAll/2-4      256B ± 0%      256B ± 0%    0.00%
RateByService/GetAll/3-4      256B ± 0%      256B ± 0%    0.00%
RateByService/GetAll/4-4      256B ± 0%      256B ± 0%    0.00%
RateByService/GetAll/5-4      256B ± 0%      256B ± 0%    0.00%
RateByService/GetAll/6-4      256B ± 0%      464B ± 0%  +81.25%

name                      old allocs/op  new allocs/op  delta
RateByService/GetAll/1-4      2.00 ± 0%      2.00 ± 0%    0.00%
RateByService/GetAll/2-4      2.00 ± 0%      2.00 ± 0%    0.00%
RateByService/GetAll/3-4      2.00 ± 0%      2.00 ± 0%    0.00%
RateByService/GetAll/4-4      2.00 ± 0%      2.00 ± 0%    0.00%
RateByService/GetAll/5-4      2.00 ± 0%      2.00 ± 0%    0.00%
RateByService/GetAll/6-4      2.00 ± 0%      2.00 ± 0%    0.00%

Benchmark results for SetAll (insignificantly slower due to converting ServiceSignature to string):

name                      old time/op    new time/op    delta
RateByService/SetAll/1-4     199ns ± 0%     227ns ± 0%  +14.07%
RateByService/SetAll/2-4     448ns ± 0%     430ns ± 0%   -4.02%
RateByService/SetAll/3-4     643ns ± 0%     639ns ± 0%   -0.62%
RateByService/SetAll/4-4     748ns ± 0%     656ns ± 0%  -12.30%
RateByService/SetAll/5-4     966ns ± 0%     958ns ± 0%   -0.83%
RateByService/SetAll/6-4    1.05µs ± 0%    1.10µs ± 0%   +4.08%

name                      old alloc/op   new alloc/op   delta
RateByService/SetAll/1-4     0.00B         32.00B ± 0%    +Inf%
RateByService/SetAll/2-4     0.00B         96.00B ± 0%    +Inf%
RateByService/SetAll/3-4     0.00B        128.00B ± 0%    +Inf%
RateByService/SetAll/4-4     0.00B        144.00B ± 0%    +Inf%
RateByService/SetAll/5-4     0.00B        224.00B ± 0%    +Inf%
RateByService/SetAll/6-4     0.00B        256.00B ± 0%    +Inf%

name                      old allocs/op  new allocs/op  delta
RateByService/SetAll/1-4      0.00           1.00 ± 0%    +Inf%
RateByService/SetAll/2-4      0.00           3.00 ± 0%    +Inf%
RateByService/SetAll/3-4      0.00           5.00 ± 0%    +Inf%
RateByService/SetAll/4-4      0.00           5.00 ± 0%    +Inf%
RateByService/SetAll/5-4      0.00           8.00 ± 0%    +Inf%
RateByService/SetAll/6-4      0.00           9.00 ± 0%    +Inf%

Let me know what you think. Number after benchmark represents entry count in returned (or set) map.

@gbbr
Copy link
Contributor Author

gbbr commented Dec 6, 2018

@LotharSee PTAL. Note that I've also edited your last comment (hope that was on spot).

@rochdev
Copy link
Member

rochdev commented Dec 6, 2018

@willgittoes-dd From the explanation above from @LotharSee, I think the idea is that the service will end up in its own environment if it's different than the root. It makes sense from a tracing perspective, but not sure how it would affect the service map. Please correct me if my understanding is wrong.

Copy link
Member

@rochdev rochdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Someone with more Go knowledge should definitely review as well.

@gbbr gbbr merged commit 36d48d1 into master Dec 7, 2018
@gbbr gbbr deleted the gbbr/default-env branch December 7, 2018 11:57
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants