contrib/go-redis/redis.v8: Allow Option To Remove Raw Tag #1022

sjmtan · 2021-09-29T16:53:11Z

I believe there's a case where if the command is sufficiently large, this will cause serious memory issues on the client. Not only that, I believe this should be optional because in certain cases, this could be a security issue. Therefore, we should allow an option to remove the raw tag.

To refresh, see snippet from the contrib hook. See the raw_command part.

	raw := cmd.String()
	length := strings.Count(raw, " ")
	p := ddh.params
	opts := make([]ddtrace.StartSpanOption, 0, 5+len(ddh.additionalTags)+1) // 5 options below + for additionalTags + for analyticsRate
	opts = append(opts,
		tracer.SpanType(ext.SpanTypeRedis),
		tracer.ServiceName(p.config.serviceName),
		tracer.ResourceName(raw[:strings.IndexByte(raw, ' ')]),
		tracer.Tag("redis.raw_command", raw),
		tracer.Tag("redis.args_length", strconv.Itoa(length)),
	)

The implementation of cmd.String() (I believe) boils down to this. You can see that it contains args.

func cmdString(cmd Cmder, val interface{}) string {
	b := make([]byte, 0, 64)

	for i, arg := range cmd.Args() {
		if i > 0 {
			b = append(b, ' ')
		}
		b = internal.AppendArg(b, arg)
	}

	if err := cmd.Err(); err != nil {
		b = append(b, ": "...)
		b = append(b, err.Error()...)
	} else if val != nil {
		b = append(b, ": "...)
		b = internal.AppendArg(b, val)
	}

	return internal.String(b)
}

Interestingly, there's been a similar feature request and subsequent revert for memcache:

See request in contrib/bradfitz/gomemcache: trace the processing keys #640. I think a bit indicative that the issue reporter blacked out the raw command in the screenshot + implemented it in a way that replicates that behavior in contrib/bradfitz/gomemcache: trace item info for memcached operations #642.
It was reverted in Revert "contrib/bradfitz/gomemcache: trace item info for memcached operations" #664 for reasons that I believe would apply here as well.

We've been able to isolate the problem to tracing by removing Redis tracing from the codepath that OOM'ed our servers without issue. But we haven't been able to dive into it in a more granular fashion than that. Happy to be told I'm wrong.

The text was updated successfully, but these errors were encountered:

sjmtan · 2021-10-07T22:25:16Z

@knusbaum, @felixge - only raising because this actually could produce security issues for customers.

gbbr · 2021-11-05T09:18:51Z

Sorry for the late reply. This is fine. Thanks for opening the PR.

gbbr · 2021-11-05T09:23:11Z

Can you explain in more detail how this causes problems? The agent quantizes and obfuscates these values, so ultimately a command will look exactly like this (including the ellipsis): SET .... Everything is truncated in the resource, and the redis.raw_command tag is obfuscated to eliminate all sensitive information. Is this not the case for you? What is the security issue you are seeing?

Regarding long commands being a problem: the only place where a "long" command would travel is locally between the tracing client (app) and the agent. Is that an issue? Or, is the problem memory usage in the application because of this string? Can you share a snippet of code? It might be that we can obtain the span from context and change it's raw command or resource today, programmatically, without introducing new options and having you wait for a new release.

Regardless, it is fine if you want to skip the setting of this tag by means of an option, but I don't think we should skip the resource name. If we do that, everything will just be called redis.command and I'm unsure how useful that will be for you when looking at data.

gbbr · 2021-11-08T06:29:59Z

From @shannontan-bolt in #1026:

The real issue for us was that we were trying to save an object of data in a Redis cache and that meant this large blob was putting put as a tag with tracing.

Let's solve this issue then, if this is the real issue. First, I'd like to understand what about it is the issue:

Is it your concern about sensitive data? In which case, it shouldn't be because this blob should get obfuscated by the agent. You should've already noticed if you looked at this data in Datadog. Is this not the case? If not, that's a bug in the agent and we can fix it after some debugging. Basically, a command like SET key 123 should show up as SET key ? in the UI.
Is your concern that this tag name is too big? Does it cause any performance problems like memory usage?

Let's better define the issue and then we can find the right solution.

sjmtan · 2021-11-08T06:36:50Z

From @shannontan-bolt in #1026:

The real issue for us was that we were trying to save an object of data in a Redis cache and that meant this large blob was putting put as a tag with tracing.

Let's solve this issue then, if this is the real issue. First, I'd like to understand what about it is the issue:

Is it your concern about sensitive data, in which case, it shouldn't be because this blob should get obfuscated by the agent. You should've already noticed if you looked at this data in Datadog. Is this not the case? If not, that's a bug in the agent and we can fix it after some debugging.

Is your concern that this tag name is too big? Does it cause any performance problems like memory usage?

Let's better define the issue and then we can find the right solution.

Both are/were issues.

I'm on my phone, apologies for the terseness. To be clear, if it is obfuscated for resource name, that's fine. I'm not sure how it would intelligently obfuscate based on the raw command, but if it does great.

The current release took down multiple instances with tracing on, and there were no memory issues once we disabled this specifically redis tracing.
I saw the presence of IP addresses within the Datadog UI because we use Redis for rate limiting by IP. I can imagine that if we were to rate limit by more sensitive data, this would be problematic.

So both are very real issues to me, and I've both disabled the tracing library and filtered out ingested spans to avoid any security issues. I only included the resource name change because it uses the same full command whereas I believe the operation name just uses the type of Redis command.

sjmtan · 2021-11-08T06:45:37Z

I didn't get notified for your comment a few days ago.

Within the UI, I suppose it is possible that the resource name is fine with the raw command, but I'm not certain now since I've had to disable the Redis trace. Maybe it was Redis.raw_command that was the source of both memory issues and presenting sensitive data.

I also can't provide a snippet but I would believe that a sufficiently large JSON should trigger this issue. Our code caches responses of variable size from an HTTP call we make to a vendor, so not really much to share other than these fit just fine within our Redis instance.

gbbr · 2021-11-08T07:02:45Z

Ok. Thanks for explaining. I am also sorry for taking so long to respond. I can assure you it was not intentional.

I'd really appreciate it if we could try a few things first so we can clearly define the issue before we start thinking about the right solution.

The resource name should be quantized in the Datadog Agent, basically if you have something like SET key value, the UI will only show SET .... Regardless of what the command is, you will only see the command name, followed by ellipsis (...). This works by default.

However, for the raw command (I was wrong here), it only gets obfuscated if this setting is enabled in the Datadog Agent. The config is:

apm_config:
  obfuscation:
    redis:
      enabled: true

Can you give this a try? Unfortunately this setting can not be applied via an env. var. right now, so I'm hoping you can alter the datadog.yaml setting of your Datadog Agent. If not, and you need an env. var., we could potentially include it into the next release which goes into RC testing next week.

That should resolve the security issue.

The current release took down multiple instances with tracing on, and there were no memory issues once we disabled this specifically redis tracing.

This is the concerning part that I'd like to better understand. I can only imagine this could've been due to memory usage. In which case, we could introduce something like you already have in your PR: DisableRawCommand setting which causes that tag to not be set, to resolve the redis.raw_command issue.

As for the resource, we could try something like WithResourceNamer, where there would be a callback intercepting the command and allowing it to be renamed before it gets assigned as a resource. There, you could do something like (naive example): strings.Split(cmd, " ")[0], getting only the command name. This will still allow you to get separation between various commands and also skip the long parts.

WDYT?

sjmtan · 2021-11-08T16:42:57Z

@gbbr, I can't test the resource name part without the change to the DisableRawCommand, at least not in any live env. I suppose I can make it work in a testing/local env but that requires work, so first I'd like to ask the following:

tracer.ResourceName(raw[:strings.IndexByte(raw, ' ')]), is already in the code snippet I linked above and is unchanged in my PR. Why do these two have different behaviors? They seem to update the same tag too.

gbbr · 2021-11-09T13:32:56Z

@gbbr, I can't test the resource name part without the change to the DisableRawCommand, at least not in any live env. I suppose I can make it work in a testing/local env but that requires work, so first I'd like to ask the following:

What I'm proposing is that your PR adds ~~two~~ one options:

DisableRawCommand which completely removes the setting of redis.raw_command
WithResourceNamer which takes a callback allowing to modify the resource that gets set, before it is set. Similar to the ones we already have in our other integrations (you can get inspired from any of them). Sorry, I just realised what you said. It seems like you don't need this, because only the first token (up to the first space) is taken.

I suspect this would solve the issue for you. What do you think?

Why do these two have different behaviors? They seem to update the same tag too.

They evolved differently as Datadog grew and developed as a product. We use the resource name as an aggregate as you may have already seen. To reduce cardinality and to make more sense, it was quantized by truncating everything after the command and adding the ellipsis (...). Later on, the need came for the raw command to be obfuscated due to security reasons, so we introduced that too, along with the intention to reveal more than just the command itself and only hide user data. This behaviour was not reproduced in the resource name because it would be considered a breaking change and we would have risked breaking time series and stats continuity for existing customers using these integrations. I hope this makes sense.

gbbr · 2021-11-09T13:36:23Z

Sorry if my previous answer is confusing, to summarise: I'm proposing to change your PR to only remove the raw command, but leave the resource as it is. Would that be fine for you?

sjmtan · 2021-11-26T20:10:16Z

This is still confusing :)

ResourceName tag appears to be set twice.

tracer.ResourceName is:

// ResourceName sets the given resource name on the started span. A resource could
// be an SQL query, a URL, an RPC method or something else.
func ResourceName(name string) StartSpanOption {
	return Tag(ext.ResourceName, name)
}

I think my end state here would be:

Only include the resource set in L141 since that only includes up to the first space. L144 already doesn't make sense as is.
Have an option to exclude the raw command

sjmtan · 2021-12-08T18:40:17Z

@gbbr, I reread your comment - maybe I'm confused because you edited your comment. Are you saying that we should still remove the second ResourceName set completely and use the flag as originally proposed on whether to send the raw command?

gbbr · 2021-12-09T10:20:51Z

ResourceName tag appears to be set twice.

Oh my! I think we've likely missed this and it is most definitely a (dangerous) error! I agree with your end state proposal. I think we're on the same page. To reiterate, what I'm understanding:

We remove L144
We add DisableRawCommand

IIUC, let's do it 🚀

sjmtan · 2021-12-10T01:59:13Z

I agree - would it be possible if you could put the PR together or edit my PR? I still find it a bit of struggle to get my local system working and it keeps insisting on a ton of changes to go.mod/sum + not running tests. I'm not sure why but don't quite have the time to address it whereas I'm sure it is trivial for you.

…urce This change adds a new option called `WithSkippedRawCommand` which determines the instrumentation code to skip adding the `redis.raw_command` tag to spans. This is useful in cases when the Datadog Agent Redis obfuscation is disabled (which it is, by default) and commands may contain sensitive information. Additionally, duplicate entries and an incosistency was found when setting the resource name for Redis spans, which was fixed as part of this PR. Closes #1022

gbbr · 2021-12-14T11:00:59Z

@shannontan-bolt please review it: #1091

…urce This change adds a new option called `WithSkippedRawCommand` which determines the instrumentation code to skip adding the `redis.raw_command` tag to spans. This is useful in cases when the Datadog Agent Redis obfuscation is disabled (which it is, by default) and commands may contain sensitive information. Additionally, duplicate entries and an incosistency was found when setting the resource name for Redis spans, which was fixed as part of this PR. Closes #1022

…urce (#1091) This change adds a new option called `WithSkippedRawCommand` which determines the instrumentation code to skip adding the `redis.raw_command` tag to spans. This is useful in cases when the Datadog Agent Redis obfuscation is disabled (which it is, by default) and commands may contain sensitive information. Additionally, duplicate entries and an incosistency was found when setting the resource name for Redis spans, which was fixed as part of this PR. Closes #1022

zball · 2023-02-08T21:10:29Z

@gbbr per this comment above: #1022 (comment)

Unfortunately this setting can not be applied via an env. var. right now, so I'm hoping you can alter the datadog.yaml setting of your Datadog Agent. If not, and you need an env. var., we could potentially include it into the next release which goes into RC testing next week.

Has an env var been added yet for this? I haven't yet been able to find one. If it has been added is it possible to point me in the direction for it?

sjmtan mentioned this issue Oct 8, 2021

contrib/go-redis/redis.v8: add Option to Exclude Raw Command #1026

Closed

gbbr mentioned this issue Dec 14, 2021

contrib/go-redis/redis.v8: add WithSkipRawCommand option and fix resource #1091

Merged

gbbr closed this as completed in #1091 Dec 16, 2021

This was referenced Aug 26, 2022

Redisotel should allow an option to not set the raw cmd as an attribute on spans redis/go-redis#2195

Closed

[Discussion] Enable the ability to filter attributes for use with third party libraries open-telemetry/opentelemetry-go#3123

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

contrib/go-redis/redis.v8: Allow Option To Remove Raw Tag #1022

contrib/go-redis/redis.v8: Allow Option To Remove Raw Tag #1022

sjmtan commented Sep 29, 2021 •

edited

Loading

sjmtan commented Oct 7, 2021

gbbr commented Nov 5, 2021

gbbr commented Nov 5, 2021 •

edited

Loading

gbbr commented Nov 8, 2021 •

edited

Loading

sjmtan commented Nov 8, 2021

sjmtan commented Nov 8, 2021

gbbr commented Nov 8, 2021

sjmtan commented Nov 8, 2021 •

edited

Loading

gbbr commented Nov 9, 2021 •

edited

Loading

gbbr commented Nov 9, 2021

sjmtan commented Nov 26, 2021

sjmtan commented Dec 8, 2021

gbbr commented Dec 9, 2021 •

edited

Loading

sjmtan commented Dec 10, 2021

gbbr commented Dec 14, 2021

zball commented Feb 8, 2023 •

edited

Loading

contrib/go-redis/redis.v8: Allow Option To Remove Raw Tag #1022

contrib/go-redis/redis.v8: Allow Option To Remove Raw Tag #1022

Comments

sjmtan commented Sep 29, 2021 • edited Loading

sjmtan commented Oct 7, 2021

gbbr commented Nov 5, 2021

gbbr commented Nov 5, 2021 • edited Loading

gbbr commented Nov 8, 2021 • edited Loading

sjmtan commented Nov 8, 2021

sjmtan commented Nov 8, 2021

gbbr commented Nov 8, 2021

sjmtan commented Nov 8, 2021 • edited Loading

gbbr commented Nov 9, 2021 • edited Loading

gbbr commented Nov 9, 2021

sjmtan commented Nov 26, 2021

sjmtan commented Dec 8, 2021

gbbr commented Dec 9, 2021 • edited Loading

sjmtan commented Dec 10, 2021

gbbr commented Dec 14, 2021

zball commented Feb 8, 2023 • edited Loading

sjmtan commented Sep 29, 2021 •

edited

Loading

gbbr commented Nov 5, 2021 •

edited

Loading

gbbr commented Nov 8, 2021 •

edited

Loading

sjmtan commented Nov 8, 2021 •

edited

Loading

gbbr commented Nov 9, 2021 •

edited

Loading

gbbr commented Dec 9, 2021 •

edited

Loading

zball commented Feb 8, 2023 •

edited

Loading