Fix the issue that we cannot modify apm_config.obfuscation.* through env vars #32318
Test changes on VM
Use this command from test-infra-definitions to manually test this PR's changes on a VM:
    inv aws.create-vm --pipeline-id=51758475 --os-family=ubuntu
Note: This applies to commit 83c4fc04
I don't think I understand what the issue was or how your changes fix it. Can you add some details to the PR description?
@pgimalac Thanks for reviewing this PR. I put the details here.
The goal is to solve the issue that we cannot change apm_config.obfuscation.* through env vars. For example, when we try to change apm_config.obfuscation.memcached.enabled and apm_config.obfuscation.http.remove_paths_with_digits in a container environment, we would set env vars like this:

    docker run --name datadog-agent -d --cgroupns host --pid host \
      -v /var/run/docker.sock:/var/run/docker.sock:ro \
      -v /proc/:/host/proc/:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
      -e DD_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
      -e DD_APM_OBFUSCATION_MEMCACHED_ENABLED=false \
      -e DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS=false \
      datadog/agent:7.60.0

Then we would check whether the env vars are reflected in the config. We would expect output like this:

    apm_config:
      obfuscation:
        memcached:
          enabled: false
        http:
          remove_paths_with_digits: false

BUT the actual output is like this:

    apm_config:
      obfuscation.memcached.enabled: "false"
      obfuscation.http.remove_paths_with_digits: "false"

To solve this issue, I opened this PR.
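The gap between the expected and actual outputs above comes down to whether a dotted key is expanded into nested maps or stored verbatim. The following is a minimal sketch of that difference, not the agent's actual config code; `setNested` is a hypothetical helper introduced only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// setNested is a hypothetical helper (not part of datadog-agent). It splits a
// dotted key such as "obfuscation.memcached.enabled" and stores the value in
// nested maps, creating intermediate maps as needed.
func setNested(root map[string]any, dottedKey string, value any) {
	parts := strings.Split(dottedKey, ".")
	node := root
	for _, p := range parts[:len(parts)-1] {
		child, ok := node[p].(map[string]any)
		if !ok {
			child = map[string]any{}
			node[p] = child
		}
		node = child
	}
	node[parts[len(parts)-1]] = value
}

func main() {
	// Flat form: the dotted key is stored verbatim with a string value,
	// which corresponds to the unexpected output.
	flat := map[string]any{"obfuscation.memcached.enabled": "false"}

	// Nested form: the dotted key is expanded into a hierarchy,
	// which corresponds to the expected output.
	nested := map[string]any{}
	setNested(nested, "obfuscation.memcached.enabled", false)
	setNested(nested, "obfuscation.http.remove_paths_with_digits", false)

	fmt.Println(flat)
	fmt.Println(nested)
}
```

In the flat form, a consumer that looks up `obfuscation` and then `memcached` finds nothing, which is consistent with the symptom described in this comment.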
Thanks for the PR, we'll look into the bug in the config.
Regression Detector Results
Baseline: a6860d5
Optimization Goals: ✅ No significant changes detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | +1.62 | [+0.93, +2.31] | 1 | Logs |
➖ | tcp_syslog_to_blackhole | ingress throughput | +0.32 | [+0.26, +0.39] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http2 | egress throughput | +0.20 | [-0.71, +1.11] | 1 | Logs |
➖ | file_to_blackhole_100ms_latency | egress throughput | +0.19 | [-0.52, +0.90] | 1 | Logs |
➖ | file_to_blackhole_500ms_latency | egress throughput | +0.16 | [-0.61, +0.93] | 1 | Logs |
➖ | file_tree | memory utilization | +0.13 | [+0.01, +0.25] | 1 | Logs |
➖ | quality_gate_idle_all_features | memory utilization | +0.11 | [+0.03, +0.19] | 1 | Logs bounds checks dashboard |
➖ | file_to_blackhole_1000ms_latency_linear_load | egress throughput | +0.10 | [-0.37, +0.57] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency | egress throughput | +0.10 | [-0.80, +1.00] | 1 | Logs |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | -0.00 | [-0.02, +0.01] | 1 | Logs |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.01 | [-0.12, +0.10] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http1 | egress throughput | -0.02 | [-0.87, +0.82] | 1 | Logs |
➖ | file_to_blackhole_300ms_latency | egress throughput | -0.04 | [-0.68, +0.61] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency | egress throughput | -0.21 | [-0.99, +0.58] | 1 | Logs |
➖ | quality_gate_idle | memory utilization | -0.87 | [-0.90, -0.84] | 1 | Logs bounds checks dashboard |
➖ | quality_gate_logs | % cpu utilization | -1.30 | [-4.48, +1.89] | 1 | Logs |
Bounds Checks: ✅ Passed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
✅ | file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency_linear_load | memory_usage | 10/10 | |
✅ | file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_300ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_300ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_500ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
✅ | quality_gate_idle | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_logs | lost_bytes | 10/10 | |
✅ | quality_gate_logs | memory_usage | 10/10 |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide that a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
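The three criteria above can be sketched as a small predicate. This is an illustration only, not the Regression Detector's implementation: the `Experiment` type and `isRegression` function are hypothetical names, and the `Erratic` flag is assumed to be false for the sample rows because the table does not report it.

```go
package main

import (
	"fmt"
	"math"
)

// Experiment holds one row of the results table. The type and its field
// names are illustrative, not taken from the Regression Detector.
type Experiment struct {
	Name          string
	DeltaMeanPct  float64 // "Δ mean %"
	CILow, CIHigh float64 // bounds of "Δ mean % CI"
	Erratic       bool    // assumed false below; not shown in the table
}

// effectSizeTolerance is the 5.00% threshold stated in the explanation.
const effectSizeTolerance = 5.0

// isRegression applies the three stated criteria: the effect is large
// enough, the confidence interval excludes zero, and the experiment is
// not marked erratic.
func isRegression(e Experiment) bool {
	bigEnough := math.Abs(e.DeltaMeanPct) >= effectSizeTolerance
	ciExcludesZero := e.CILow > 0 || e.CIHigh < 0
	return bigEnough && ciExcludesZero && !e.Erratic
}

func main() {
	// Three rows copied from the table above.
	rows := []Experiment{
		{"uds_dogstatsd_to_api_cpu", 1.62, 0.93, 2.31, false},
		{"quality_gate_idle", -0.87, -0.90, -0.84, false},
		{"quality_gate_logs", -1.30, -4.48, 1.89, false},
	}
	for _, r := range rows {
		fmt.Printf("%s regression=%t\n", r.Name, isRegression(r))
	}
}
```

None of the sample rows qualifies: the first two have intervals that exclude zero but fall well under the 5.00% tolerance, and the third has an interval that contains zero.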
CI Pass/Fail Decision
✅ Passed. All Quality Gates passed.
- quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
Overall this change looks fine to me. If I understand the issue correctly, is this env var configuration not working an issue affecting traces, such that they are still being obfuscated when you don't expect them to be? This is surprising to me, since it looks like the old code worked alright and I'm fairly sure I've used env vars to configure these in the past 🤔
Thank you for reviewing this PR! I am documenting the testing method and results here. I will add the Unit Test later.
Background: For investigation, I added debug logs that dump obfuscation config to the code before this PR (= main branch).
TL;DR: We can modify obfuscation config through env vars for trace-agent, but not core-agent.
git diff
git diff HEAD~1 | cat -
diff --git a/comp/trace/config/setup.go b/comp/trace/config/setup.go
index 06a157dc64..0c26271855 100644
--- a/comp/trace/config/setup.go
+++ b/comp/trace/config/setup.go
@@ -508,6 +508,7 @@ func applyDatadogConfig(c *config.AgentConfig, core corecompcfg.Component) error
if pkgconfigsetup.Datadog().IsSet("apm_config.obfuscation.cache.enabled") {
c.Obfuscation.Cache.Enabled = pkgconfigsetup.Datadog().GetBool("apm_config.obfuscation.cache.enabled")
}
+ log.Debugf("obfuscation config: %#v", c.Obfuscation)
}
if core.IsSet("apm_config.filter_tags.require") {
diff --git a/pkg/collector/python/datadog_agent.go b/pkg/collector/python/datadog_agent.go
index 95a0b2f69f..af02378d79 100644
--- a/pkg/collector/python/datadog_agent.go
+++ b/pkg/collector/python/datadog_agent.go
@@ -258,6 +258,7 @@ func lazyInitObfuscator() *obfuscate.Obfuscator {
log.Errorf("Failed to unmarshal apm_config.obfuscation: %s", err.Error())
cfg = obfuscate.Config{}
}
+ log.Debugf("obfuscation config: %#v", cfg)
if !cfg.SQLExecPlan.Enabled {
cfg.SQLExecPlan = defaultSQLPlanObfuscateSettings
}
diff --git a/pkg/trace/agent/obfuscate.go b/pkg/trace/agent/obfuscate.go
index 25042e3fd6..228c2433f3 100644
--- a/pkg/trace/agent/obfuscate.go
+++ b/pkg/trace/agent/obfuscate.go
@@ -138,6 +138,7 @@ func (a *Agent) lazyInitObfuscator() *obfuscate.Obfuscator {
if a.obfuscator == nil {
if a.obfuscatorConf != nil {
+ log.Debugf("obfuscation config: %#v", *a.obfuscatorConf)
a.obfuscator = obfuscate.NewObfuscator(*a.obfuscatorConf)
} else {
a.obfuscator = obfuscate.NewObfuscator(obfuscate.Config{})

Then, I checked the debug logs with the following containers.
docker-compose.yaml
services:
  agent:
    container_name: datadog-agent
    image: datadog/agent-dev:keisku-apm-config-obfuscation-investigation-py3
    volumes:
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
      - /proc/:/host/proc/:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/cloud/data/instance-id:/var/lib/cloud/data/instance-id:ro
    environment:
      DD_LOG_LEVEL: debug
      DD_HOSTNAME_FILE: /var/lib/cloud/data/instance-id
      DD_APM_OBFUSCATION_MEMCACHED_ENABLED: false
      DD_APM_OBFUSCATION_CACHE_ENABLED: true
      DD_APM_OBFUSCATION_SQL_EXEC_PLAN_ENABLED: true
      DD_APM_OBFUSCATION_SQL_EXEC_PLAN_KEEP_VALUES: value1 value2 value3
    env_file:
      - ~/sandbox.docker.env
  mysql:
    container_name: mysql
    image: mysql:8
    restart: always
    labels:
      com.datadoghq.ad.checks: |
        {
          "mysql": {
            "instances": [
              {
                "host": "%%host%%",
                "port": "3306",
                "username": "datadog",
                "password": "datadog",
                "dbm": "true"
              }
            ]
          }
        }
    environment:
      MYSQL_ROOT_PASSWORD: password
      MYSQL_USER: datadog
      MYSQL_PASSWORD: datadog
    volumes:
      - ./init.sh:/docker-entrypoint-initdb.d/init.sh

Result: we can modify the obfuscation config through env vars for trace-agent, but not core-agent. Based on this test case, I am sure env vars don't change the obfuscation config for CORE.
docker logs datadog-agent | grep 'obfuscation\ config\:'
2024-12-21 09:49:38 UTC | TRACE | DEBUG | (comp/trace/config/setup.go:511 in applyDatadogConfig) | obfuscation config: &config.ObfuscationConfig{ES:obfuscate.JSONConfig{Enabled:true, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, OpenSearch:obfuscate.JSONConfig{Enabled:true, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, Mongo:obfuscate.JSONConfig{Enabled:true, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, SQLExecPlan:obfuscate.JSONConfig{Enabled:true, KeepValues:[]string{"value1", "value2", "value3"}, ObfuscateSQLValues:[]string(nil)}, SQLExecPlanNormalize:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, HTTP:obfuscate.HTTPConfig{RemoveQueryString:false, RemovePathDigits:false}, RemoveStackTraces:false, Redis:obfuscate.RedisConfig{Enabled:true, RemoveAllArgs:false}, Memcached:obfuscate.MemcachedConfig{Enabled:false, KeepCommand:false}, CreditCards:obfuscate.CreditCardsConfig{Enabled:true, Luhn:false, KeepValues:[]string(nil)}, Cache:obfuscate.CacheConfig{Enabled:true}}
2024-12-21 09:49:42 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:261 in func1) | obfuscation config: obfuscate.Config{SQL:obfuscate.SQLConfig{DBMS:"", TableNames:false, CollectCommands:false, CollectComments:false, CollectProcedures:false, ReplaceDigits:false, KeepSQLAlias:false, DollarQuotedFunc:false, ObfuscationMode:"", RemoveSpaceBetweenParentheses:false, KeepNull:false, KeepBoolean:false, KeepPositionalParameter:false, KeepTrailingSemicolon:false, KeepIdentifierQuotation:false, KeepJSONPath:false, Cache:false}, ES:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, OpenSearch:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, Mongo:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, SQLExecPlan:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, SQLExecPlanNormalize:obfuscate.JSONConfig{Enabled:false, KeepValues:[]string(nil), ObfuscateSQLValues:[]string(nil)}, HTTP:obfuscate.HTTPConfig{RemoveQueryString:false, RemovePathDigits:false}, Redis:obfuscate.RedisConfig{Enabled:false, RemoveAllArgs:false}, Memcached:obfuscate.MemcachedConfig{Enabled:false, KeepCommand:false}, CreditCard:obfuscate.CreditCardsConfig{Enabled:false, Luhn:false, KeepValues:[]string(nil)}, Statsd:obfuscate.StatsClient(nil), Logger:obfuscate.Logger(nil), Cache:obfuscate.CacheConfig{Enabled:false}} |
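In the TRACE log line above, the space-separated value of DD_APM_OBFUSCATION_SQL_EXEC_PLAN_KEEP_VALUES shows up as a three-element string slice. One way to obtain that shape is to split on whitespace, as in the sketch below. This is an assumption about the behavior, not a copy of the agent's parsing code, and `splitKeepValues` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKeepValues is a hypothetical helper (not from datadog-agent). It
// splits a raw env var value on runs of whitespace into individual entries.
func splitKeepValues(raw string) []string {
	return strings.Fields(raw)
}

func main() {
	// Same value as in the docker-compose environment above.
	values := splitKeepValues("value1 value2 value3")
	fmt.Printf("%#v\n", values) // prints []string{"value1", "value2", "value3"}
}
```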
looks good, thanks for the added details and tests here!
I am going to update this branch.
if len(obfuscaterConfig.Mongo.KeepValues) == 0 {
    obfuscaterConfig.Mongo.KeepValues = defaultMongoObfuscateSettings.KeepValues
Before this PR, cfg here does not reflect the default values, because the DD_APM_OBFUSCATION_* settings are not bound to default values. This is the issue that I want to solve with this PR.
After this PR, cfg.Mongo.Enabled is true, since DD_APM_OBFUSCATION_MONGODB_ENABLED is bound to true. This follows this documentation.
datadog-agent/pkg/config/config_template.yaml
Lines 1212 to 1215 in 62c67db
    # mongodb:
    ## @param DD_APM_OBFUSCATION_MONGODB_ENABLED - boolean - optional
    ## Enables obfuscation rules for spans of type "mongodb". Enabled by default.
    # enabled: true
So, we need this change to keep backward compatibility.
cfg.SQLExecPlan.Enabled and cfg.SQLExecPlanNormalize.Enabled are false by default because DD_APM_OBFUSCATION_SQL_EXEC_PLAN_ENABLED and DD_APM_OBFUSCATION_SQL_EXEC_PLAN_NORMALIZE_ENABLED are bound to false. Therefore, we can keep backward compatibility without code change here.
What does this PR do?
Fix the issue that we cannot modify apm_config.obfuscation.* through env vars. Set default values based on this file.
datadog-agent/pkg/config/config_template.yaml
Lines 1159 to 1267 in 26052f9
Motivation
There’s no special meaning behind choosing apm_config.obfuscation.memcached.enabled and apm_config.obfuscation.http.remove_paths_with_digits. They’re just examples.
For example, when we try to change apm_config.obfuscation.memcached.enabled and apm_config.obfuscation.http.remove_paths_with_digits in a container environment, we would set env vars such as DD_APM_OBFUSCATION_MEMCACHED_ENABLED=false and DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS=false.
The following output is not expected. It should be a hierarchical structure.
obfuscation.memcached.enabled: "false"
obfuscation.http.remove_paths_with_digits: "false"
Workaround
Change configs through a static file.