Send telemetry events when a cluster agent mutates a remote config #17663

Merged
186 commits
e47b9e3
fix windows nanoserver crash on glog v1.1.x (#17340)
paulcacheux Jun 2, 2023
9ce4675
[CWS] fix activity tree for busybox utils (#17415)
YoannGh Jun 2, 2023
32a7ba9
Fix reporting of conflicting telemetry metrics (#17417)
vickenty Jun 2, 2023
3ad4612
Update last stable version to 7.44.1 (#17438)
Kaderinho Jun 2, 2023
c23dc78
update packages to fix vulnerabilities in dependencies (#17418)
AliDatadog Jun 2, 2023
3723c72
do not use reflection for shallow copy (#17421)
knusbaum Jun 2, 2023
2ecf2ae
fix auto multi-line integration config (#17447)
gh123man Jun 4, 2023
0c98c9c
Update release.json and Go modules for 6/7.46.0-rc.2 (#17452)
kacper-murzyn Jun 4, 2023
14234aa
[CWS] reset events_stats to a PERCPU_ARRAY instead of a HASHMAP (#17473)
Gui774ume Jun 6, 2023
37df2b8
Bump ncurses to 6.4 to fix CVE-2023-29491 (#17493)
amenasria Jun 6, 2023
7be2f31
Kacper murzyn/7.45.0 changelog backport (#17489)
kacper-murzyn Jun 6, 2023
8e96de8
Update latest stable agent version to 7.45.0 (#17491)
kacper-murzyn Jun 6, 2023
85c58dc
fix subscriptionId fetching on azure (#17495)
paulcacheux Jun 6, 2023
baf054a
[SBOM] Remove `DeleteBlobs` from the sbom cache (#17465)
AliDatadog Jun 7, 2023
3df6d83
[CWS] fix duration suffix parsing (#17476)
paulcacheux Jun 7, 2023
3d3ca12
convert remaining users of old `golang-lru` to new generics based ver…
paulcacheux Jun 7, 2023
f947780
[CWS] pre-alloc msg tags (#17434)
paulcacheux Jun 7, 2023
1ab475f
silence error log about `DD_API_KEY` in internal profiler (#17371)
paulcacheux Jun 7, 2023
f81e4cd
Bump golang.org/x/sys from 0.3.0 to 0.8.0 in /pkg/gohai (#17106)
dependabot[bot] Jun 7, 2023
2e9f9b0
[Gohai] Add common elements of the future new API (#17221)
pgimalac Jun 7, 2023
6c439d8
CWS: sync BTFhub constants (#17498)
github-actions[bot] Jun 7, 2023
ba710b2
Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl (#1…
dependabot[bot] Jun 7, 2023
9115720
Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 in /pkg/security…
dependabot[bot] Jun 7, 2023
50c85b6
Bump requests from 2.30.0 to 2.31.0 in /test/e2e/cws-tests (#17428)
dependabot[bot] Jun 7, 2023
37496bd
Bump docker from 6.1.2 to 6.1.3 in /test/e2e/cws-tests (#17427)
dependabot[bot] Jun 7, 2023
9eb0602
Bump datadog-api-client from 2.12.0 to 2.13.1 in /test/e2e/cws-tests …
dependabot[bot] Jun 7, 2023
27585c5
[system-probe] only increment unregisters metric if delete actually o…
akarpz Jun 7, 2023
41fe369
Bump github.com/prometheus/procfs from 0.10.0 to 0.10.1 (#17347)
dependabot[bot] Jun 7, 2023
04562ae
Fix duplicate prebuilt module in use during tests (#17472)
brycekahle Jun 7, 2023
7061f42
Add way to log trace_pipe from tests (#17339)
brycekahle Jun 7, 2023
3b6621c
Bump github.com/vektra/mockery/v2 from 2.26.1 to 2.28.1 in /internal/…
dependabot[bot] Jun 7, 2023
e83efae
[CWS][SEC-3735] Check self tests results in e2e tests (#17387)
mftoure Jun 7, 2023
e7ed1d5
[CSPM] Resolve process env variables only if required (#17461)
jinroh Jun 7, 2023
4723d21
[system-probe] Handle/reduce stat cookie collisions (#17197)
hmahmood Jun 7, 2023
aad63bc
[system-probe] Add internal_profiling.delta_profiles option to system…
hmahmood Jun 7, 2023
556361d
[CSPM] Fix flakyness of TestProcessInput/Sleeps (#17399)
jinroh Jun 7, 2023
a30f9af
system-probe: Remove redundant call for IsAdjusted (#17345)
guyarb Jun 7, 2023
eb0d828
npm: Remove connection entry from tcpStats map if the connection is T…
guyarb Jun 7, 2023
d03351c
deprecate usm configuration values (#17216)
guyarb Jun 7, 2023
8dd7bc3
Cloud Service implementation for Azure App Service (#17483)
avedmala Jun 7, 2023
af9aca5
[CWS] avoid exec bomb (#17435)
safchain Jun 7, 2023
b679009
[CWS] fix process schema (#17422)
safchain Jun 7, 2023
b203c03
Bump github.com/open-policy-agent/opa from 0.53.0 to 0.53.1 (#17505)
dependabot[bot] Jun 7, 2023
82ac88d
[CSPM] Do not allow http.send and opa.runtime rego builtins (#17409)
jinroh Jun 7, 2023
769962c
Bump github.com/hashicorp/golang-lru/v2 from 2.0.2 to 2.0.3 (#17503)
dependabot[bot] Jun 7, 2023
8c758bf
[system-probe] Fix race in Stop() for tcp close consumer (#17511)
hmahmood Jun 7, 2023
d46892d
[e2e] target agent-sandbox account by default with e2e tests (#17484)
pducolin Jun 7, 2023
50a5623
typo (#17430)
modernplumbing Jun 7, 2023
661a20f
process-monitor: Change owner (#17510)
guyarb Jun 7, 2023
d4764da
npm: Spare copying of active connection twice (#17351)
guyarb Jun 7, 2023
aee3fed
process-monitor: Change loading order. (#17401)
guyarb Jun 7, 2023
0e4453e
[e2e] bump test-infra-definition to v0.0.0-20230607143804-fef23444c9d…
pducolin Jun 7, 2023
1db5090
npm: Remove redundant err return (#17520)
guyarb Jun 7, 2023
098e806
system-probe: Avoid unnecessary allocations for trace logs in hot-cod…
guyarb Jun 8, 2023
ec5212c
npm: Changed dns resolution to get a set of IPs rather than a list. (…
guyarb Jun 8, 2023
c85186e
Fix potentital use of uninitialized memory (#17490)
vickenty Jun 8, 2023
2d11f7b
Bump github.com/stretchr/testify from 1.8.2 to 1.8.4 in /pkg/gohai (#…
dependabot[bot] Jun 8, 2023
05b9829
allow snapshot to fail (#17386)
Gui774ume Jun 8, 2023
d33dc70
add JSON decoder for activity dumps (#17444)
Gui774ume Jun 8, 2023
5a063e7
add activity tree stats in activity dump list command (#17369)
Gui774ume Jun 8, 2023
8fbe997
fix secprofile unstable guards (#17509)
spikat Jun 8, 2023
0412fa5
use the remote storage from a command line (#17525)
Gui774ume Jun 8, 2023
846ba2b
Adding shared pool monitoring for Oracle databases (#17360)
nenadnoveljic Jun 8, 2023
97ce071
Adding more sysmetrics to Oracle monitoring (#17466)
nenadnoveljic Jun 8, 2023
98acb70
Revert "[CWS][SEC-3735] Check self tests results in e2e tests (#17387…
mftoure Jun 8, 2023
f4ed6fc
MetricSecurityProfileAnomalyDetectionGenerated tracks the number of g…
Gui774ume Jun 8, 2023
d23cd14
[CWS] fix race when playing snapshot process data (#17527)
YoannGh Jun 8, 2023
30faf6f
AP-2099 Prevent jobs that trigger child pipelines to download artefac…
chouetz Jun 8, 2023
d1cccaf
Fix broken loop (#17534)
amenasria Jun 8, 2023
ad72a19
Report conntrack ebpf module loading telemetry (#17539)
hmahmood Jun 8, 2023
91fbd33
[fakeintake] add godoc (#17474)
pducolin Jun 8, 2023
e79ba62
usm: process monitor: Call heavy operation only if needed (#17457)
guyarb Jun 8, 2023
ef820b7
Update java integration tests to use latest layers. (#17194)
purple4reina Jun 8, 2023
6e493bb
Add workaround for database connection loss (#17486)
nenadnoveljic Jun 9, 2023
0eba841
[CWS] remove load controller (#17220)
safchain Jun 9, 2023
a50b46f
[CWS] rework secprofile warmup tests (#17377)
spikat Jun 9, 2023
04b77a7
(rcm) simplify the RC thin client (#17468)
arbll Jun 9, 2023
46ab12a
CWS: sync BTFhub constants (#17550)
github-actions[bot] Jun 9, 2023
d71aaa4
https java tests use local https server (#17067)
nplanel Jun 9, 2023
334192c
[CWS] revert snapshot event playing (#17553)
paulcacheux Jun 9, 2023
3419bc6
deprecate more usm values (#17342)
guyarb Jun 9, 2023
09f72d2
Adds DD_RESOURCE_GROUP and DD_SUBSCRIPTION_ID to env vars (#17558)
IvanTopolcic Jun 9, 2023
2d93fc1
rtloader: Use execinfo only on glibc (#15256)
at-wat Jun 9, 2023
5c8ef80
Remove a no more used SBOM check config parameter (#17405)
L3n41c Jun 10, 2023
951af70
Adjust default value for Oracle check interval (#17551)
nenadnoveljic Jun 12, 2023
409a88a
Add new invoke task to test buildimage update (#17241)
chouetz Jun 12, 2023
ead21b2
Bump emoji from 2.2.0 to 2.4.0 in /test/e2e/cws-tests (#17425)
dependabot[bot] Jun 12, 2023
4df269d
Bump github.com/itchyny/gojq from 0.12.12 to 0.12.13 (#17442)
dependabot[bot] Jun 12, 2023
4709dcf
[CWS] remove unused arg from `fill_exec_context` (#17579)
paulcacheux Jun 12, 2023
1c735fe
chore(gohai): update gopsutil/v3 to 3.23.2 (#17500)
pgimalac Jun 12, 2023
6576f41
mount docker socket to dev container (#17385)
AliDatadog Jun 12, 2023
09b9e0d
add semver to requirements.txt (#17384)
AliDatadog Jun 12, 2023
5c3e0e7
[DCA][Autodiscovery] Add more context to error log (#17464)
AliDatadog Jun 12, 2023
f2afd95
USMO-259 - Support Java Async frameworks (#16346)
val06 Jun 12, 2023
b7c1e7b
[PROC-2913] Create protobuf definitions for process workload stream s…
just-chillin Jun 12, 2023
350a6f7
[RCM] Fix rc config deletion (#17581)
coignetp Jun 12, 2023
496c5cd
bump `ebpf-manager` to latest (#17585)
paulcacheux Jun 12, 2023
04eaeb0
[Gohai][ASC-471] implement cpu collection using sysctl syscall (#17556)
pgimalac Jun 12, 2023
7a3c8c3
Add tests to CI (#17541)
KevinFairise2 Jun 12, 2023
00458be
[USM] don't flood logs when a process is not java (#17590)
nplanel Jun 12, 2023
d54fc6c
Fix the formating for debug log in SetAgentMetadata (#17382)
ogaca-dd Jun 12, 2023
67531f2
[process-agent] Create WorkloadMetaExtractor v1 (#17448)
just-chillin Jun 12, 2023
ed7117b
[usm] Add ability to report payload telemetry (#17544)
p-lambert Jun 12, 2023
5e67642
Update the `test-infra-definitions` dependency in `test/new-e2e` (#17…
L3n41c Jun 12, 2023
bde9b66
Revert "[usm] Improve `incompleteBuffer` (#17164)" (#17593)
p-lambert Jun 12, 2023
14bd94d
DD_SERVICE_MAPPING in extension (#17189)
zARODz11z Jun 12, 2023
a2da46a
Improves python check docs to use virtualenv and sort out PYTHONPATH …
scottopell Jun 12, 2023
5bf403d
Bump github.com/hashicorp/golang-lru/v2 in /pkg/security/secl (#17599)
dependabot[bot] Jun 13, 2023
4dd3ab9
CWS: sync BTFhub constants (#17608)
github-actions[bot] Jun 13, 2023
4c91442
Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 (#17501)
Jun 13, 2023
e31680c
Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl (#17600)
dependabot[bot] Jun 13, 2023
a381ad6
Fix username generation on windows (#17547)
vboulineau Jun 13, 2023
42a742e
[USM] tests RunDockerServer/RunHostServer log pid (#17587)
nplanel Jun 13, 2023
76324ee
[pkg/netflow] Collect `flow_process_nf_errors_count` metric from gofl…
TCheruy Jun 13, 2023
6fd2c6e
[CWS] remove unused mount group id field (#17222)
YoannGh Jun 13, 2023
9359f56
[CWS] cgroup resolver: use hashmap instead of LRU to track workload P…
YoannGh Jun 13, 2023
d5bc5df
[AP-2139] Add amazonlinux2023 to the kitchen tests (#17548)
chouetz Jun 13, 2023
83605a5
[workloadmeta][process] Bootstrap process entities in workloadmeta (#…
AliDatadog Jun 13, 2023
3924bc4
[CSPM] Make sure we do not create zombie processes in our tests (#17609)
jinroh Jun 13, 2023
550167a
feat: support provisioned concurrency and proactive initialization (#…
astuyve Jun 13, 2023
d96677a
Bump snowflake-connector-python to 3.0.4 (#17445)
yzhan289 Jun 13, 2023
aff639f
[CWS] remote use of internal pointer (#16731)
safchain Jun 13, 2023
4f1370b
Include AAS metadata in span tags (#17591)
avedmala Jun 13, 2023
fa1dcac
[RCM] Add rc client in flare (#17094)
coignetp Jun 13, 2023
783a60e
[secrets][tests] properly reset secrets backend timeout after test (#…
pgimalac Jun 13, 2023
2f26f0e
[CWS] fix prerm scripts error logs (#17383)
spikat Jun 13, 2023
ab428da
Handle missing result json file (#17537)
iliakur Jun 13, 2023
b6a286a
[CWS] move arithmetic secl test to the secl package (#17610)
safchain Jun 13, 2023
468a903
[CWS] decouple a bit AD/Profile from probe (#17131)
safchain Jun 13, 2023
19818d6
[CWS] cleanup runner before running btfhub sync job (#17629)
paulcacheux Jun 14, 2023
adbe511
CWS: sync BTFhub constants (#17633)
github-actions[bot] Jun 14, 2023
13a61ef
[corechecks/snmp] Refactor Profile Config (#17618)
AlexandreYang Jun 14, 2023
82329a0
[CWS] rework secprofile tryAutolearn (#17535)
spikat Jun 14, 2023
6678980
[Fix] Agent version cache not correctly loaded in multiple CI jobs (#…
Pythyu Jun 14, 2023
c0aeb95
http2: remove packed enum values (#17586)
Yumasi Jun 14, 2023
1bffa4e
Make `nettop` available (#17458)
keisku Jun 14, 2023
aec3c81
[CWS] support kernel with usernamespaces arguments for security funct…
paulcacheux Jun 14, 2023
5bbb72d
[CWS] add unknown source for process entry (#17636)
safchain Jun 14, 2023
9e19e2c
[CWS] update fallback constants for recent kernels (#17639)
paulcacheux Jun 14, 2023
c35abde
move `kitchen_test_dummy_job_tmp` to k8s runners (#17641)
paulcacheux Jun 14, 2023
d89d1ae
[gitlab] Migration of unit tests CI jobs to k8s Gitlab runners (#17179)
KSerrania Jun 14, 2023
03ecfe9
[gitlab] Migrate docker publish jobs to k8s runners (#17270)
KSerrania Jun 14, 2023
0f702cb
Add mutex to runtime settings (#17640)
coignetp Jun 14, 2023
456b3ad
Process BTF archive nightly (#17621)
brycekahle Jun 14, 2023
2d013c2
Minor fixes to system-probe (#17622)
brycekahle Jun 14, 2023
02fa371
[CWS Agent] RC rules override local rules if IDs conflict (#17573)
modernplumbing Jun 14, 2023
563ae46
pkg/flare: add missing APM variables to envvars (#17597)
hannahkm Jun 14, 2023
65a17c4
[Serverless] Use prebuilt opentelemetry lambda layers in integration …
purple4reina Jun 14, 2023
73a073a
Add encoding info to tailer info for the agent status verbose page (#…
DDuongNguyen Jun 14, 2023
c856b5e
[usm] Intern Kafka topic names (#17648)
p-lambert Jun 14, 2023
a931b69
config/apm: fix parsing DD_APM_FEATURES (#17630)
ahmed-mez Jun 15, 2023
9705660
[CWS] do not handle broken lineage during snapshot (#17624)
safchain Jun 15, 2023
1ad8886
Revert "[CWS] do not handle broken lineage during snapshot (#17624)" …
paulcacheux Jun 15, 2023
6c3e85b
[CWS] Improve tryAutolearn unit tests by making fake events to have a…
spikat Jun 15, 2023
511932c
[CWS] fix overlayfs inode read on kernel 5.19 and higher (#17644)
paulcacheux Jun 15, 2023
086ee8c
Revert "Revert "[CWS] do not handle broken lineage during snapshot (#…
paulcacheux Jun 15, 2023
f767e4e
Report config mutation events from the agent
yshapiro-57 Jun 15, 2023
d98dec4
Create initial config for DDQA (#15675)
ofek Jun 15, 2023
08851c9
dump silent workloads (#17412)
Gui774ume Jun 15, 2023
4f2dd17
[CWS] constantify `vm_flags` access in `vm_area_struct` (#17662)
paulcacheux Jun 15, 2023
0475eb9
regression detector: change baseline variant from latest main to merg…
goxberry Jun 15, 2023
190787c
stop dumping workloads with a stable event type (#17536)
Gui774ume Jun 15, 2023
a461ac1
[CWS] run functional tests on al2023 (#17612)
paulcacheux Jun 15, 2023
dfab821
Run system-probe tests using kernel matrix testing scenario (#16406)
usamasaqib Jun 15, 2023
05eb2b0
Expose agent telemetry on system-probe UDS (#17652)
brycekahle Jun 15, 2023
f90e87a
[new-e2e] use standard-verbose format when verbose is True (#17660)
pducolin Jun 15, 2023
a39f415
add missing filter_tag envvars to config (#17653)
hannahkm Jun 15, 2023
faad228
[serverless] add `peer.service` to inferred spans (#17414)
duncanista Jun 15, 2023
941acba
Double Agent replicate counts (#17664)
blt Jun 15, 2023
28efbc1
Remove dependency on github.com/iovisor/gobpf for single function (#1…
brycekahle Jun 15, 2023
d36fff9
More fixes from CWS module name change (#17650)
brycekahle Jun 15, 2023
fa1aa17
Fix cyclical import
yshapiro-57 Jun 15, 2023
aaa8d4a
Correctly handle empty error messages
yshapiro-57 Jun 15, 2023
c6c4f83
CWS: sync BTFhub constants (#17679)
github-actions[bot] Jun 16, 2023
ab37437
AP-2062 Change version of builders image and change kitchen cleanup t…
KevinFairise2 Jun 16, 2023
2c1bfc7
[USM] Monitor & HTTP refactor (#17283)
Yumasi Jun 16, 2023
8f6bdc7
[CWS] get rid of invalidate_dentry (#17543)
safchain Jun 16, 2023
b721c29
Add a unit test
yshapiro-57 Jun 16, 2023
6385e90
Merge branch 'main' into yakov.shapiro/send-mutate-event
yshapiro-57 Jun 16, 2023
851fe89
Fix the unit test
yshapiro-57 Jun 16, 2023
0781c81
Revert "Merge branch 'main' into yakov.shapiro/send-mutate-event"
yshapiro-57 Jun 16, 2023
0dcd526
Fix another failing unit test
yshapiro-57 Jun 16, 2023
f760a14
Fix one more unit test failure
yshapiro-57 Jun 21, 2023
65a76f0
Formatting change
yshapiro-57 Jun 21, 2023
bbb3c21
Address the lint errors
yshapiro-57 Jun 21, 2023
84bd6ff
Update the CODEOWNERS file per comment
yshapiro-57 Jun 23, 2023
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -220,6 +220,7 @@
 /pkg/obfuscate/ @DataDog/agent-apm
 /pkg/trace/ @DataDog/agent-apm
 /pkg/trace/api/otlp*.go @DataDog/opentelemetry
+/pkg/trace/telemetry/ @DataDog/telemetry-and-analytics
 /pkg/autodiscovery/ @DataDog/container-integrations @DataDog/agent-metrics-logs
 /pkg/autodiscovery/listeners/ @DataDog/container-integrations
 /pkg/autodiscovery/listeners/cloudfoundry*.go @DataDog/platform-integrations
@@ -231,6 +232,7 @@
 /pkg/cloudfoundry @Datadog/platform-integrations
 /pkg/clusteragent/ @DataDog/container-integrations
 /pkg/clusteragent/orchestrator/ @DataDog/container-app
+/pkg/clusteragent/telemetry/ @DataDog/telemetry-and-analytics
 /pkg/collector/ @DataDog/agent-metrics-logs
 /pkg/collector/corechecks/cluster/ @DataDog/container-integrations
 /pkg/collector/corechecks/cluster/orchestrator @DataDog/container-app
33 changes: 33 additions & 0 deletions pkg/clusteragent/admission/patch/patch_request.go
@@ -12,6 +12,7 @@ import (
 	"fmt"

 	"github.com/DataDog/datadog-agent/pkg/clusteragent/admission/common"
+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"
 )

 // TargetObjKind represents the supported k8s object kinds
@@ -38,6 +39,7 @@ const (
 type PatchRequest struct {
 	ID string `json:"id"`
 	Revision int64 `json:"revision"`
+	RcVersion uint64 `json:"rc_version"`
 	SchemaVersion string `json:"schema_version"`
 	Action Action `json:"action"`

@@ -59,6 +61,37 @@ func (pr PatchRequest) Validate(clusterName string) error {
 	return pr.K8sTarget.validate(clusterName)
 }

+func (pr PatchRequest) getApmRemoteConfigEvent(err error, errorCode int) telemetry.ApmRemoteConfigEvent {
+	env := ""
+	if pr.LibConfig.Env != nil {
+		env = *pr.LibConfig.Env
+	}
+	errorMessage := ""
+	if err != nil {
+		errorMessage = err.Error()
+	}
+	return telemetry.ApmRemoteConfigEvent{
+		RequestType: "apm-remote-config-event",
+		ApiVersion: "v2",
+		Payload: telemetry.ApmRemoteConfigEventPayload{
+			Tags: telemetry.ApmRemoteConfigEventTags{
+				Env: env,
+				RcId: pr.ID,
+				RcRevision: pr.Revision,
+				RcVersion: pr.RcVersion,
+				KubernetesCluster: pr.K8sTarget.Cluster,
+				KubernetesNamespace: pr.K8sTarget.Namespace,
+				KubernetesKind: string(pr.K8sTarget.Kind),
+				KubernetesName: pr.K8sTarget.Name,
+			},
+			Error: telemetry.ApmRemoteConfigEventError{
+				Code: errorCode,
+				Message: errorMessage,
+			},
+		},
+	}
+}
+
 // K8sTarget represent the targetet k8s object
 type K8sTarget struct {
 	Cluster string `json:"cluster"`
20 changes: 12 additions & 8 deletions pkg/clusteragent/admission/patch/patcher.go
@@ -11,9 +11,9 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
-
 	"github.com/DataDog/datadog-agent/pkg/clusteragent/admission/common"
 	"github.com/DataDog/datadog-agent/pkg/clusteragent/admission/metrics"
+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"
 	k8sutil "github.com/DataDog/datadog-agent/pkg/util/kubernetes"
 	"github.com/DataDog/datadog-agent/pkg/util/log"

@@ -25,16 +25,18 @@ import (
 )

 type patcher struct {
-	k8sClient kubernetes.Interface
-	isLeader func() bool
-	deploymentsQueue chan PatchRequest
+	k8sClient kubernetes.Interface
+	isLeader func() bool
+	deploymentsQueue chan PatchRequest
+	telemetryCollector telemetry.TelemetryCollector
 }

-func newPatcher(k8sClient kubernetes.Interface, isLeaderFunc func() bool, pp patchProvider) *patcher {
+func newPatcher(k8sClient kubernetes.Interface, isLeaderFunc func() bool, telemetryCollector telemetry.TelemetryCollector, pp patchProvider) *patcher {
 	return &patcher{
-		k8sClient: k8sClient,
-		isLeader: isLeaderFunc,
-		deploymentsQueue: pp.subscribe(KindDeployment),
+		k8sClient: k8sClient,
+		isLeader: isLeaderFunc,
+		deploymentsQueue: pp.subscribe(KindDeployment),
+		telemetryCollector: telemetryCollector,
 	}
 }

@@ -102,8 +104,10 @@ func (p *patcher) patchDeployment(req PatchRequest) error {
 	}
 	log.Infof("Patching %s with patch %s", req.K8sTarget, string(patch))
 	if _, err = p.k8sClient.AppsV1().Deployments(req.K8sTarget.Namespace).Patch(context.TODO(), req.K8sTarget.Name, types.StrategicMergePatchType, patch, metav1.PatchOptions{}); err != nil {
+		p.telemetryCollector.SendRemoteConfigMutateEvent(req.getApmRemoteConfigEvent(err, telemetry.FailedToMutateConfig))
 		return err
 	}
+	p.telemetryCollector.SendRemoteConfigMutateEvent(req.getApmRemoteConfigEvent(nil, telemetry.Success))
 	metrics.PatchCompleted.Inc()
 	return nil
 }
6 changes: 4 additions & 2 deletions pkg/clusteragent/admission/patch/patcher_test.go
@@ -12,6 +12,7 @@ import (
 	"testing"

 	"github.com/DataDog/datadog-agent/pkg/clusteragent/admission/common"
+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"

 	"github.com/stretchr/testify/require"
 	appsv1 "k8s.io/api/apps/v1"
@@ -34,8 +35,9 @@ func TestPatchDeployment(t *testing.T) {

 	// Create patcher
 	p := patcher{
-		k8sClient: client,
-		isLeader: func() bool { return true },
+		k8sClient: client,
+		isLeader: func() bool { return true },
+		telemetryCollector: telemetry.NewNoopCollector(),
 	}

 	// Create request skeleton
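
Reviewer note: the test wires in `telemetry.NewNoopCollector()`, so it cannot assert on the events themselves. If such assertions were wanted, a recording test double along the following lines could stand in. This is a sketch only; `recorder` and the simplified `event` type are hypothetical and not part of this PR.

```go
package main

import "fmt"

// event is a hypothetical, simplified stand-in for the real telemetry event.
type event struct {
	Code    int
	Message string
}

// recorder keeps every event it receives so a test can inspect them afterwards.
type recorder struct {
	patch  []event
	mutate []event
}

func (r *recorder) SendRemoteConfigPatchEvent(ev event)  { r.patch = append(r.patch, ev) }
func (r *recorder) SendRemoteConfigMutateEvent(ev event) { r.mutate = append(r.mutate, ev) }

func main() {
	r := &recorder{}
	r.SendRemoteConfigMutateEvent(event{Code: 0})
	fmt.Println(len(r.patch), len(r.mutate))
}
```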
5 changes: 3 additions & 2 deletions pkg/clusteragent/admission/patch/provider.go
@@ -10,6 +10,7 @@ package patch
 import (
 	"errors"

+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"
 	"github.com/DataDog/datadog-agent/pkg/config"
 	"github.com/DataDog/datadog-agent/pkg/config/remote"
 )
@@ -19,9 +20,9 @@ type patchProvider interface {
 	subscribe(kind TargetObjKind) chan PatchRequest
 }

-func newPatchProvider(rcClient *remote.Client, isLeaderNotif <-chan struct{}, clusterId string, clusterName string) (patchProvider, error) {
+func newPatchProvider(rcClient *remote.Client, isLeaderNotif <-chan struct{}, telemetryCollector telemetry.TelemetryCollector, clusterName string) (patchProvider, error) {
 	if config.Datadog.GetBool("remote_configuration.enabled") {
-		return newRemoteConfigProvider(rcClient, isLeaderNotif, clusterId, clusterName)
+		return newRemoteConfigProvider(rcClient, isLeaderNotif, telemetryCollector, clusterName)
 	}
 	if config.Datadog.GetBool("admission_controller.auto_instrumentation.patcher.fallback_to_file_provider") {
 		// Use the file config provider for e2e testing only (it replaces RC as a source of configs)
40 changes: 6 additions & 34 deletions pkg/clusteragent/admission/patch/rc_provider.go
@@ -23,24 +23,22 @@ type remoteConfigProvider struct {
 	client *remote.Client
 	isLeaderNotif <-chan struct{}
 	subscribers map[TargetObjKind]chan PatchRequest
-	clusterId string
 	clusterName string
 	telemetryCollector telemetry.TelemetryCollector
 }

 var _ patchProvider = &remoteConfigProvider{}

-func newRemoteConfigProvider(client *remote.Client, isLeaderNotif <-chan struct{}, clusterId string, clusterName string) (*remoteConfigProvider, error) {
+func newRemoteConfigProvider(client *remote.Client, isLeaderNotif <-chan struct{}, telemetryCollector telemetry.TelemetryCollector, clusterName string) (*remoteConfigProvider, error) {
 	if client == nil {
 		return nil, errors.New("remote config client not initialized")
 	}
 	return &remoteConfigProvider{
 		client: client,
 		isLeaderNotif: isLeaderNotif,
 		subscribers: make(map[TargetObjKind]chan PatchRequest),
-		clusterId: clusterId,
 		clusterName: clusterName,
-		telemetryCollector: telemetry.NewCollector(),
+		telemetryCollector: telemetryCollector,
 	}, nil
 }

@@ -77,52 +75,26 @@ func (rcp *remoteConfigProvider) process(update map[string]state.APMTracingConfi
 		err := json.Unmarshal(config.Config, &req)
 		if err != nil {
 			invalid++
+			rcp.telemetryCollector.SendRemoteConfigPatchEvent(req.getApmRemoteConfigEvent(err, telemetry.ConfigParseFailure))
 			log.Errorf("Error while parsing config: %v", err)
 			continue
 		}
+		req.RcVersion = config.Metadata.Version
 		log.Debugf("Patch request parsed %+v", req)
 		if err := req.Validate(rcp.clusterName); err != nil {
 			invalid++
+			rcp.telemetryCollector.SendRemoteConfigPatchEvent(req.getApmRemoteConfigEvent(err, telemetry.InvalidPatchRequest))
 			log.Errorf("Skipping invalid patch request: %s", err)
 			continue
 		}
 		if ch, found := rcp.subscribers[req.K8sTarget.Kind]; found {
 			valid++
 			// Log a telemetry event indicating a remote config patch to the Datadog backend
-			patchEvent := rcp.getRemoteConfigPatchEvent(config, req)
-			rcp.telemetryCollector.SendEvent(&patchEvent)
+			rcp.telemetryCollector.SendRemoteConfigPatchEvent(req.getApmRemoteConfigEvent(nil, telemetry.Success))
 			log.Debugf("Publishing patch request for target %s", req.K8sTarget)
 			ch <- req
 		}
 	}
 	metrics.RemoteConfigs.Set(valid)
 	metrics.InvalidRemoteConfigs.Set(invalid)
 }
-
-// getRemoteConfigPatchEvent fills out the fields of a telemetry event that can be sent
-// to the Datadog backend to indicate that a remote config has been successfully patched
-func (rcp *remoteConfigProvider) getRemoteConfigPatchEvent(config state.APMTracingConfig, req PatchRequest) telemetry.ApmRemoteConfigEvent {
-	env := ""
-	if req.LibConfig.Env != nil {
-		env = *req.LibConfig.Env
-	}
-	return telemetry.ApmRemoteConfigEvent{
-		RequestType: "apm-remote-config-event",
-		ApiVersion: "v2",
-		Payload: telemetry.ApmRemoteConfigEventPayload{
-			EventName: "agent.k8s.patch",
-			Tags: telemetry.ApmRemoteConfigEventTags{
-				Env: env,
-				RcId: req.ID,
-				RcClientId: rcp.client.ID,
-				RcRevision: req.Revision,
-				RcVersion: config.Metadata.Version,
-				KubernetesClusterId: rcp.clusterId,
-				KubernetesCluster: req.K8sTarget.Cluster,
-				KubernetesNamespace: req.K8sTarget.Namespace,
-				KubernetesKind: string(req.K8sTarget.Kind),
-				KubernetesName: req.K8sTarget.Name,
-			},
-		},
-	}
-}
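
Reviewer note: after this change, `process` reports one of three outcomes per config: a parse failure, a validation failure, or success. The sketch below reduces that control flow to a single function. The code values, the `request` type, and the cluster-match rule standing in for `Validate` are all assumptions. The real loop also sends the success event only when a subscriber exists for the target kind, which is not modeled here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Assumed code values; the real constants are defined in the telemetry package.
const (
	success = iota
	configParseFailure
	invalidPatchRequest
)

// request is a hypothetical, trimmed-down patch request.
type request struct {
	ID      string `json:"id"`
	Cluster string `json:"cluster"`
}

// classify returns the code that would be reported for one raw config.
// The cluster comparison is a simplified stand-in for PatchRequest.Validate.
func classify(raw []byte, clusterName string) (int, error) {
	var req request
	if err := json.Unmarshal(raw, &req); err != nil {
		return configParseFailure, err
	}
	if req.Cluster != clusterName {
		return invalidPatchRequest, errors.New("target cluster does not match")
	}
	return success, nil
}

func main() {
	inputs := []string{
		`{"id":"a","cluster":"dev"}`,
		`{"id":"b","cluster":"prod"}`,
		`not json`,
	}
	for _, raw := range inputs {
		code, err := classify([]byte(raw), "dev")
		fmt.Println(code, err)
	}
}
```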
3 changes: 2 additions & 1 deletion pkg/clusteragent/admission/patch/rc_provider_test.go
@@ -9,6 +9,7 @@ package patch

 import (
 	"fmt"
+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"
 	"testing"

 	"github.com/DataDog/datadog-agent/pkg/config/remote"
@@ -38,7 +39,7 @@ func TestProcess(t *testing.T) {
 `
 		return []byte(fmt.Sprintf(base, cluster, kind))
 	}
-	rcp, err := newRemoteConfigProvider(&remote.Client{}, make(chan struct{}), "0090c771-add5-4313-948f-d1ab99b471d6", "dev")
+	rcp, err := newRemoteConfigProvider(&remote.Client{}, make(chan struct{}), telemetry.NewNoopCollector(), "dev")
 	require.NoError(t, err)
 	notifs := rcp.subscribe(KindDeployment)
 	in := map[string]state.APMTracingConfig{
6 changes: 4 additions & 2 deletions pkg/clusteragent/admission/patch/start.go
@@ -8,6 +8,7 @@
 package patch

 import (
+	"github.com/DataDog/datadog-agent/pkg/clusteragent/telemetry"
 	"github.com/DataDog/datadog-agent/pkg/config/remote"
 	"github.com/DataDog/datadog-agent/pkg/util/log"

@@ -28,11 +29,12 @@ type ControllerContext struct {
 // StartControllers starts the patch controllers
 func StartControllers(ctx ControllerContext) error {
 	log.Info("Starting patch controllers")
-	provider, err := newPatchProvider(ctx.RcClient, ctx.LeaderSubscribeFunc(), ctx.ClusterId, ctx.ClusterName)
+	telemetryCollector := telemetry.NewCollector(ctx.RcClient.ID, ctx.ClusterId)
+	provider, err := newPatchProvider(ctx.RcClient, ctx.LeaderSubscribeFunc(), telemetryCollector, ctx.ClusterName)
 	if err != nil {
 		return err
 	}
-	patcher := newPatcher(ctx.K8sClient, ctx.IsLeaderFunc, provider)
+	patcher := newPatcher(ctx.K8sClient, ctx.IsLeaderFunc, telemetryCollector, provider)
 	go provider.start(ctx.StopCh)
 	go patcher.start(ctx.StopCh)
 	return nil
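
Reviewer note: `StartControllers` now builds one collector from the RC client ID and the cluster ID and hands it to both the provider and the patcher, while the per-request event builder no longer sets `RcClientId` or `KubernetesClusterId`. One plausible reading is that the collector fills in those identifiers itself. The sketch below illustrates that reading only; the stamping behavior and every name in it are assumptions, not something this diff confirms.

```go
package main

import "fmt"

// event is a hypothetical, simplified telemetry event.
type event struct {
	RcID       string
	RcClientID string
	ClusterID  string
}

// collector holds the process-wide identifiers and, in this sketch,
// stamps them onto every event before recording it.
type collector struct {
	rcClientID string
	clusterID  string
	sent       []event
}

func newCollector(rcClientID, clusterID string) *collector {
	return &collector{rcClientID: rcClientID, clusterID: clusterID}
}

func (c *collector) send(ev event) {
	ev.RcClientID = c.rcClientID
	ev.ClusterID = c.clusterID
	c.sent = append(c.sent, ev)
}

// provider and patcher share the same collector instance.
type provider struct{ tc *collector }
type patcher struct{ tc *collector }

func main() {
	tc := newCollector("client-1", "cluster-uuid")
	pr := provider{tc: tc}
	pa := patcher{tc: tc}
	pr.tc.send(event{RcID: "cfg-1"})
	pa.tc.send(event{RcID: "cfg-1"})
	fmt.Println(len(tc.sent), tc.sent[0].RcClientID, tc.sent[1].ClusterID)
}
```

Injecting the collector through the constructors is also what lets the tests in this PR substitute a no-op collector.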