Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 #17501

Merged
bkabrda merged 30 commits into main from slavek.kabrda/openssl-python-upgrade on Jun 13, 2023

Conversation

bkabrda
Contributor

@bkabrda bkabrda commented Jun 7, 2023

What does this PR do?

This PR upgrades the embedded Python 3 interpreter to Python 3.9.17 and upgrades the embedded OpenSSL to 3.0.9 in Agent 7. Agent 7 ships only with OpenSSL 3; Agent 6 ships only with OpenSSL 1.

Furthermore, this branch requires changes from:

Motivation

OpenSSL 1.1.1 reaches its end of life on September 11, 2023, so we need to upgrade to OpenSSL 3. In order to do that, we also need to upgrade to Python 3.9, because Python 3.8 doesn't support OpenSSL 3.

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

  • Ensure both Agent 6 and Agent 7 (DEB, RPM, MSI and DMG) builds have Python 3.9.17 as the embedded Python 3 interpreter.
  • Ensure that Agent 6 still ships with OpenSSL 1, but not with OpenSSL 3.
  • Ensure that Agent 7 ships with OpenSSL 3, but not with OpenSSL 1.
  • Ensure that a Python integration that will utilize OpenSSL (like the network check used to check a URL with https) works correctly on the embedded Python 3 interpreter.
  • Ensure that integrations can still be installed/uninstalled with the embedded Python 3 interpreter.
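
The version checks above can be run from inside the embedded interpreter itself. The following is a minimal sketch, assuming it is executed with the Agent's embedded Python 3; the helper name `check_embedded_runtime` and its parameters are hypothetical and not part of the Agent. It relies only on the standard-library `sys.version_info`, `ssl.OPENSSL_VERSION` and `ssl.OPENSSL_VERSION_INFO`.

```python
import ssl
import sys


def check_embedded_runtime(expected_python, expected_openssl_major,
                           python_version=None, openssl_version_info=None):
    """Return a list of human-readable mismatches (empty list means OK).

    expected_python: tuple such as (3, 9, 17)
    expected_openssl_major: 3 for Agent 7, 1 for Agent 6 (per the QA steps)
    The last two parameters default to the running interpreter's values and
    exist only so the check can be exercised with other inputs.
    """
    if python_version is None:
        python_version = tuple(sys.version_info[:3])
    if openssl_version_info is None:
        openssl_version_info = ssl.OPENSSL_VERSION_INFO

    problems = []
    if tuple(python_version) != tuple(expected_python):
        problems.append(f"Python is {python_version}, expected {expected_python}")
    if openssl_version_info[0] != expected_openssl_major:
        problems.append(
            f"OpenSSL major version is {openssl_version_info[0]}, "
            f"expected {expected_openssl_major}"
        )
    return problems


if __name__ == "__main__":
    print(sys.version)
    print(ssl.OPENSSL_VERSION)
    # Agent 7 expectation from this PR: Python 3.9.17 linked against OpenSSL 3.
    for problem in check_embedded_runtime((3, 9, 17), 3):
        print("MISMATCH:", problem)
```

For an Agent 6 build the expected OpenSSL major version would be 1 instead of 3. This only inspects the library the `ssl` module was linked against; it does not prove that the other OpenSSL major version is absent from the package, which still needs a check of the shipped files.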

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change has a major impact on the code base, impacts multiple teams, or changes important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change, use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature, has been applied.
  • If applicable, the config template has been updated.

@bkabrda bkabrda added the [deprecated] team/agent-platform and major_change (Complex/large change, which significantly modifies agent behavior or could impact many agent teams) labels Jun 7, 2023
@bkabrda bkabrda added this to the 7.47.0 milestone Jun 7, 2023
@bkabrda bkabrda requested review from a team as code owners June 7, 2023 10:46
@bkabrda
Contributor Author

bkabrda commented Jun 7, 2023

@iliakur ack. We'd like to get this merged ASAP to give everyone enough time to test their features with these upgrades. Could you start working on the testing and let me know once you're done with it? Thanks!

Contributor

@julien-lebot julien-lebot left a comment

✅ for changes related to the Windows Agent team

Contributor

@alai97 alai97 left a comment

Looks good for docs!

@bkabrda bkabrda changed the title from "Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.16" to "Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17" Jun 8, 2023
- |
Embedded Python 3 interpreter is upgraded to 3.9.17 in both Agent 6 and
Agent 7. Embedded OpenSSL is upgraded to 3.0.9 in Agent 7 on Linux and
macOS. On Windows, Python 3.9 in Agent 7 is still compiled with OpenSSL 1.1.1.
Contributor

Good catch 👍

@bkabrda bkabrda merged commit 4c91442 into main Jun 13, 2023
@bkabrda bkabrda deleted the slavek.kabrda/openssl-python-upgrade branch June 13, 2023 06:45
yshapiro-57 added a commit that referenced this pull request Jun 23, 2023
…17663)

* fix windows nanoserver crash on glog v1.1.x (#17340)

* fix windows nanoserver crash on glog v1.1.x

* remove unused go mod replaces

* [CWS] fix activity tree for busybox utils (#17415)

* [CWS] fix activity tree for busybox utils

* update comment

* Fix reporting of conflicting telemetry metrics (#17417)

Only use the limiter (and thus, send telemetry) from the core
agent. Instances of the demultiplexer in other agents do not receive
dogstatsd metrics.

* Update last stable version to 7.44.1 (#17438)

Signed-off-by: Nicolas Guerguadj <[email protected]>

* update packages to fix vulnerabilities in dependencies (#17418)

* do not use reflection for shallow copy (#17421)

This commit implements ShallowCopy for the pb.Span and pb.TraceChunk types.
The previous reflection-based implementation caused too much overhead in the
main processing loop, resulting in unacceptable performance loss.

This also adds tests to ensure that the ShallowCopy functions are correct.

* fix auto multi-line integration config (#17447)

* fix auto multi-line integration config

* reno

* update tests

* Update release.json and Go modules for 6/7.46.0-rc.2 (#17452)

* [CWS] reset events_stats to a PERCPU_ARRAY instead of a HASHMAP (#17473)

* Bump ncurses to 6.4 to fix CVE-2023-29491 (#17493)

* Kacper murzyn/7.45.0 changelog backport (#17489)

* 7.45.0 changelog (#17394)

* Release date updated

* Update latest stable agent version to 7.45.0 (#17491)

* fix subscriptionId fetching on azure (#17495)

* [SBOM] Remove `DeleteBlobs` from the sbom cache (#17465)

* remove delete missing blobs

* remove test

* fix strconv

* change from code review

* fix typo

* [CWS] fix duration suffix parsing (#17476)

* convert remaining users of old `golang-lru` to new generics based version (#17467)

* convert dogstatsd mapper cache to lru/v2

* convert network process cache to lru/v2

* convert network conntracker to lru/v2

* convert trivy cache to lru/v2

* convert network gateway lookup to lru/v2

* cleanup dependencies

* fix licenses

* fix conntracker tests

* fix conntrack debug

* [CWS] pre-alloc msg tags (#17434)

* silence error log about `DD_API_KEY` in internal profiler (#17371)

* Bump golang.org/x/sys from 0.3.0 to 0.8.0 in /pkg/gohai (#17106)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.3.0 to 0.8.0.
- [Commits](https://github.com/golang/sys/compare/v0.3.0...v0.8.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Gohai] Add common elements of the future new API (#17221)

* chore(gohai): use go version 1.18 to have generics

* feat(gohai): implement Value type

* feat(gohai): implement AsJSON and Initialize

* fix(gohai): fix lint warnings

* docs(gohai): add copyright in new files

* feat(gohai): add NewValueFrom method in Value

* feat(gohai): display suffix field tag in AsJSON

* Fix typo in pkg/gohai/utils/common.go

Co-authored-by: Nicolas Guerguadj <[email protected]>

* fix(gohai): address review comments

* feat(gohai): simplify common, remove Initialize

* docs(gohai): address comments review feedback

* feat(gohai): simplify AsJSON logic

* feat(gohai): return warnings as list of strings in AsJSON

* fix(gohai): fix common tests

* docs(gohai): fix comments/naming related review feedback

* test(gohai): simplify tests following pr review

---------

Co-authored-by: Nicolas Guerguadj <[email protected]>

* CWS: sync BTFhub constants (#17498)

Co-authored-by: paulcacheux <[email protected]>

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl (#17479)

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl

Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/golang/tools/releases)
- [Commits](https://github.com/golang/tools/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: golang.org/x/tools
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 in /pkg/security/secl (#17478)

* Bump github.com/stretchr/testify in /pkg/security/secl

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.3 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.3...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.30.0 to 2.31.0 in /test/e2e/cws-tests (#17428)

Bumps [requests](https://github.com/psf/requests) from 2.30.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.30.0...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 6.1.2 to 6.1.3 in /test/e2e/cws-tests (#17427)

Bumps [docker](https://github.com/docker/docker-py) from 6.1.2 to 6.1.3.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/6.1.2...6.1.3)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump datadog-api-client from 2.12.0 to 2.13.1 in /test/e2e/cws-tests (#17429)

Bumps [datadog-api-client](https://github.com/DataDog/datadog-api-client-python) from 2.12.0 to 2.13.1.
- [Release notes](https://github.com/DataDog/datadog-api-client-python/releases)
- [Changelog](https://github.com/DataDog/datadog-api-client-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DataDog/datadog-api-client-python/compare/2.12.0...2.13.1)

---
updated-dependencies:
- dependency-name: datadog-api-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] only increment unregisters metric if delete actually occurs (#17402)

* only increment unregisters if delete actually occurs

* measure time from start of delete function, only increment if no err

* Bump github.com/prometheus/procfs from 0.10.0 to 0.10.1 (#17347)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix duplicate prebuilt module in use during tests (#17472)

Because this is a global object, and the tests load prebuilt modules a bunch, you can end up with a long list of the same string. Example output:

```
&{{[] 0xc0013ada10} map[] map[closed_conn_dropped:0 conn_dropped:0 conns_bpf_map_size:18 conns_closed:1 kprobes_missed:0 kprobes_triggered:2] map[conntrack:{true 10 2032871} oomKill:{false 0 0} runtimeSecurity:{false 0 0} tcpQueueLength:{false 0 0} tracer:{true 10 3072361} usm:{true 10 2701840}] 2 map[tracer:1 usm:1] [offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns usm tracer dns usm tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns 
dns dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns] map[] map[] map[] map[]}
```
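
One way to avoid this kind of growth is to record each module name at most once. The snippet below is only an illustrative sketch in Python (the Agent code in question is Go), and `prebuilt_in_use` and `record_prebuilt` are hypothetical names; it is not the actual fix from the commit.

```python
# Hypothetical stand-in for the global list of prebuilt modules in use.
prebuilt_in_use = []


def record_prebuilt(name, registry=prebuilt_in_use):
    """Append name to registry only if it is not already recorded."""
    if name not in registry:
        registry.append(name)
    return registry


for module in ["offset-guess", "tracer", "offset-guess", "dns", "tracer"]:
    record_prebuilt(module)

print(prebuilt_in_use)  # ['offset-guess', 'tracer', 'dns']
```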

* Add way to log trace_pipe from tests (#17339)

* Bump github.com/vektra/mockery/v2 from 2.26.1 to 2.28.1 in /internal/tools (#17424)

* Bump github.com/vektra/mockery/v2 in /internal/tools

Bumps [github.com/vektra/mockery/v2](https://github.com/vektra/mockery) from 2.26.1 to 2.28.1.
- [Release notes](https://github.com/vektra/mockery/releases)
- [Changelog](https://github.com/vektra/mockery/blob/master/docs/changelog.md)
- [Commits](https://github.com/vektra/mockery/compare/v2.26.1...v2.28.1)

---
updated-dependencies:
- dependency-name: github.com/vektra/mockery/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* `inv -e security-agent.gen-mocks`

* `inv -e process-agent.gen-mocks`

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Paul Cacheux <[email protected]>

* [CWS][SEC-3735] Check self tests results in e2e tests (#17387)

* Check self_test results in e2e tests

* Check self_test results in e2e tests

* Fix self_check tests

* fix python lint things

* fix python lint thing

* Changes after review

* fix python lint thing

* [CSPM] Resolve process env variables only if required (#17461)

* [system-probe] Handle/reduce stat cookie collisions (#17197)

* [system-probe] Add internal_profiling.delta_profiles option to system-probe (#17475)

* [CSPM] Fix flakiness of TestProcessInput/Sleeps (#17399)

* system-probe: Remove redundant call for IsAdjusted (#17345)

* npm: Remove connection entry from tcpStats map if the connection is TCP (#17353)

* deprecate usm configuration values (#17216)

* usm: Deprecated network_config.http_replace_rules in favor of service_monitoring_config.http_replace_rules

* usm: Deprecated network_config.max_tracked_http_connections in favor of service_monitoring_config.max_tracked_http_connections

* usm: Deprecated network_config.max_http_stats_buffered in favor of service_monitoring_config.max_http_stats_buffered

* usm: Fixed configuration test

* Added releasenotes

* Fixed CR

* Fixed kitchen tests

* Fixing CI

* Update releasenotes/notes/deprecating-usm-configuration-values-6c43a0181c2cc821.yaml

Co-authored-by: Ursula Chen <[email protected]>

* Remove test patches

* Fixed cr

---------

Co-authored-by: Ursula Chen <[email protected]>

* Cloud Service implementation for Azure App Service (#17483)

This PR is extending serverless Cloud Service support to web apps running in Azure App Service containers.

* [CWS] avoid exec bomb (#17435)

* [CWS] fix process schema (#17422)

* Bump github.com/open-policy-agent/opa from 0.53.0 to 0.53.1 (#17505)

Bumps [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) from 0.53.0 to 0.53.1.
- [Release notes](https://github.com/open-policy-agent/opa/releases)
- [Changelog](https://github.com/open-policy-agent/opa/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-policy-agent/opa/compare/v0.53.0...v0.53.1)

---
updated-dependencies:
- dependency-name: github.com/open-policy-agent/opa
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CSPM] Do not allow http.send and opa.runtime rego builtins (#17409)

* Bump github.com/hashicorp/golang-lru/v2 from 2.0.2 to 2.0.3 (#17503)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] Fix race in Stop() for tcp close consumer (#17511)

* [e2e] target agent-sandbox account by default with e2e tests (#17484)

* typo (#17430)

* process-monitor: Change owner (#17510)

* npm: Spare copying of active connection twice (#17351)

* process-monitor: Change loading order. (#17401)

The refactor forced every user of process-monitor to call initialize, and we ensured that initialize runs only once.
During the initialize phase we scanned all running processes and tried to trigger the callbacks. But because every user called
initialize itself, there was a race between registering callbacks and scanning the process list.
Now we call initialize only once, at monitor initialization, which ensures that no race exists because callback registration
happens before initialization is called.
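
The ordering issue described above can be illustrated with a small sketch. It is written in Python for brevity (the Agent's process monitor is Go), and `ProcessMonitor`, `register_exec_callback` and `initialize` are hypothetical names chosen for illustration. The point shown is that all callbacks are registered first and the one-time scan of running processes happens afterwards, so no callback misses the scan.

```python
import threading


class ProcessMonitor:
    """Toy monitor: callbacks are registered first, then initialize() scans once."""

    def __init__(self, list_pids):
        self._list_pids = list_pids      # callable returning currently running PIDs
        self._callbacks = []
        self._lock = threading.Lock()
        self._initialized = False

    def register_exec_callback(self, callback):
        with self._lock:
            self._callbacks.append(callback)

    def initialize(self):
        """Scan running processes exactly once; later calls are no-ops."""
        with self._lock:
            if self._initialized:
                return
            self._initialized = True
            callbacks = list(self._callbacks)
        for pid in self._list_pids():
            for callback in callbacks:
                callback(pid)


seen_by_a, seen_by_b = [], []
monitor = ProcessMonitor(lambda: [101, 102])
monitor.register_exec_callback(seen_by_a.append)
monitor.register_exec_callback(seen_by_b.append)
monitor.initialize()   # single call, after all registrations
monitor.initialize()   # no second scan
print(seen_by_a, seen_by_b)  # [101, 102] [101, 102]
```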

* [e2e] bump test-infra-definition to v0.0.0-20230607143804-fef23444c9da (#17517)

* npm: Remove redundant err return (#17520)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths (#17354)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths

* Wrapped more logs

* npm: Changed dns resolution to get a set of IPs rather than a list. (#17358)

* npm: Changed dns resolution to get a set of IPs rather than a list.

* Reduce allocated space, for the average case

* Fix potential use of uninitialized memory (#17490)

This fixes potential use of uninitialized memory when PyList_GetItem
returns NULL.

This code path is impossible to hit in practice with the current
versions of Python, as long as the object is a list and index is in
bounds, which is ensured by the prior call to PyList_Size. These
functions do not use the Python sequence protocol, so evil python code
can not supply incorrect length or throw an unexpected exception
either.

* Bump github.com/stretchr/testify from 1.8.2 to 1.8.4 in /pkg/gohai (#17363)

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.2 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.2...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow snapshot to fail (#17386)

* add JSON decoder for activity dumps (#17444)

* add activity tree stats in activity dump list command (#17369)

* fix secprofile unstable guards (#17509)

* use the remote storage from a command line (#17525)

* Adding shared pool monitoring for Oracle databases (#17360)

* finished

* release notes

* relnotes correction

* review

* Adding more sysmetrics to Oracle monitoring  (#17466)

* both metrics

* send only 60s interval for sysmetrics

* latency & refactoring

* bg cpu usage

* indexes

* io

* more metrics

* new metrics

* metrics

* completed

* completed

* release notes

* disable cursor cache hit ratio

* Revert "[CWS][SEC-3735] Check self tests results in e2e tests (#17387)" (#17526)

This reverts commit e83efae53b0554a45557fa1b5f3e7e2f378502f4.

* MetricSecurityProfileAnomalyDetectionGenerated tracks the number of generated anomalies (#17462)

* [CWS] fix race when playing snapshot process data (#17527)

* AP-2099 Prevent jobs that trigger child pipelines to download artefacts (#17117)

* Fix broken loop (#17534)

* Report conntrack ebpf module loading telemetry (#17539)

* [fakeintake] add godoc (#17474)

* [fakeintake] add godoc

* [e2e] fix test example

* [fakeintake] add helpers to client to get payload names

* [fakeintake] move s in api.Payload inside doc link

* [e2e] bump test-infra to 20230607221957

* [e2e] add logs example

* [e2e] fix test-infra version

* [e2e] remove unused config file

* usm: process monitor: Call heavy operation only if needed (#17457)

* usm: process monitor: Call heavy operation only if needed

From now on, we scan already-running processes if and only if there are registered exec callbacks.
Furthermore, we maintain two atomic booleans that indicate whether any exec or exit callbacks are registered; if there are none,
we skip acquiring the mutex.

* Added documentation

* Removed filed

* Update java integration tests to use latest layers. (#17194)

* Add workaround for database connection loss (#17486)

* implemented

* release notes

* Update releasenotes/notes/connection-loss-workaround-c457738d985fda2a.yaml

Co-authored-by: Austin Lai <[email protected]>

* Update pkg/collector/corechecks/oracle-dbm/oracle.go

Co-authored-by: Alexandre Normand <[email protected]>

* removed comments

* corrected syntax errors after merging

---------

Co-authored-by: Austin Lai <[email protected]>
Co-authored-by: Alexandre Normand <[email protected]>

* [CWS] remove load controller (#17220)

* [CWS] rework secprofile warmup tests (#17377)

* (rcm) simplify the RC thin client (#17468)

* (rcm) simplify the RC thin client

* simplify listeners as well

* fix apm and security agent

* fix cws profiles

* fix apm client

* CWS: sync BTFhub constants (#17550)

Co-authored-by: paulcacheux <[email protected]>

* https java tests use local https server (#17067)

https java tests use local https server

* [CWS] revert snapshot event playing  (#17553)

* [CWS] do not play snapshot for now

* remove test

* deprecate more usm values (#17342)

* Fixed bug in configuration

* usm: Deprecated system_probe_config.http_map_cleaner_interval_in_s in favor of service_monitoring_config.http_map_cleaner_interval_in_s

* usm: Deprecated system_probe_config.http_idle_connection_ttl_in_s in favor of service_monitoring_config.http_idle_connection_ttl_in_s

* usm: Deprecated network_config.http_notification_threshold in favor of service_monitoring_config.http_notification_threshold

* usm: Deprecated network_config.http_max_request_fragment in favor of service_monitoring_config.http_max_request_fragment

* usm: Added releasenotes

* Fixed file name linter

* Addressed CR comments

* usm: Use apply default

* Fixed test

* added missing import

* Fixed imports

* Adds DD_RESOURCE_GROUP and DD_SUBSCRIPTION_ID to env vars (#17558)

* rtloader: Use execinfo only on glibc (#15256)

Use execinfo only on glibc.
Functions in execinfo.h are GNU extensions and not available on other C libraries like musl.

We used to use the libexecinfo package (a quick-n-dirty BSD-licensed clone of the GNU libc backtrace facility) from Alpine Linux to build datadog-agent on Alpine, but it has been removed as of Alpine 3.17.
This PR allows datadog-agent to be built on Alpine Linux and other non-glibc environments.

* Remove a no-longer-used SBOM check config parameter (#17405)

* Adjust default value for Oracle check interval (#17551)

* adapted the default value

* reverted

* changed default in the factory

* remove init in config

* Add new invoke task to test buildimage update (#17241)

* Add new invoke task to test buildimage update

* Use new utils method in invoke task and more tests

* Bump emoji from 2.2.0 to 2.4.0 in /test/e2e/cws-tests (#17425)

Bumps [emoji](https://github.com/carpedm20/emoji) from 2.2.0 to 2.4.0.
- [Release notes](https://github.com/carpedm20/emoji/releases)
- [Changelog](https://github.com/carpedm20/emoji/blob/master/CHANGES.md)
- [Commits](https://github.com/carpedm20/emoji/compare/v2.2.0...v2.4.0)

---
updated-dependencies:
- dependency-name: emoji
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/itchyny/gojq from 0.12.12 to 0.12.13 (#17442)

Bumps [github.com/itchyny/gojq](https://github.com/itchyny/gojq) from 0.12.12 to 0.12.13.
- [Release notes](https://github.com/itchyny/gojq/releases)
- [Changelog](https://github.com/itchyny/gojq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/itchyny/gojq/compare/v0.12.12...v0.12.13)

---
updated-dependencies:
- dependency-name: github.com/itchyny/gojq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CWS] remove unused arg from `fill_exec_context` (#17579)

* chore(gohai): update gopsutil/v3 to 3.23.2 (#17500)

* mount docker socket to dev container (#17385)

* add semver to requirements.txt (#17384)

* [DCA][Autodiscovery] Add more context to error log (#17464)

* USMO-259 - Support Java Async frameworks (#16346)

* - added support for java Async frameworks ioctl messages

* changed java tls structs and maps to support async instrumentation's messages

* refactored names and cleaned debug logs

* files location and names refactoring

* more code documentation

* - refactored java tls to work with tail calls

* - initializing connection_by_peer_key on stack, after the split to tail calls, we don't reach the stack limit anymore

* fixed compilation and verifier errors on 4.14

* fixed compilation error

* added java tls tail-calls to undefined probs list

* - removed unused map

- fixed the check if java tls is working before adding the tail calls

* - fixed the check for enabling java tls program

* added java tls tail calls to exclude list for shared_libraries_test.go

* fixed error in a previous commit conflict merge

* [PROC-2913] Create protobuf definitions for process workload stream server (#17497)

* Create proto definitions

* Update workloadmeta.proto

* Build proto files

* Add `eventId` field

* Apply moises' suggestions

* [RCM] Fix rc config deletion (#17581)

* Fix rc config deletion

* Cleanup

* Add test

* bump `ebpf-manager` to latest (#17585)

* bump `ebpf-manager` to latest

* `inv -e generate-licenses`

* [Gohai][ASC-471] implement cpu collection using sysctl syscall (#17556)

* feat(gohai): implement cpu collection using sysctl syscall

* Update release note

* Update releasenotes/notes/gohai-darwin-cpu-native-a931acf4d9d543ae.yaml

Co-authored-by: Heston Hoffman <[email protected]>

---------

Co-authored-by: Heston Hoffman <[email protected]>

* Add tests to CI (#17541)

* [USM] don't flood logs when a process is not java (#17590)

[Debug] java pid 26055 attachment rejected

* Fix the formatting for debug log in SetAgentMetadata (#17382)

* [process-agent] Create WorkloadMetaExtractor v1 (#17448)

* Create wlm extractor

* Initial workloadmeta changes

* Add extractor tests

* Fix import cycle

* Add some tracing for QA

* Fixed an edge case where the map key != proc.pid

* Add tracing for QA

* Add release note

* Added caching and produce events

* Apply guy's suggestion and check in `grpc.go`

* Update pkg/languagedetection/languagemodels/types.go

Co-authored-by: Guy Arbitman <[email protected]>

* Update pkg/process/metadata/workloadmeta/grpc.go

Co-authored-by: Guy Arbitman <[email protected]>

* Fix linter errors

* Update create-wlm-extractor-e408e2826cc77be8.yaml

Removed comments

* Update pkg/process/metadata/workloadmeta/workloadmeta.go

Co-authored-by: Moisés Botarro <[email protected]>

* Update pkg/process/metadata/workloadmeta/extractor_test.go

Co-authored-by: Moisés Botarro <[email protected]>

* Address comments

* add debug log on instantiation

* apply suggestions

* Add benchmark for sprintf vs itoa

* Fix flaky test

* Add trace log and fix comment

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Moisés Botarro <[email protected]>

* [usm] Add ability to report payload telemetry (#17544)

* [usm] Add ability to report payload telemetry

* Require USM payload telemetry to be explicitly declared

* Rename `OptTelemetry` to `OptPayloadTelemetry`

* Add unit test

* Update the `test-infra-definitions` dependency in `test/new-e2e` (#17566)

* Revert "[usm] Improve `incompleteBuffer` (#17164)" (#17593)

This reverts commit a0481de26a4a68c1e6fb228294774e8916f37943.

* DD_SERVICE_MAPPING in extension (#17189)

* DD_SERVICE_MAPPING in extension

* lint

* release note

* edit release note

* make DD_SERVICE_MAPPING src code split up into smaller parts for easier testing, fix tests, leverage config pkg

* gofmt

* add serverless prefix

* Update releasenotes/notes/serverless-DD-SERVICE-MAPPING-594cc2cb7d090473.yaml

Co-authored-by: Ursula Chen <[email protected]>

* trigger ci

* cover same key and value, add more bad input tests

* add new test cases

* format

---------

Co-authored-by: Ursula Chen <[email protected]>

* Improves python check docs to use virtualenv and sort out PYTHONPATH when needed (#17569)

* Adds docs to use virtualenv and sort out PYTHONPATH when needed

* Adds feedback from PR comments

* Adds note about needing -p arg for virtualenv

* Bump github.com/hashicorp/golang-lru/v2 in /pkg/security/secl (#17599)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CWS: sync BTFhub constants (#17608)

Co-authored-by: paulcacheux <[email protected]>

* Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 (#17501)

Co-authored-by: Florent Clarret <[email protected]>

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl (#17600)

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.8.0 to 0.9.0.
- [Commits](https://github.com/golang/sys/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix username generation on windows (#17547)

* [USM] tests RunDockerServer/RunHostServer log pid (#17587)

* [pkg/netflow] Collect `flow_process_nf_errors_count` metric from goflow2 (#17460)

* Collect flow_process_nf_errors_count goflow2 metric

* Add release note

* [CWS] remove unused mount group id field (#17222)

* [CWS] remove unused mount group id field

* update docs

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs (#17487)

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs

* add missing increment

* [AP-2139] Add amazonlinux2023 to the kitchen tests (#17548)

* Add amazonlinux2023 to the kitchen tests

* use 2023 as default to test as branch build launches only default

* [workloadmeta][process] Bootstrap process entities in workloadmeta (#17327)

* [CSPM] Make sure we do not create zombie processes in our tests (#17609)

* [CSPM] Make sure we do not create zombie processes in our tests

* add more details in case of test failure

* feat: support provisioned concurrency and proactive initialization (#17014)

* feat: support provisioned concurrency and proactive initialization

* feat: Set init time at beginning of agent init, unless that would somehow overlap the lambda start span

* feat: Use real time object until we need to convert

* feat: Use platform.initStart to begin the cold start span

* feat: specs

* fix: Update logs_test

* Fix: Set init time on EC

* feat: No string interpolation if we don't need it

* feat: Fix ecs proactiveInit boolean so it turns off after first invocation

* feat: Fix logs collector tags so we don't over-tag here

* feat: fmt

* feat: Add logs test for Proactive Initialization

* Bump snowflake-connector-python to 3.0.4 (#17445)

* [CWS] remove use of internal pointer (#16731)

* Include AAS metadata in span tags (#17591)

Currently, customers need to manually set the environment variable DD_AZURE_APP_SERVICES=true in order to make the traces include AAS metadata in tags. This change will eliminate that step and include these tags by detecting if we're on AAS. The PR also removes the old logic which adds these tags based on the environment variable DD_AZURE_APP_SERVICES.

* [RCM] Add rc client in flare (#17094)

* Add rc client in flare

* Add rc listeners

* Change constructor

* Fix AGENT_TASK read

* Cleanup

* Fix lint

* Address reviews

* Add release note

* Fix CI

* Fix CI

* Add mutex

* Address review

* Address review

* [secrets][tests] properly reset secrets backend timeout after test (#17614)

* [CWS] fix prerm scripts error logs (#17383)

* Handle missing result json file (#17537)

* [CWS] move arithmetic secl test to the secl package (#17610)

* [CWS] decouple a bit AD/Profile from probe (#17131)

* [CWS] cleanup runner before running btfhub sync job (#17629)

* cleanup runner before running btfhub sync job

* bump setup-go and remove cache step (included in v4)

* CWS: sync BTFhub constants (#17633)

Co-authored-by: paulcacheux <[email protected]>

* [corechecks/snmp] Refactor Profile Config (#17618)

* [CWS] rework secprofile tryAutolearn (#17535)

* Rework secprofile autoLearn func (including 2 fixes), and add 44 unit tests around it

* Fix go lint

* Apply review suggestion

* [Fix] Agent version cache not correctly loaded in multiple CI jobs (#17606)

* http2: remove packed enum values (#17586)

Signed-off-by: Guillaume Pagnoux <[email protected]>

* Make `nettop` available (#17458)

* [CWS] support kernel with usernamespaces arguments for security functions (#17634)

* remove unused function

* remove unused function

* PoC support new userns arg

* constantify the argument position selection

* fix same name issue

* horrible hack to pass the verifier

* [CWS] add unknown source for process entry (#17636)

* [CWS] update fallback constants for recent kernels (#17639)

* fix bpf map id constant

* fix bpf map name offset

* fix bpf prog aux name offset

* move `kitchen_test_dummy_job_tmp` to k8s runners (#17641)

* [gitlab] Migration of unit tests CI jobs to k8s Gitlab runners (#17179)

Requires https://github.com/DataDog/datadog-agent-buildimages/pull/370 first.

This PR:
- updates the Linux build images used in the `datadog-agent` Gitlab CI pipelines to images that do not have an entrypoint script (required because our k8s Gitlab runner infrastructure overwrites the entrypoint of images, therefore we can't rely on it being run)
- updates all relevant CI scripts to run `source /root/.bashrc` at the very beginning, since this is not run in the entrypoint anymore
- updates all jobs in the `setup`, `deps_fetch`, `source_test`, `binary_build` stages to run on k8s runners instead of classic runners
- updates container-related unit tests to work when run in a k8s environment (thanks @L3n41c, cc @DataDog/container-integrations)
- skips a few gohai and gohai-related metadata unit tests that are failing on the arm64 rpm runner because `df` doesn't work in this specific setup, for reasons that remain to be investigated (cc @DataDog/agent-shared-components)
- adds a way to specify concurrency for `golangci-lint` invocations (see https://github.com/DataDog/datadog-agent/pull/15722 and https://github.com/DataDog/datadog-agent/pull/15762)
- fixes the `package_dependencies` jobs in the `kernel_matrix_testing` stage, which weren't using the correct `BUILDIMAGES_SUFFIX` variable

Co-authored-by: Lénaïc Huard <[email protected]>

* [gitlab] Migrate docker publish jobs to k8s runners (#17270)

Migrates docker publishing jobs to the new Kubernetes-based runners.

The docker build jobs were migrated in #15511, but the publishing jobs are still using old runners.

* Add mutex to runtime settings (#17640)

* Process BTF archive nightly (#17621)

* Minor fixes to system-probe (#17622)

* Use correct module name in restart command

* Proper PingTCP/PingUDP cleanup

* [CWS Agent] RC rules override local rules if IDs conflict (#17573)

* reverse order of policy loading

* adding PolicyProviderType consts

* move enforcement of policy provider loading into a testable func

* pkg/flare: add missing APM variables to envvars (#17597)

This PR adds all APM environmental variables currently being used by 
the agent to the flare. Previously, some variables were missing and so
their values would not be represented when producing a flare.

* [Serverless] Use prebuilt opentelemetry lambda layers in integration tests. (#17568)

* Enable auto-instrumentation for python integration test.

* Update snapshot for otlp-python.

* Linting.

* Sort values of tag _dd.tags.container.

* Add encoding info to tailer info for the agent status verbose page (#17533)

* add encoding info to tailer info for the agent status verbose page

* push encoding information straight into tailer info

* moving adding tailer info to parser instead

* NIT

* [usm] Intern Kafka topic names (#17648)

* Add benchmark

* Intern topic name strings

* Fix data synchronization
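
For readers unfamiliar with string interning, the idea behind the bullets above can be sketched as follows. This is a generic illustration, not the code from #17648: the `interner` type, its methods, and the topic name are hypothetical, and the mutex is only one way to address the kind of synchronization issue the last bullet mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// interner is a hypothetical helper that keeps one canonical copy of each
// distinct string, so repeated topic names share a single allocation.
type interner struct {
	mu sync.Mutex
	m  map[string]string
}

func newInterner() *interner {
	return &interner{m: make(map[string]string)}
}

// intern returns the canonical string for b, storing it on first sight.
// The mutex makes concurrent calls safe.
func (i *interner) intern(b []byte) string {
	i.mu.Lock()
	defer i.mu.Unlock()
	if s, ok := i.m[string(b)]; ok {
		return s
	}
	s := string(b)
	i.m[s] = s
	return s
}

func main() {
	in := newInterner()
	a := in.intern([]byte("orders"))
	b := in.intern([]byte("orders"))
	// Both calls yield the same value and only one entry is stored.
	fmt.Println(a == b, len(in.m))
}
```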

* config/apm: fix parsing DD_APM_FEATURES (#17630)

Support either "," or " " as separator when parsing the value of DD_APM_FEATURES. It fixes a regression introduced in #15904 which changed the separator from comma to space. This was a breaking change. From 7.44 to 7.46, using a space as separator was suggested as a workaround; this PR ensures we don't break compatibility again. We now support either space or comma.
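
A minimal sketch of accepting both separators, assuming the value is a flat list of feature names. The `splitFeatures` helper and the feature names are hypothetical and are not taken from the agent's source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFeatures is a hypothetical helper that splits a DD_APM_FEATURES-style
// value on commas or spaces. FieldsFunc drops empty fields, so a mix such as
// "a, b" does not produce blank entries.
func splitFeatures(raw string) []string {
	return strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == ' '
	})
}

func main() {
	fmt.Println(splitFeatures("feat_a,feat_b")) // comma-separated form
	fmt.Println(splitFeatures("feat_a feat_b")) // space-separated form
}
```

Both calls print `[feat_a feat_b]`, which is the compatibility property the commit message describes.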

* [CWS] do not handle broken lineage during snapshot (#17624)

* Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)

This reverts commit 970566077529b17147e7c74129344c7b35f766b3.

* [CWS] Improve tryAutolearn unit tests by making fake events to have a valid lineage (#17657)

* [CWS] fix overlayfs inode read on kernel 5.19 and higher (#17644)

* dbg output

* xfs hacky solution to go around the 300MB limit

* PoC test fix overlayfs

* pipe constant param to select the lower inode selection

* implement kernel version check for feature detection

* small fix

* implement function probing based detection

* apply suggested review changes

* Revert "Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)" (#17658)

This reverts commit 1ad88863157f73078068eab05bda4ea00ddadb58.

* Report config mutation events from the agent

* Create initial config for DDQA (#15675)

* Create initial config for DDQA

* Adding USM

* Adding NDM

* Update config.toml

* add ebpf-platform

* [tools] Adding ASC team to DDQA initial config

* Fixing USM jira project

* Add Agent Platform

* Add platform integrations team

* Add APM

* Add Remote Config

* Add Container-[Integrations-Ecosystems]

* Add Security And Compliance Agent

* Add Processes

* Change RCM issue type to QA

* Add Windows Agent

* [container-app] add qa team metadata

* Network Performance Monitoring

* Update .ddqa/config.toml - Database Monitoring

* Add Windows Kernel Integrations

* add final team Agent Integrations

* Use QA Task for Windows Agent

* [tools] Update task type for agent-shared-components QA issue generation

We now have a new task type for ASC QA operations that should be used

* final update

https://github.com/DataDog/ddqa/pull/13

* rename top-level option

* remove `changelog/no-changelog` as an ignored label

* [ddqa] exclude members for team ASC

* ddqa: AML exclude_members.

* Updating excluded devs for CI team

* Add CSPM Agent

* Exclude olivielpeau from Agent Platform QA

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Florian Veaux <[email protected]>
Co-authored-by: Alexander Nicholas Costas <[email protected]>
Co-authored-by: Bryce Kahle <[email protected]>
Co-authored-by: Srdjan Grubor <[email protected]>
Co-authored-by: Kylian Serrania <[email protected]>
Co-authored-by: Sarah Witt <[email protected]>
Co-authored-by: Katie Hockman <[email protected]>
Co-authored-by: Baptiste Foy <[email protected]>
Co-authored-by: Cedric Lamoriniere <[email protected]>
Co-authored-by: Paul Cacheux <[email protected]>
Co-authored-by: Moises Botarro <[email protected]>
Co-authored-by: Julien Lebot <[email protected]>
Co-authored-by: fisherevans <[email protected]>
Co-authored-by: Lee Avital <[email protected]>
Co-authored-by: Joel Marcotte <[email protected]>
Co-authored-by: Rich Lancia <[email protected]>
Co-authored-by: Pierre Gimalac <[email protected]>
Co-authored-by: Remy Mathieu <[email protected]>
Co-authored-by: Kacper <[email protected]>
Co-authored-by: David du Colombier <[email protected]>
Co-authored-by: Alexandre Menasria <[email protected]>

* dump silent workloads (#17412)

* [CWS] constantify `vm_flags` access in `vm_area_struct` (#17662)

* constantify `vm_flags` access in `vm_area_struct`

* re-gen constants

* regression detector: change baseline variant from latest main to merge base (#17449)

* regression detector: add compute merge base job

We need to compute the merge base of a non-`main` branch with respect
to `main` in order to establish the commit SHA of a baseline variant
in regression detection. This commit adds a job to compute that merge
base, echo the result to a file (sans newline), and then upload that
result to the Single-Machine Performance S3 bucket for Agent team.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: note build step we may need

The existing Single-Machine Performance (SMP) regression detector
setup doesn't build additional containers -- it just publishes
containers already built in Agent CI to SMP's ECR for Agent
images. I've written the new "merge-base" container assuming we can
continue with that strategy, but if we can't continue with that
strategy for some reason, then this commit includes a bunch of
comments sketching out a backup plan that would build the container we
would need and publish it to SMP's ECR for Agent images.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: sketch functional test tweaks

As in the spirit of the previous commit, this commit adds some
comments on how to transition the existing setup in
`functional_test/regression_detector.yml` from using the latest commit
on `main` to using a "merge base" baseline SHA.

There are a few parts of this commit that are actually functional, but
shouldn't functionally alter the existing regression detector output
-- it still uses a "latest `main`" baseline SHA -- but it does
introduce the idea of getting the new baseline SHA from an artifact in
a previous stage, rather than uploading it to S3 (although I also
upload the merge base baseline SHA to S3 as well, as a contingency
plan).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: use literal for image

I thought using a `!reference` tag would work for specifying an image
for the merge-base-computing job, but that mental model may be wrong,
so this commit replaces that tag with the literal it's supposed to
(de)reference.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: comment out merge base job

The merge base job isn't running because no runners are configured to
run it. This issue could be one that I can't resolve, so it could be
that I need to get permissions to run it. For now, I'll comment out
this job and instead try and get the information necessary to compute
a baseline as part of the regression detector job, but without
otherwise altering the regression detector behavior.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: compute merge base

As a test of whether I can compute the merge base using a `git fetch`
command, this commit adds that command (and other supporting commands)
to the regression detector job to see if these commands will work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: add container build notes

I've managed to figure out how to get the most important case to work
-- the one in which the base branch of the pull request (i.e., in
GitLab terms, the target branch of the merge request) is
`main`. Having managed to get this case to work, this commit updates
my implementation notes to reflect how to implement the case when the
base branch of the pull request *is not* `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: set baseline to merge base

It looks like creating a separate job (at least in
`.gitlab/container_build/docker_linux.yml`) will not work because
GitLab won't run it in a runner. This behavior may be a configuration
setting for security reasons, and may require repo admin privileges to
change. Instead, this commit implements setting the baseline SHA to
the merge base, along with some commands to abort the regression
detector job if the base branch of the pull request (target branch of
the merge request) is not `main`, and copies the merge base SHA to S3
for debugging/auditing purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove obsolete comments

Now that I'm pretty sure I've figured out how to change the baseline
SHA to the merge base of a pull request, this commit deletes most of
the commented-out lines I introduced into
`.gitlab/functional_test/regression_detector.yml` as notes to
myself. These comments are no longer necessary for record-keeping.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: correct line comment

This commit corrects a line comment in which I write "Use this line if
comparing `main` against itself makes sense" when I should have
written "Use this line if comparing `main` against itself does not
make sense".

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: make if block a one-liner

GitLab apparently doesn't echo multiline commands by default, so this
commit rewrites this multiline `if` block as a one line command for
debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove backticks

I'm pretty sure that the "command `main` not found" message I saw was
because the shell was likely interpreting those backticks as running a
command. This commit replaces the backticks with single quotes to
avoid having the shell attempt to run a command called `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: disable base branch check

This commit temporarily disables the check for the base branch because
it doesn't quite seem to work at the moment, and it adds enough
debugging output so I can get some idea of why that check does not
work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: add artifacts + cleanup

This commit cleans up the last bits of debugging comments and code in
`.gitlab/functional_test/regression_detector.yml`, plus it adds the
reporting outputs as artifacts to provide additional diagnostic output
for debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: remove comment cruft

This commit removes a large comment block I put in
`.gitlab/container_build/docker_linux.yml` because I no longer think
it's a good idea to compute the merge base commit in a job separate
from the regression detector job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: compute baseline via loop

The previous regression detector baseline SHA computation assumed that
an image exists in ECR for the commit returned by `$(git merge-base
HEAD main)`, *i.e.*, the merge base with `main`. However,
the container associated with that commit may not exist because that
container may have failed to build, or may have failed to upload to
ECR, so we need to check whether that container exists in ECR. If that
container exists, then the merge base with `main` is the baseline
SHA. If not, then we must iterate over predecessor commits in `main`
until we find a commit in `main` for which a container exists in
ECR. The first commit we find in this loop becomes the baseline SHA
for the regression detector.

This commit implements that check, along with a loop that iterates
over predecessor commits, if necessary, as described above. The
implementation is currently a rather verbose one-liner because
single-line statements are easier to debug in GitLab CI. Once this
implementation succeeds, a subsequent commit will clean up the
implementation by changing it to a multi-line statement, after which
this branch will be ready for review.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>
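
The selection logic described in the commit above can be sketched as follows. The actual change is a shell one-liner in the GitLab CI job; this Go version only illustrates the search order. `pickBaseline`, the commit names, and the `published` map standing in for the ECR lookup are all hypothetical.

```go
package main

import "fmt"

// pickBaseline walks candidates in order (merge base first, then its
// predecessors on main) and returns the first commit for which imageExists
// reports true. The boolean is false when no candidate has an image.
func pickBaseline(candidates []string, imageExists func(sha string) bool) (string, bool) {
	for _, sha := range candidates {
		if imageExists(sha) {
			return sha, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical history: "c3" is the merge base, "c2" and "c1" precede it.
	history := []string{"c3", "c2", "c1"}
	// Hypothetical stand-in for the ECR check: no image was published for "c3".
	published := map[string]bool{"c2": true, "c1": true}

	sha, ok := pickBaseline(history, func(s string) bool { return published[s] })
	fmt.Println(sha, ok)
}
```

Here the merge base `c3` has no image, so the sketch falls back to `c2` and prints `c2 true`.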

* regression detector: reduce stderr noise

The initial implementation of the container existence check turns out
to be pretty noisy when a container doesn't exist because `awscli`
will output a bunch of error information on failure. While this
information is helpful for debugging purposes, it will be annoying in
CI, so this commit redirects the `stderr` of that command to
`/dev/null` to reduce noise in the CI output of the regression
detector CI job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: fix copy before write error

The regression detector job in the Agent CI pipeline currently fails
because it attempts to copy a file that doesn't exist to S3. This
commit fixes that error by moving the copy command to a point after
the file is generated.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: tweak ecr query for debugging

For some reason, the `aws ecr describe-images` command is not finding
images that I know exist. To aid in debugging at the expense of noise,
this commit removes redirection of `stderr` to `/dev/null`, adds the
`--registry-id` flag to be more explicit about ECR repo location, and
also moves the `--profile` flag and its argument to a more readable
location.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove "latest main" job

The baseline-SHA-computing steps in the
`single-machine-performance-regression_detector` job make computing
the commit of "latest `main`" unnecessary, so this commit deletes the
job that does that computation.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove loop cleanup comment

Due to falling behind on Agent performance investigations, I don't
think I'm going to get to clean up the long, one line loop statement,
so this commit deletes that aspirational comment -- it can be cleaned
up in a subsequent pull request.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: pin pr-commenter version

This commit attempts to triage the `pr-commenter` errors by pinning
the `pr-commenter` version in a fashion similar to that used by
other Datadog repositories (e.g., `dogweb`).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: delete more stale comments

For now, I don't plan to move the baseline SHA computation into a
separate job, although that may happen later. Given this change of
plans, this commit removes the stale comment regarding refactoring the
baseline SHA computation to a separate job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

---------

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* stop dumping workloads with a stable event type (#17536)

* [CWS] run functional tests on al2023 (#17612)

* run CWS functional tests on al2023

* skip the SELinux/sel_disable test on al2023

* fix sizeof_inode and tty_offset fallback constants on al2023 kernel 6.1

* fix import

* skip docker rc tests on al2023

* fix `vm_area_struct_flags_offset` on al2023

* Run system-probe tests using kernel matrix testing scenario (#16406)

* system-probe-test_spec.go added

* compile system-probe-test_spec and add to dependencies

* fix base image

* move test spec into test dir

* comment out wip

* use x64 images for packaging dependencies

* print outputs json

* save output ips to file

* print stack.outputs

* fix typo

* flush buffer to file

* move stack.outputs to CI_PROJECT_DIR

* connect to microvm

* fix gitlab yaml error

* use amis available in build-stable

* fix invoke command

* define AWS_REGION env var

* use kernel matrix testing image

* fix build tags

* use ssh private key to ssh

* explanatory comment

* fix path export

* save ssh key file to CI_PROJECT_DIR

* get ssh key info

* define AWS_SSH_KEY variable

* turn off StrictHostKeyChecking

* fix typo

* set ssh key file perms

* verbose ssh output

* change ssh file name and use the same name

* fix aws ssh key name

* change ssh key name

* try with root user

* change ssh user name back to 'ubuntu'

* set BatchMode to prevent passphrase query

* set owner to user 1000

* some debugging

* fix ssh key perms

* add new line to ssh key file

* turn off host key verification when connecting with micro-vm

* use dedicated pulumi image

* run init script

* fix micro-vm-init.sh script path when scp'ing

* wrap inner ssh command in quotes

* pass arch to microvm init

* give full path of dependencies

* copy all shared dependencies and strip-components when extracting

* use go version 1.19 and run tests

* cross compile

* fix dir path of system-probe-tests

* retrieve go deps

* tidy-all

* use new tenant

* print .aws/config file

* use string replacer

* use exact s3 bucket as new-e2e

* use k8 runners for env job

* remove security groups and subnets, and change ssh key param name

* set correct env name

* add copyright

* fix micro-vm-init.sh and move it in the dependencies

* fix yaml syntax

* make GOVERSION string

* add agent-qa amis

* set GOARCH as env var

* change ssh key name

* try without key pair name

* use pulumi image

* reintroduce key pair config value

* change docker image tag

* try agent-ci-sandbox as ssh key name

* remove printing of aws_config

* add default key pair name: datadog-agent-ci

* update amis

* update amis

* update vmconfig. Run only am64

* fix arm64 package name

* update arm64 ami id

* fix typo in yaml

* add go_deps dependency

* disable arm64 distros for now

* add dummy provider for copy file

* fix command provider init

* provider args fix package

* unique name for command provider

* add go_deps as dependency to test job

* set go_tools_dep, and switch to new profile in tests

* use pulumi docker image

* add debug logs of ssh keys

* retry if failed to dial libvirt

* add comment and log

* tidy all

* make it possible to easily run scenario in dev machine

* fix dependency fetching step

* easier to launch in dev machine and do not use sudo

* fix infra-env in env setup job

* lint python

* fix ssh key name

* incorrect image tag introduced

* reintroduce custom arm64 kernels

* exec perms on micro-vm-init.sh

* apply suggestions

* avoid printing out ssm read to debug output

* use script to setup dependencies before tests

* fix config

* fix function name

* remove shutdown command as it is moved in scenario, and cleanup path names

* use fallback of env vars

* simplify output streaming

* shutdown period must be int

* create and download junit and testjson files

* set up testjson and junit package in init script

* exit with system-probe test exit code

* fix shutdown option

* small fixes

* pass current environment to cmd

* fix junit and testjson download

* add env for DD_SYSTEM_PROBE_JAVA_DIR in new test spec

* fix delete path of junit and testjson package

* fix NewRunner after test-infra-definitions update

* pass arch to fetch_dependencies

* return error code of tests

* add '$' to resolve INSTANCE_IP

* Simplify micro-vm-init.sh IP address command

* Minor improvements

* Add timeouts

* Handle errors in outputsToFile

* Update AMIs

* Update to handle bundle-less test run

* Fix minor nits

* Update x86_64 AMI

* Update arm64 AMI

* add kvt to on_system_probe_changes_or_manual

* allow_failure for kvt jobs

* run all tests despite errors

* change stack name

* Fix json glob pattern

* update test-infra-definitions

* remove custom kernels and use distribution images

* remove custom kernels from vmconfig

* run tests only on x86_64

* update test-infra-definitions

* add arm64 tests back in

* fix variable replacement bash

* update ami

* Collect failed tests and output them at the end

* Output kernel release

* fix BTF_DIR path

* always color output if supported

* Use GO_VERSION from runner docker image

* force color

* only output FAIL if test has a name

* Do not return error if we only have failed tests

* do test json review at end to properly fail job

* Fix binary path

* specify binary name and main.go path

* Add missing end single quote

* Force color in review too

* add cleanup job

* provide secret sudo-password

* make kmt job manual

* revert back to running job on system-probe changes

* run dummy job on system-probe changes

* update test-infra-definitions

* fix grep

* fix grep for pattern '-instance-ip'

* set empty path when no private key path provided, to prevent pulumi from attempting to read ssh key file by guessing path

* change instance types to c6i.metal for intel and c6g.metal for arm

* split arm64 and x86 kmt tests

* fix parallel test runs

* fix dependencies in cleanup job

* fail if retrying tests

* run python linter

* use storage optimized instances for both intel and arm

* use compute optimized instance for arm

* allow all kernel_matrix_testing jobs to fail

* make kmt jobs manual

* change arm instance type to m6g and x86 to m5

---------

Co-authored-by: Bryce Kahle <[email protected]>

* Expose agent telemetry on system-probe UDS (#17652)

* [new-e2e] use standard-verbose format when verbose is True (#17660)

* add missing filter_tag envvars to config (#17653)

This commit adds the filter_tag envvars (DD_APM_FILTER_TAGS_REQUIRE
and DD_APM_FILTER_TAGS_REJECT) to the config_template.yaml file.
Previously, these envvars were not represented with the filter_tag 
parameter, so adding these in can improve documentation.

* [serverless] add `peer.service` to inferred spans (#17414)

* add `peerService` constant

* add `peerService` in `Span.Meta` tags

* remove logic from `span_enrichment.go`

* move logic to `lifecycle.go`

* Double Agent replicate counts (#17664)

* Double Agent replicate counts

We intend to automatically expand the number of replicates if a possible wobble
is detected. In order to approach the goal of doing this automatically we first
need to validate that consistently running this number of replicates is
1. feasible and 2. gives clearer results from this doubling. There is work
required on our side to adjust the statistics, so in some sense this PR is a
change that will lead to a change that will lead to a change.

REF SMP-599
REF SMP-333

Signed-off-by: Brian L. Troutwine <[email protected]>

* empty commit to trigger CI

Signed-off-by: Brian L. Troutwine <[email protected]>

---------

Signed-off-by: Brian L. Troutwine <[email protected]>

* Remove dependency on github.com/iovisor/gobpf for single function (#17649)

* Remove dependency on github.com/iovisor/gobpf for single function

* Fix copyright check

* Add test

* More fixes from CWS module name change (#17650)

* Fix cyclical import

* Correctly handle empty error messages

* CWS: sync BTFhub constants (#17679)

Co-authored-by: paulcacheux <[email protected]>

* AP-2062 Change version of builders image and change kitchen cleanup task (#17529)

* Change version of builders image and change kitchen cleanup task

* remove manual trigger

* Update builder image

* [USM] Monitor & HTTP refactor (#17283)

* http: telemetry: remove unused error value

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add protocol registration

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add http protocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: handle excluded functions

Signed-off-by: Guillaume Pagnoux <[email protected]>

* ebpfProgram: remove now unused mapCleaner field

Signed-off-by: Guillaume Pagnoux <[email protected]>

* pkg/network: fix network state tests

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: remove ProtocolKind type

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: fix call to NewTelemetry

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: remove old go build constraints

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: remove nil checks in pointer receivers

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: rename loop variables

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: document types

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor_test: use getHttpStats

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: map protocols in monitor & remove use of init

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: stop process monitor first

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: fix log message

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: document initProtocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: document ConfigureOptions

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add errors

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: remove redundant log

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: add doc

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: fix log

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: better error message in case of init failure

…
yshapiro-57 added a commit that referenced this pull request Jun 27, 2023
* The skeleton of logging telemetry events from the cluster agent (#17397)

* The skeleton of logging telemetry events from the cluster agent

* Fix lint and unit test failures

* Address the first set of review comments

* Use ResetClient instead of plain HTTP client per comment

* Factor out getRemoteConfigPatchEvent into a separate function per review comment

* Refactor the way we get ClusterId per comment

* Send telemetry events when a cluster agent mutates a remote config (#17663)

* fix windows nanoserver crash on glog v1.1.x (#17340)

* fix windows nanoserver crash on glog v1.1.x

* remove unused go mod replaces

* [CWS] fix activity tree for busybox utils (#17415)

* [CWS] fix activity tree for busybox utils

* update comment

* Fix reporting of conflicting telemetry metrics (#17417)

Only use the limiter (and thus, send telemetry) from the core
agent. Instances of the demultiplexer in other agents do not receive
dogstatsd metrics.

* Update last stable version to 7.44.1 (#17438)

Signed-off-by: Nicolas Guerguadj <[email protected]>

* update packages to fix vulnerabilities in dependencies (#17418)

* do not use reflection for shallow copy (#17421)

This commit implements ShallowCopy for the pb.Span and pb.TraceChunk types.
The previous reflection-based implementation caused too much overhead in the
main processing loop, resulting in unacceptable performance loss.

This also adds tests to ensure that the ShallowCopy functions are correct.
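
A minimal sketch of the idea, assuming a simplified `Span` type: the struct below is a hypothetical stand-in, not the real `pb.Span`, and the field names are illustrative only. It shows a hand-written field-by-field copy, which avoids the per-call cost of walking the struct with reflection.

```go
package main

import "fmt"

// Span is a hypothetical, simplified stand-in for a protobuf-generated
// span type. The real pb.Span has more fields.
type Span struct {
	Service string
	Name    string
	TraceID uint64
	Meta    map[string]string
}

// ShallowCopy returns a new Span with every field copied explicitly.
// Reference-typed fields such as Meta are shared with the original,
// which is what makes the copy shallow.
func (s *Span) ShallowCopy() *Span {
	if s == nil {
		return nil
	}
	return &Span{
		Service: s.Service,
		Name:    s.Name,
		TraceID: s.TraceID,
		Meta:    s.Meta,
	}
}

func main() {
	orig := &Span{
		Service: "web",
		Name:    "http.request",
		TraceID: 42,
		Meta:    map[string]string{"env": "prod"},
	}
	cp := orig.ShallowCopy()
	cp.Name = "renamed"        // scalar field: the original is unaffected
	cp.Meta["env"] = "staging" // map field: the change is visible through orig
	fmt.Println(orig.Name, orig.Meta["env"])
}
```

The trade-off of this approach is that every new field has to be added to the copy function by hand, which is presumably why the commit also adds tests for correctness.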

* fix auto multi-line integration config (#17447)

* fix auto multi-line integration config

* reno

* update tests

* Update release.json and Go modules for 6/7.46.0-rc.2 (#17452)

* [CWS] reset events_stats to a PERCPU_ARRAY instead of a HASHMAP (#17473)

* Bump ncurses to 6.4 to fix CVE-2023-29491 (#17493)

* Kacper murzyn/7.45.0 changelog backport (#17489)

* 7.45.0 changelog (#17394)

* Release date updated

* Update latest stable agent version to 7.45.0 (#17491)

* fix subscriptionId fetching on azure (#17495)

* [SBOM] Remove `DeleteBlobs` from the sbom cache (#17465)

* remove delete missing blobs

* remove test

* fix strconv

* change from code review

* fix typo

* [CWS] fix duration suffix parsing (#17476)

* convert remaining users of old `golang-lru` to new generics based version (#17467)

* convert dogstatsd mapper cache to lru/v2

* convert network process cache to lru/v2

* convert network conntracker to lru/v2

* convert trivy cache to lru/v2

* convert network gateway lookup to lru/v2

* cleanup dependencies

* fix licenses

* fix conntracker tests

* fix conntrack debug

* [CWS] pre-alloc msg tags (#17434)

* silence error log about `DD_API_KEY` in internal profiler (#17371)

* Bump golang.org/x/sys from 0.3.0 to 0.8.0 in /pkg/gohai (#17106)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.3.0 to 0.8.0.
- [Commits](https://github.com/golang/sys/compare/v0.3.0...v0.8.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Gohai] Add common elements of the future new API (#17221)

* chore(gohai): use go version 1.18 to have generics

* feat(gohai): implement Value type

* feat(gohai): implement AsJSON and Initialize

* fix(gohai): fix lint warnings

* docs(gohai): add copyright in new files

* feat(gohai): add NewValueFrom method in Value

* feat(gohai): display suffix field tag in AsJSON

* Fix typo in pkg/gohai/utils/common.go

Co-authored-by: Nicolas Guerguadj <[email protected]>

* fix(gohai): address review comments

* feat(gohai): simplify common, remove Initialize

* docs(gohai): address comments review feedback

* feat(gohai): simplify AsJSON logic

* feat(gohai): return warnings as list of strings in AsJSON

* fix(gohai): fix common tests

* docs(gohai): fix comments/naming related review feedback

* test(gohai): simplify tests following pr review

---------

Co-authored-by: Nicolas Guerguadj <[email protected]>

* CWS: sync BTFhub constants (#17498)

Co-authored-by: paulcacheux <[email protected]>

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl (#17479)

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl

Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/golang/tools/releases)
- [Commits](https://github.com/golang/tools/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: golang.org/x/tools
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 in /pkg/security/secl (#17478)

* Bump github.com/stretchr/testify in /pkg/security/secl

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.3 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.3...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.30.0 to 2.31.0 in /test/e2e/cws-tests (#17428)

Bumps [requests](https://github.com/psf/requests) from 2.30.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.30.0...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 6.1.2 to 6.1.3 in /test/e2e/cws-tests (#17427)

Bumps [docker](https://github.com/docker/docker-py) from 6.1.2 to 6.1.3.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/6.1.2...6.1.3)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump datadog-api-client from 2.12.0 to 2.13.1 in /test/e2e/cws-tests (#17429)

Bumps [datadog-api-client](https://github.com/DataDog/datadog-api-client-python) from 2.12.0 to 2.13.1.
- [Release notes](https://github.com/DataDog/datadog-api-client-python/releases)
- [Changelog](https://github.com/DataDog/datadog-api-client-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DataDog/datadog-api-client-python/compare/2.12.0...2.13.1)

---
updated-dependencies:
- dependency-name: datadog-api-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] only increment unregisters metric if delete actually occurs (#17402)

* only increment unregisters if delete actually occurs

* measure time from start of delete function, only increment if no err

* Bump github.com/prometheus/procfs from 0.10.0 to 0.10.1 (#17347)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix duplicate prebuilt module in use during tests (#17472)

Because this is a global object, and the tests load prebuilt modules a bunch, you can end up with a long list of the same string. Example output:

```
&{{[] 0xc0013ada10} map[] map[closed_conn_dropped:0 conn_dropped:0 conns_bpf_map_size:18 conns_closed:1 kprobes_missed:0 kprobes_triggered:2] map[conntrack:{true 10 2032871} oomKill:{false 0 0} runtimeSecurity:{false 0 0} tcpQueueLength:{false 0 0} tracer:{true 10 3072361} usm:{true 10 2701840}] 2 map[tracer:1 usm:1] [offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns usm tracer dns usm tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns 
dns dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns] map[] map[] map[] map[]}
```
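
One way to keep such a global list free of repeats is to track which names have already been recorded. The sketch below is an illustration under that assumption, not the fix applied in the commit; the `registry` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical process-wide record of the prebuilt
// modules that have been loaded.
type registry struct {
	mu    sync.Mutex
	seen  map[string]struct{}
	names []string
}

// Add records name only the first time it is seen, so repeated loads
// during tests do not grow the list.
func (r *registry) Add(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.seen == nil {
		r.seen = make(map[string]struct{})
	}
	if _, ok := r.seen[name]; ok {
		return
	}
	r.seen[name] = struct{}{}
	r.names = append(r.names, name)
}

// Names returns a copy of the recorded names in first-seen order.
func (r *registry) Names() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]string, len(r.names))
	copy(out, r.names)
	return out
}

func main() {
	var r registry
	for _, n := range []string{"offset-guess", "tracer", "offset-guess", "dns", "tracer"} {
		r.Add(n)
	}
	fmt.Println(r.Names())
}
```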

* Add way to log trace_pipe from tests (#17339)

* Bump github.com/vektra/mockery/v2 from 2.26.1 to 2.28.1 in /internal/tools (#17424)

* Bump github.com/vektra/mockery/v2 in /internal/tools

Bumps [github.com/vektra/mockery/v2](https://github.com/vektra/mockery) from 2.26.1 to 2.28.1.
- [Release notes](https://github.com/vektra/mockery/releases)
- [Changelog](https://github.com/vektra/mockery/blob/master/docs/changelog.md)
- [Commits](https://github.com/vektra/mockery/compare/v2.26.1...v2.28.1)

---
updated-dependencies:
- dependency-name: github.com/vektra/mockery/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* `inv -e security-agent.gen-mocks`

* `inv -e process-agent.gen-mocks`

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Paul Cacheux <[email protected]>

* [CWS][SEC-3735] Check self tests results in e2e tests (#17387)

* Check self_test results in e2e tests

* Check self_test results in e2e tests

* Fix self_check tests

* fix python lint things

* fix python lint thing

* Changes after review

* fix python lint thing

* [CSPM] Resolve process env variables only if required (#17461)

* [system-probe] Handle/reduce stat cookie collisions (#17197)

* [system-probe] Add internal_profiling.delta_profiles option to system-probe (#17475)

* [CSPM] Fix flakiness of TestProcessInput/Sleeps (#17399)

* system-probe: Remove redundant call for IsAdjusted (#17345)

* npm: Remove connection entry from tcpStats map if the connection is TCP (#17353)

* deprecate usm configuration values (#17216)

* usm: Deprecated network_config.http_replace_rules in favor of service_monitoring_config.http_replace_rules

* usm: Deprecated network_config.max_tracked_http_connections in favor of service_monitoring_config.max_tracked_http_connections

* usm: Deprecated network_config.max_http_stats_buffered in favor of service_monitoring_config.max_http_stats_buffered

* usm: Fixed configuration test

* Added releasenotes

* Fixed CR

* Fixed kitchen tests

* Fixing CI

* Update releasenotes/notes/deprecating-usm-configuration-values-6c43a0181c2cc821.yaml

Co-authored-by: Ursula Chen <[email protected]>

* Remove test patches

* Fixed cr

---------

Co-authored-by: Ursula Chen <[email protected]>

* Cloud Service implementation for Azure App Service (#17483)

This PR is extending serverless Cloud Service support to web apps running in Azure App Service containers.

* [CWS] avoid exec bomb (#17435)

* [CWS] fix process schema (#17422)

* Bump github.com/open-policy-agent/opa from 0.53.0 to 0.53.1 (#17505)

Bumps [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) from 0.53.0 to 0.53.1.
- [Release notes](https://github.com/open-policy-agent/opa/releases)
- [Changelog](https://github.com/open-policy-agent/opa/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-policy-agent/opa/compare/v0.53.0...v0.53.1)

---
updated-dependencies:
- dependency-name: github.com/open-policy-agent/opa
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CSPM] Do not allow http.send and opa.runtime rego builtins (#17409)

* Bump github.com/hashicorp/golang-lru/v2 from 2.0.2 to 2.0.3 (#17503)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] Fix race in Stop() for tcp close consumer (#17511)

* [e2e] target agent-sandbox account by default with e2e tests (#17484)

* typo (#17430)

* process-monitor: Change owner (#17510)

* npm: Spare copying of active connection twice (#17351)

* process-monitor: Change loading order. (#17401)

The refactor forced every user of process-monitor to call initialize, and we ensured that the initialization ran only once.
During the initialization phase we scanned all running processes and tried to trigger the callbacks. But since every user called
initialize by itself, there was a race between registering callbacks and scanning the process list.
Now we call initialize only once, at monitor initialization, which ensures that no race exists, because callback registration
happens before the initialization is called.
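
The ordering described above can be sketched as follows. This is an illustration only: the `processMonitor` type, its methods, and the list of PIDs passed to `Initialize` are hypothetical and do not mirror the real process-monitor API.

```go
package main

import (
	"fmt"
	"sync"
)

// processMonitor is a hypothetical, simplified monitor.
type processMonitor struct {
	once      sync.Once
	mu        sync.Mutex
	callbacks []func(pid int)
}

// RegisterExec adds a callback. Consumers call this before the owner
// of the monitor calls Initialize.
func (m *processMonitor) RegisterExec(cb func(pid int)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.callbacks = append(m.callbacks, cb)
}

// Initialize replays the already-running processes to every registered
// callback. sync.Once makes repeated calls a no-op, so the scan cannot
// interleave with a second initialization.
func (m *processMonitor) Initialize(running []int) {
	m.once.Do(func() {
		m.mu.Lock()
		cbs := append([]func(int){}, m.callbacks...)
		m.mu.Unlock()
		for _, pid := range running {
			for _, cb := range cbs {
				cb(pid)
			}
		}
	})
}

func main() {
	m := &processMonitor{}
	m.RegisterExec(func(pid int) { fmt.Println("consumer A saw", pid) })
	m.RegisterExec(func(pid int) { fmt.Println("consumer B saw", pid) })
	m.Initialize([]int{101, 102})
	m.Initialize([]int{101, 102}) // second call does nothing
}
```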

* [e2e] bump test-infra-definition to v0.0.0-20230607143804-fef23444c9da (#17517)

* npm: Remove redundant err return (#17520)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths (#17354)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths

* Wrapped more logs
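
As a rough illustration of why wrapping trace logs helps: arguments to a logging call are evaluated before the logger can decide to drop the message, so a level check around the call skips that work when tracing is off. The logger, `describe` helper, and `conn` type below are hypothetical and are not the Agent's logging API.

```go
package main

import "fmt"

type level int

const (
	traceLevel level = iota
	infoLevel
)

// logger is a hypothetical minimal leveled logger.
type logger struct {
	min   level
	lines []string
}

func (l *logger) shouldLog(lv level) bool { return lv >= l.min }

func (l *logger) Tracef(format string, args ...any) {
	if l.shouldLog(traceLevel) {
		l.lines = append(l.lines, fmt.Sprintf(format, args...))
	}
}

type conn struct{ src, dst string }

// describeCalls counts how often the formatting helper runs.
var describeCalls int

// describe builds a string for the log line and therefore allocates.
func describe(c conn) string {
	describeCalls++
	return fmt.Sprintf("%s->%s", c.src, c.dst)
}

// process sits on a hot path: the guard keeps describe from running
// at all unless trace logging is enabled.
func process(l *logger, c conn) {
	if l.shouldLog(traceLevel) {
		l.Tracef("processing %s", describe(c))
	}
}

func main() {
	l := &logger{min: infoLevel}
	process(l, conn{"10.0.0.1:80", "10.0.0.2:4242"})
	fmt.Println("describe calls with trace disabled:", describeCalls)
}
```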

* npm: Changed dns resolution to get a set of IPs rather than a list. (#17358)

* npm: Changed dns resolution to get a set of IPs rather than a list.

* Reduce allocated space, for the average case

* Fix potential use of uninitialized memory (#17490)

This fixes potential use of uninitialized memory when PyList_GetItem
returns NULL.

This code path is impossible to hit in practice with the current
versions of Python, as long as the object is a list and index is in
bounds, which is ensured by the prior call to PyList_Size. These
functions do not use the Python sequence protocol, so evil Python code
cannot supply an incorrect length or throw an unexpected exception
either.

* Bump github.com/stretchr/testify from 1.8.2 to 1.8.4 in /pkg/gohai (#17363)

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.2 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.2...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow snapshot to fail (#17386)

* add JSON decoder for activity dumps (#17444)

* add activity tree stats in activity dump list command (#17369)

* fix secprofile unstable guards (#17509)

* use the remote storage from a command line (#17525)

* Adding shared pool monitoring for Oracle databases (#17360)

* finished

* release notes

* relnotes correction

* review

* Adding more sysmetrics to Oracle monitoring  (#17466)

* both metrics

* send only 60s interval for sysmetrics

* latency & refactoring

* bg cpu usage

* indexes

* io

* more metrics

* new metrics

* metrics

* completed

* completed

* release notes

* disable cursor cache hit ratio

* Revert "[CWS][SEC-3735] Check self tests results in e2e tests (#17387)" (#17526)

This reverts commit e83efae53b0554a45557fa1b5f3e7e2f378502f4.

* MetricSecurityProfileAnomalyDetectionGenerated tracks the number of generated anomalies (#17462)

* [CWS] fix race when playing snapshot process data (#17527)

* AP-2099 Prevent jobs that trigger child pipelines to download artefacts (#17117)

* Fix broken loop (#17534)

* Report conntrack ebpf module loading telemetry (#17539)

* [fakeintake] add godoc (#17474)

* [fakeintake] add godoc

* [e2e] fix test example

* [fakeintake] add helpers to client to get payload names

* [fakeintake] move s in api.Payload inside doc link

* [e2e] bump test-infra to 20230607221957

* [e2e] add logs example

* [e2e] fix test-infra version

* [e2e] remove unused config file

* usm: process monitor: Call heavy operation only if needed (#17457)

* usm: process monitor: Call heavy operation only if needed

From now on, we scan already-running processes if and only if there are registered exec callbacks.
Furthermore, we maintain 2 atomic booleans to indicate whether any exec or exit callbacks are registered; if there are
none, we avoid acquiring the mutex.

* Added documentation

* Removed field
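
The fast path described in this commit can be sketched roughly as below, assuming Go 1.19+ for `atomic.Bool`. The `monitor` type and method names are hypothetical, chosen for illustration rather than taken from the process monitor.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// monitor is a hypothetical, simplified process monitor.
type monitor struct {
	hasExec atomic.Bool // true once any exec callback is registered
	hasExit atomic.Bool // true once any exit callback is registered

	mu            sync.RWMutex
	execCallbacks []func(pid int)
	exitCallbacks []func(pid int)
}

func (m *monitor) SubscribeExec(cb func(pid int)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.execCallbacks = append(m.execCallbacks, cb)
	m.hasExec.Store(true)
}

func (m *monitor) SubscribeExit(cb func(pid int)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.exitCallbacks = append(m.exitCallbacks, cb)
	m.hasExit.Store(true)
}

// handleExit returns before touching the mutex when nobody listens.
func (m *monitor) handleExit(pid int) {
	if !m.hasExit.Load() {
		return
	}
	m.mu.RLock()
	defer m.mu.RUnlock()
	for _, cb := range m.exitCallbacks {
		cb(pid)
	}
}

// scanRunning performs the heavy scan only if exec callbacks exist and
// reports whether the scan ran.
func (m *monitor) scanRunning(pids []int) bool {
	if !m.hasExec.Load() {
		return false
	}
	m.mu.RLock()
	defer m.mu.RUnlock()
	for _, pid := range pids {
		for _, cb := range m.execCallbacks {
			cb(pid)
		}
	}
	return true
}

func main() {
	m := &monitor{}
	fmt.Println("scan ran:", m.scanRunning([]int{1, 2}))
	m.SubscribeExec(func(pid int) { fmt.Println("exec seen for", pid) })
	fmt.Println("scan ran:", m.scanRunning([]int{1, 2}))
}
```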

* Update java integration tests to use latest layers. (#17194)

* Add workaround for database connection loss (#17486)

* implemented

* release notes

* Update releasenotes/notes/connection-loss-workaround-c457738d985fda2a.yaml

Co-authored-by: Austin Lai <[email protected]>

* Update pkg/collector/corechecks/oracle-dbm/oracle.go

Co-authored-by: Alexandre Normand <[email protected]>

* removed comments

* corrected syntax errors after merging

---------

Co-authored-by: Austin Lai <[email protected]>
Co-authored-by: Alexandre Normand <[email protected]>

* [CWS] remove load controller (#17220)

* [CWS] rework secprofile warmup tests (#17377)

* (rcm) simplify the RC thin client (#17468)

* (rcm) simplify the RC thin client

* simplify listeners as well

* fix apm and security agent

* fix cws profiles

* fix apm client

* CWS: sync BTFhub constants (#17550)

Co-authored-by: paulcacheux <[email protected]>

* https java tests use local https server (#17067)

https java tests use local https server

* [CWS] revert snapshot event playing  (#17553)

* [CWS] do not play snapshot for now

* remove test

* deprecate more usm values (#17342)

* Fixed bug in configuration

* usm: Deprecated system_probe_config.http_map_cleaner_interval_in_s in favor of service_monitoring_config.http_map_cleaner_interval_in_s

* usm: Deprecated system_probe_config.http_idle_connection_ttl_in_s in favor of service_monitoring_config.http_idle_connection_ttl_in_s

* usm: Deprecated network_config.http_notification_threshold in favor of service_monitoring_config.http_notification_threshold

* usm: Deprecated network_config.http_max_request_fragment in favor of service_monitoring_config.http_max_request_fragment

* usm: Added releasenotes

* Fixed file name linter

* Addressed CR comments

* usm: Use apply default

* Fixed test

* added missing import

* Fixed imports

* Adds DD_RESOURCE_GROUP and DD_SUBSCRIPTION_ID to env vars (#17558)

* rtloader: Use execinfo only on glibc (#15256)

Use execinfo only on glibc.
Functions in execinfo.h are GNU extensions and not available on other C libraries like musl.

We used to use libexecinfo package (A quick-n-dirty BSD licensed clone of the GNU libc backtrace facility.) of Alpine Linux to build datadog-agent on Alpine, but it has been removed since Alpine 3.17.
This PR allows building datadog-agent on Alpine Linux and other non-glibc environments.

* Remove a no longer used SBOM check config parameter (#17405)

* Adjust default value for Oracle check interval (#17551)

* adapted the default value

* reverted

* changed default in the factory

* remove init in config

* Add new invoke task to test buildimage update (#17241)

* Add new invoke task to test buildimage update

* Use new utils method in invoke task and more tests

* Bump emoji from 2.2.0 to 2.4.0 in /test/e2e/cws-tests (#17425)

Bumps [emoji](https://github.com/carpedm20/emoji) from 2.2.0 to 2.4.0.
- [Release notes](https://github.com/carpedm20/emoji/releases)
- [Changelog](https://github.com/carpedm20/emoji/blob/master/CHANGES.md)
- [Commits](https://github.com/carpedm20/emoji/compare/v2.2.0...v2.4.0)

---
updated-dependencies:
- dependency-name: emoji
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/itchyny/gojq from 0.12.12 to 0.12.13 (#17442)

Bumps [github.com/itchyny/gojq](https://github.com/itchyny/gojq) from 0.12.12 to 0.12.13.
- [Release notes](https://github.com/itchyny/gojq/releases)
- [Changelog](https://github.com/itchyny/gojq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/itchyny/gojq/compare/v0.12.12...v0.12.13)

---
updated-dependencies:
- dependency-name: github.com/itchyny/gojq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CWS] remove unused arg from `fill_exec_context` (#17579)

* chore(gohai): update gopsutil/v3 to 3.23.2 (#17500)

* mount docker socket to dev container (#17385)

* add semver to requirements.txt (#17384)

* [DCA][Autodiscovery] Add more context to error log (#17464)

* USMO-259 - Support Java Async frameworks (#16346)

* - added support for java Async frameworks ioctl messages

* changed java tls structs and maps to support async instrumentation's messages

* refactored names and cleaned debug logs

* files location and names refactoring

* more code documentation

* - refactored java tls to work with tail calls

* - initializing connection_by_peer_key on stack, after the split to tail calls, we don't reach the stack limit anymore

* fixed compilation and verifier errors on 4.14

* fixed compilation error

* added java tls tail-calls to undefined probes list

* - removed unused map

- fixed the check if java tls is working before adding the tail calls

* - fixed the check for enabling java tls program

* added java tls tail calls to exclude list for shared_libraries_test.go

* fixed error in a previous commit conflict merge

* [PROC-2913] Create protobuf definitions for process workload stream server (#17497)

* Create proto definitions

* Update workloadmeta.proto

* Build proto files

* Add `eventId` field

* Apply moises' suggestions

* [RCM] Fix rc config deletion (#17581)

* Fix rc config deletion

* Cleanup

* Add test

* bump `ebpf-manager` to latest (#17585)

* bump `ebpf-manager` to latest

* `inv -e generate-licenses`

* [Gohai][ASC-471] implement cpu collection using sysctl syscall (#17556)

* feat(gohai): implement cpu collection using sysctl syscall

* Update release note

* Update releasenotes/notes/gohai-darwin-cpu-native-a931acf4d9d543ae.yaml

Co-authored-by: Heston Hoffman <[email protected]>

---------

Co-authored-by: Heston Hoffman <[email protected]>

* Add tests to CI (#17541)

* [USM] don't flood logs when a process is not java (#17590)

[Debug] java pid 26055 attachment rejected

* Fix the formatting for debug log in SetAgentMetadata (#17382)

* [process-agent] Create WorkloadMetaExtractor v1 (#17448)

* Create wlm extractor

* Initial workloadmeta changes

* Add extractor tests

* Fix import cycle

* Add some tracing for QA

* Fixed an edge case where the map key != proc.pid

* Add tracing for QA

* Add release note

* Added caching and produce events

* Apply guy's suggestion and check in `grpc.go`

* Update pkg/languagedetection/languagemodels/types.go

Co-authored-by: Guy Arbitman <[email protected]>

* Update pkg/process/metadata/workloadmeta/grpc.go

Co-authored-by: Guy Arbitman <[email protected]>

* Fix linter errors

* Update create-wlm-extractor-e408e2826cc77be8.yaml

Removed comments

* Update pkg/process/metadata/workloadmeta/workloadmeta.go

Co-authored-by: Moisés Botarro <[email protected]>

* Update pkg/process/metadata/workloadmeta/extractor_test.go

Co-authored-by: Moisés Botarro <[email protected]>

* Address comments

* add debug log on instantiation

* apply suggestions

* Add benchmark for sprintf vs itoa

* Fix flaky test

* Add trace log and fix comment

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Moisés Botarro <[email protected]>

* [usm] Add ability to report payload telemetry (#17544)

* [usm] Add ability to report payload telemetry

* Require USM payload telemetry to be explicitly declared

* Rename `OptTelemetry` to `OptPayloadTelemetry`

* Add unit test

* Update the `test-infra-definitions` dependency in `test/new-e2e` (#17566)

* Revert "[usm] Improve `incompleteBuffer` (#17164)" (#17593)

This reverts commit a0481de26a4a68c1e6fb228294774e8916f37943.

* DD_SERVICE_MAPPING in extension (#17189)

* DD_SERVICE_MAPPING in extension

* lint

* release note

* edit release note

* make DD_SERVICE_MAPPING src code split up into smaller parts for easier testing, fix tests, leverage config pkg

* gofmt

* add serverless prefix

* Update releasenotes/notes/serverless-DD-SERVICE-MAPPING-594cc2cb7d090473.yaml

Co-authored-by: Ursula Chen <[email protected]>

* trigger ci

* cover same key and value, add more bad input tests

* add new test cases

* format

---------

Co-authored-by: Ursula Chen <[email protected]>

* Improves python check docs to use virtualenv and sort out PYTHONPATH when needed (#17569)

* Adds docs to use virtualenv and sort out PYTHONPATH when needed

* Adds feedback from PR comments

* Adds note about needing -p arg for virtualenv

* Bump github.com/hashicorp/golang-lru/v2 in /pkg/security/secl (#17599)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CWS: sync BTFhub constants (#17608)

Co-authored-by: paulcacheux <[email protected]>

* Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 (#17501)

Co-authored-by: Florent Clarret <[email protected]>

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl (#17600)

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.8.0 to 0.9.0.
- [Commits](https://github.com/golang/sys/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix username generation on windows (#17547)

* [USM] tests RunDockerServer/RunHostServer log pid (#17587)

* [pkg/netflow] Collect `flow_process_nf_errors_count` metric from goflow2 (#17460)

* Collect flow_process_nf_errors_count goflow2 metric

* Add release note

* [CWS] remove unused mount group id field (#17222)

* [CWS] remove unused mount group id field

* update docs

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs (#17487)

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs

* add missing increment

* [AP-2139] Add amazonlinux2023 to the kitchen tests (#17548)

* Add amazonlinux2023 to the kitchen tests

* use 2023 as default to test as branch build launches only default

* [workloadmeta][process] Bootstrap process entities in workloadmeta (#17327)

* [CSPM] Make sure we do not create zombie processes in our tests (#17609)

* [CSPM] Make sure we do not create zombie processes in our tests

* add more details in case of test failure

* feat: support provisioned concurrency and proactive initialization (#17014)

* feat: support provisioned concurrency and proactive initialization

* feat: Set init time at beginning of agent init, unless that would somehow overlap the lambda start span

* feat: Use real time object until we need to convert

* feat: Use platform.initStart to begin the cold start span

* feat: specs

* fix: Update logs_test

* Fix: Set init time on EC

* feat: No string interpolation if we don't need it

* feat: Fix ecs proactiveInit boolean so it turns off after first invocation

* feat: Fix logs collector tags so we don't over-tag here

* feat: fmt

* feat: Add logs test for Proactive Initialization

* Bump snowflake-connector-python to 3.0.4 (#17445)

* [CWS] remove use of internal pointer (#16731)

* Include AAS metadata in span tags (#17591)

Currently, customers need to manually set the environment variable DD_AZURE_APP_SERVICES=true in order to make the traces include AAS metadata in tags. This change will eliminate that step and include these tags by detecting if we're on AAS. The PR also removes the old logic which adds these tags based on the environment variable DD_AZURE_APP_SERVICES.

* [RCM] Add rc client in flare (#17094)

* Add rc client in flare

* Add rc listeners

* Change constructor

* Fix AGENT_TASK read

* Cleanup

* Fix lint

* Address reviews

* Add release note

* Fix CI

* Fix CI

* Add mutex

* Address review

* Address review

* [secrets][tests] properly reset secrets backend timeout after test (#17614)

* [CWS] fix prerm scripts error logs (#17383)

* Handle missing result json file (#17537)

* [CWS] move arithmetic secl test to the secl package (#17610)

* [CWS] decouple a bit AD/Profile from probe (#17131)

* [CWS] cleanup runner before running btfhub sync job (#17629)

* cleanup runner before running btfhub sync job

* bump setup-go and remove cache step (included in v4)

* CWS: sync BTFhub constants (#17633)

Co-authored-by: paulcacheux <[email protected]>

* [corechecks/snmp] Refactor Profile Config (#17618)

* [CWS] rework secprofile tryAutolearn (#17535)

* Rework secprofile autoLearn func (including 2 fixes), and add 44 unit tests around it

* Fix go lint

* Apply review suggestion

* [Fix] Agent version cache not correctly loaded in multiple CI jobs (#17606)

* http2: remove packed enum values (#17586)

Signed-off-by: Guillaume Pagnoux <[email protected]>

* Make `nettop` available (#17458)

* [CWS] support kernel with usernamespaces arguments for security functions (#17634)

* remove unused function

* remove unused function

* PoC support new userns arg

* constantify the argument position selection

* fix same name issue

* horrible hack to pass the verifier

* [CWS] add unknown source for process entry (#17636)

* [CWS] update fallback constants for recent kernels (#17639)

* fix bpf map id constant

* fix bpf map name offset

* fix bpf prog aux name offset

* move `kitchen_test_dummy_job_tmp` to k8s runners (#17641)

* [gitlab] Migration of unit tests CI jobs to k8s Gitlab runners (#17179)

Requires https://github.com/DataDog/datadog-agent-buildimages/pull/370 first.

This PR:
- updates the Linux build images used in the `datadog-agent` Gitlab CI pipelines to images that do not have an entrypoint script (required because our k8s Gitlab runner infrastructure overwrites the entrypoint of images, therefore we can't rely on it being run)
- updates all relevant CI scripts to run `source /root/.bashrc` at the very beginning, since this is not run in the entrypoint anymore
- updates all jobs in the `setup`, `deps_fetch`, `source_test`, `binary_build` stages to run on k8s runners instead of classic runners
- updates container-related unit tests to work when run in a k8s environment (thanks @L3n41c, cc @DataDog/container-integrations)
- skips a few gohai and gohai-related metadata unit tests that are failing on the arm64 rpm runner because `df` doesn't work in this specific setup, for reasons that remain to be investigated (cc @DataDog/agent-shared-components)
- adds a way to specify concurrency for `golangci-lint` invocations (see https://github.com/DataDog/datadog-agent/pull/15722 and https://github.com/DataDog/datadog-agent/pull/15762)
- fixes the `package_dependencies` jobs in the `kernel_matrix_testing` stage, which weren't using the correct `BUILDIMAGES_SUFFIX` variable

Co-authored-by: Lénaïc Huard <[email protected]>

* [gitlab] Migrate docker publish jobs to k8s runners (#17270)

Migrates docker publishing jobs to the new Kubernetes-based runners.

The docker build jobs were migrated in #15511, but the publishing jobs are still using old runners.

* Add mutex to runtime settings (#17640)

* Process BTF archive nightly (#17621)

* Minor fixes to system-probe (#17622)

* Use correct module name in restart command

* Proper PingTCP/PingUDP cleanup

* [CWS Agent] RC rules override local rules if IDs conflict (#17573)

* reverse order of policy loading

* adding PolicyProviderType consts

* move enforcement of policy provider loading into a testable func
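
A minimal sketch of the "remote config wins on ID conflict" behavior described by this change, assuming a first-seen-wins merge over providers listed in priority order. The `rule` type, the `mergeRules` helper, and the source labels are hypothetical names for illustration and are not taken from the Agent code.

```go
package main

import "fmt"

// rule is a hypothetical, simplified rule definition.
type rule struct {
	ID     string
	Source string
}

// mergeRules keeps the first rule seen for each ID, so providers passed
// earlier take precedence over providers passed later.
func mergeRules(providers ...[]rule) []rule {
	seen := make(map[string]bool)
	var merged []rule
	for _, rules := range providers {
		for _, r := range rules {
			if seen[r.ID] {
				continue // an earlier (higher-priority) provider already defined this ID
			}
			seen[r.ID] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	remote := []rule{{ID: "r1", Source: "remote-config"}}
	local := []rule{{ID: "r1", Source: "file"}, {ID: "r2", Source: "file"}}
	// Passing the remote provider first makes its rules win on conflicting IDs.
	for _, r := range mergeRules(remote, local) {
		fmt.Println(r.ID, r.Source)
	}
}
```

Under this assumption, the order in which providers are passed is the only thing that decides which definition survives, which is why a change in loading order is enough to flip the precedence.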

* pkg/flare: add missing APM variables to envvars (#17597)

This PR adds all APM environmental variables currently being used by 
the agent to the flare. Previously, some variables were missing and so
their values would not be represented when producing a flare.

* [Serverless] Use prebuilt opentelemetry lambda layers in integration tests. (#17568)

* Enable auto-instrumentation for python integration test.

* Update snapshot for otlp-python.

* Linting.

* Sort values of tag _dd.tags.container.

* Add encoding info to tailer info for the agent status verbose page (#17533)

* add encoding info to tailer info for the agent status verbose page

* push encoding information straight into tailer info

* moving adding tailer info to parser instead

* NIT

* [usm] Intern Kafka topic names (#17648)

* Add benchmark

* Intern topic name strings

* Fix data synchronization
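
As an illustration of what interning topic names means, here is a minimal sketch of a mutex-guarded string interner. It is an assumption about the general technique, not the implementation from this PR; the `interner` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// interner is a hypothetical, minimal string interner: it returns one shared
// string value for each distinct byte sequence it is given.
type interner struct {
	mu     sync.Mutex
	values map[string]string
}

func newInterner() *interner {
	return &interner{values: make(map[string]string)}
}

// Intern returns the canonical string for b, storing it on first use.
// The mutex makes concurrent calls safe.
func (i *interner) Intern(b []byte) string {
	i.mu.Lock()
	defer i.mu.Unlock()
	// Looking up with string(b) directly as the map key lets the Go
	// compiler skip a temporary allocation when the entry already exists.
	if s, ok := i.values[string(b)]; ok {
		return s
	}
	s := string(b)
	i.values[s] = s
	return s
}

func main() {
	in := newInterner()
	a := in.Intern([]byte("orders"))
	b := in.Intern([]byte("orders"))
	fmt.Println(a == b, len(in.values))
}
```

The benefit comes from repeated topic names: each distinct name is allocated once and shared afterwards, instead of being copied out of the payload buffer on every request.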

* config/apm: fix parsing DD_APM_FEATURES (#17630)

Support either "," or " " as separator when parsing the value of DD_APM_FEATURES. It fixes a regression introduced in #15904 which changed the separator from comma to space. This was a breaking change. From 7.44 to 7.46 using a space as separator was suggested as a workaround, this PR ensures we don't break compatibility again. We now support either space or comma.
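
A minimal sketch of accepting both separators, assuming the value is split on any comma or space and empty entries are dropped. `splitFeatures` is a hypothetical helper for illustration, not the function used in the Agent's config package.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFeatures is a hypothetical helper: it splits a DD_APM_FEATURES-style
// value on commas and spaces, dropping empty entries.
func splitFeatures(raw string) []string {
	return strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == ' '
	})
}

func main() {
	// The feature names below are placeholders.
	for _, v := range []string{"feat_a,feat_b", "feat_a feat_b", "feat_a, feat_b"} {
		fmt.Printf("%q -> %q\n", v, splitFeatures(v))
	}
}
```

Treating both characters as separators means values written for either the old (comma) or the interim (space) convention parse to the same list.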

* [CWS] do not handle broken lineage during snapshot (#17624)

* Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)

This reverts commit 970566077529b17147e7c74129344c7b35f766b3.

* [CWS] Improve tryAutolearn unit tests by making fake events to have a valid lineage (#17657)

* [CWS] fix overlayfs inode read on kernel 5.19 and higher (#17644)

* dbg output

* xfs hacky solution to go around the 300MB limit

* PoC test fix overlayfs

* pipe constant param to select the lower inode selection

* implement kernel version check for feature detection

* small fix

* implement function probing based detection

* apply suggested review changes

* Revert "Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)" (#17658)

This reverts commit 1ad88863157f73078068eab05bda4ea00ddadb58.

* Report config mutation events from the agent

* Create initial config for DDQA (#15675)

* Create initial config for DDQA

* Adding USM

* Adding NDM

* Update config.toml

* add ebpf-platform

* [tools] Adding ASC team to DDQA initial config

* Fixing USM jira project

* Add Agent Platform

* Add platform integrations team

* Add APM

* Add Remote Config

* Add Container-[Integrations-Ecosystems]

* Add Security And Compliance Agent

* Add Processes

* Change RCM issue type to QA

* Add Windows Agent

* [container-app] add qa team metadata

* Network Performance Monitoring

* Update .ddqa/config.toml - Database Monitoring

* Add Windows Kernel Integrations

* add final team Agent Integrations

* Use QA Task for Windows Agent

* [tools] Update task type for agent-shared-components QA issue generation

We now have a new task type for ASC QA operations that should be used

* final update

https://github.com/DataDog/ddqa/pull/13

* rename top-level option

* remove `changelog/no-changelog` as an ignored label

* [ddqa] exclude members for team ASC

* ddqa: AML exclude_members.

* Updating excluded devs for CI team

* Add CSPM Agent

* Exclude olivielpeau from Agent Platform QA

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Florian Veaux <[email protected]>
Co-authored-by: Alexander Nicholas Costas <[email protected]>
Co-authored-by: Bryce Kahle <[email protected]>
Co-authored-by: Srdjan Grubor <[email protected]>
Co-authored-by: Kylian Serrania <[email protected]>
Co-authored-by: Sarah Witt <[email protected]>
Co-authored-by: Katie Hockman <[email protected]>
Co-authored-by: Baptiste Foy <[email protected]>
Co-authored-by: Cedric Lamoriniere <[email protected]>
Co-authored-by: Paul Cacheux <[email protected]>
Co-authored-by: Moises Botarro <[email protected]>
Co-authored-by: Julien Lebot <[email protected]>
Co-authored-by: fisherevans <[email protected]>
Co-authored-by: Lee Avital <[email protected]>
Co-authored-by: Joel Marcotte <[email protected]>
Co-authored-by: Rich Lancia <[email protected]>
Co-authored-by: Pierre Gimalac <[email protected]>
Co-authored-by: Remy Mathieu <[email protected]>
Co-authored-by: Kacper <[email protected]>
Co-authored-by: David du Colombier <[email protected]>
Co-authored-by: Alexandre Menasria <[email protected]>

* dump silent workloads (#17412)

* [CWS] constantify `vm_flags` access in `vm_area_struct` (#17662)

* constantify `vm_flags` access in `vm_area_struct`

* re-gen constants

* regression detector: change baseline variant from latest main to merge base (#17449)

* regression detector: add compute merge base job

We need to compute the merge base of a non-`main` branch with respect
to `main` in order to establish the commit SHA of a baseline variant
in regression detection. This commit adds a job to compute that merge
base, echo the result to a file (sans newline), and then upload that
result to the Single-Machine Performance S3 bucket for Agent team.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: note build step we may need

The existing Single-Machine Performance (SMP) regression detector
setup doesn't build additional containers -- it just publishes
containers already built in Agent CI to SMP's ECR for Agent
images. I've written the new "merge-base" container assuming we can
continue with that strategy, but if we can't continue with that
strategy for some reason, then this commit includes a bunch of
comments sketching out a backup plan that would build the container we
would need and publish it to SMP's ECR for Agent images.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: sketch functional test tweaks

As in the spirit of the previous commit, this commit adds some
comments on how to transition the existing setup in
`functional_test/regression_detector.yml` from using the latest commit
on `main` to using a "merge base" baseline SHA.

There are a few parts of this commit that are actually functional, but
shouldn't functionally alter the existing regression detector output
-- it still uses a "latest `main`" baseline SHA -- but it does
introduce the idea of getting the new baseline SHA from an artifact in
a previous stage, rather than uploading it to S3 (although I also
upload the merge base baseline SHA to S3 as well, as a contingency
plan).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: use literal for image

I thought using a `!reference` tag would work for specifying an image
for the merge-base-computing job, but that mental model may be wrong,
so this commit replaces that tag with the literal it's supposed to
(de)reference.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: comment out merge base job

The merge base job isn't running because no runners are configured to
run it. This issue could be one that I can't resolve, so it could be
that I need to get permissions to run it. For now, I'll comment out
this job and instead try and get the information necessary to compute
a baseline as part of the regression detector job, but without
otherwise altering the regression detector behavior.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: compute merge base

As a test of whether I can compute the merge base using a `git fetch`
command, this commit adds that command (and other supporting commands)
to the regression detector job to see if these commands will work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: add container build notes

I've managed to figure out how to get the most important case to work
-- the one in which the base branch of the pull request (i.e., in
GitLab terms, the target branch of the merge request) is
`main`. Having managed to get this case to work, this commit updates
my implementation notes to reflect how to implement the case when the
base branch of the pull request *is not* `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: set baseline to merge base

It looks like creating a separate job (at least in
`.gitlab/container_build/docker_linux.yml`) will not work because
GitLab won't run it in a runner. This behavior may be a configuration
setting for security reasons, and may require repo admin privileges to
change. Instead, this commit implements setting the baseline SHA to
the merge base, along with some commands to abort the regression
detector job if the base branch of the pull request (target branch of
the merge request) is not `main`, and copies the merge base SHA to S3
for debugging/auditing purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove obsolete comments

Now that I'm pretty sure I've figured out how to change the baseline
SHA to the merge base of a pull request, this commit deletes most of
the commented-out lines I introduced into
`.gitlab/functional_test/regression_detector.yml` as notes to
myself. These comments are no longer necessary for record-keeping.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: correct line comment

This commit corrects a line comment in which I write "Use this line if
comparing `main` against itself makes sense" when I should have
written "Use this line if comparing `main` against itself does not
make sense".

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: make if block a one-liner

GitLab apparently doesn't echo multiline commands by default, so this
commit rewrites this multiline `if` block as a one line command for
debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove backticks

I'm pretty sure that the "command `main` not found" message I saw was
because the shell was likely interpreting those backticks as running a
command. This commit replaces the backticks with single quotes to
avoid having the shell attempt to run a command called `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: disable base branch check

This commit temporarily disables the check for the base branch because
it doesn't quite seem to work at the moment, and it adds enough
debugging output so I can get some idea of why that check does not
work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: add artifacts + cleanup

This commit cleans up the last bits of debugging comments and code in
`.gitlab/functional_test/regression_detector.yml`, plus it adds the
reporting outputs as artifacts to provide additional diagnostic output
for debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: remove comment cruft

This commit removes a large comment block I put in
`.gitlab/container_build/docker_linux.yml` because I no longer think
it's a good idea to compute the merge base commit in a job separate
from the regression detector job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: compute baseline via loop

The previous regression detector baseline SHA computation assumed that
an image exists in ECR for the commit returned by `$(git merge-base
HEAD main)`, *i.e.*, the merge base with `main`. However,
the container associated with that commit may not exist because that
container may have failed to build, or may have failed to upload to
ECR, so we need to check whether that container exists in ECR. If that
container exists, then the merge base with `main` is the baseline
SHA. If not, then we must iterate over predecessor commits in `main`
until we find a commit in `main` for which a container exists in
ECR. The first commit we find in this loop becomes the baseline SHA
for the regression detector.

This commit implements that check, along with a loop that iterates
over predecessor commits, if necessary, as described above. The
implementation is currently a rather verbose one-liner because
single-line statements are easier to debug in GitLab CI. Once this
implementation succeeds, a subsequent commit will clean up the
implementation by changing it to a multi-line statement, after which
this branch will be ready for review.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>
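
The baseline selection described in the commit above can be sketched as follows. This is an illustration only: the job itself is a shell one-liner in GitLab CI, and here the ECR lookup is replaced by a hypothetical `imageExists` callback, with the commit list assumed to be ordered from the merge base backwards through its predecessors on `main`.

```go
package main

import (
	"errors"
	"fmt"
)

// findBaseline returns the first SHA in commits for which imageExists
// reports true. commits is assumed to start at the merge base and continue
// with its predecessors on main, newest first.
func findBaseline(commits []string, imageExists func(sha string) bool) (string, error) {
	for _, sha := range commits {
		if imageExists(sha) {
			return sha, nil
		}
	}
	return "", errors.New("no commit on main has a published image")
}

func main() {
	// Hypothetical data: the merge base "c3" has no image, its parent "c2" does.
	commits := []string{"c3", "c2", "c1"}
	published := map[string]bool{"c2": true, "c1": true}

	sha, err := findBaseline(commits, func(s string) bool { return published[s] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("baseline SHA:", sha)
}
```

Returning an error when no image is found keeps the "nothing to compare against" case explicit instead of silently falling back to an arbitrary commit.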

* regression detector: reduce stderr noise

The initial implementation of the container existence check turns out
to be pretty noisy when a container doesn't exist because `awscli`
will output a bunch of error information on failure. While this
information is helpful for debugging purposes, it will be annoying in
CI, so this commit redirects the `stderr` of that command to
`/dev/null` to reduce noise in the CI output of the regression
detector CI job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: fix copy before write error

The regression detector job in the Agent CI pipeline currently fails
because it attempts to copy a file that doesn't exist to S3. This
commit fixes that error by moving the copy command to a point after
the file is generated.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: tweak ecr query for debugging

For some reason, the `aws ecr describe-images` command is not finding
images that I know exist. To aid in debugging at the expense of noise,
this commit removes redirection of `stderr` to `/dev/null`, adds the
`--registry-id` flag to be more explicit about ECR repo location, and
also moves the `--profile` flag and its argument to a more readable
location.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove "latest main" job

The baseline-SHA-computing steps in the
`single-machine-performance-regression_detector` job make computing
the commit of "latest `main`" unnecessary, so this commit deletes the
job that does that computation.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove loop cleanup comment

Due to falling behind on Agent performance investigations, I don't
think I'm going to get to clean up the long, one line loop statement,
so this commit deletes that aspirational comment -- it can be cleaned
up in a subsequent pull request.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: pin pr-commenter version

This commit attempts to triage the `pr-commenter` errors by pinning
the `pr-commenter` version in a fashion similar to that used by
other Datadog repositories (e.g., `dogweb`).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: delete more stale comments

For now, I don't plan to move the baseline SHA computation into a
separate job, although that may happen later. Given this change of
plans, this commit removes the stale comment regarding refactoring the
baseline SHA computation to a separate job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

---------

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* stop dumping workloads with a stable event type (#17536)

* [CWS] run functional tests on al2023 (#17612)

* run CWS functional tests on al2023

* skip the SELinux/sel_disable test on al2023

* fix sizeof_inode and tty_offset fallback constants on al2023 kernel 6.1

* fix import

* skip docker rc tests on al2023

* fix `vm_area_struct_flags_offset` on al2023

* Run system-probe tests using kernel matrix testing scenario (#16406)

* system-probe-test_spec.go added

* compile system-probe-test_spec and add to dependencies

* fix base image

* move test spec into test dir

* comment out wip

* use x64 images for packaging dependencies

* print outputs json

* save output ips to file

* print stack.outputs

* fix typo

* flush buffer to file

* move stack.outputs to CI_PROJECT_DIR

* connect to microvm

* fix gitlab yaml error

* use amis available in build-stable

* fix invoke command

* define AWS_REGION env var

* use kernel matrix testing image

* fix build tags

* use ssh private key to ssh

* explanatory comment

* fix path export

* save ssh key file to CI_PROJECT_DIR

* get ssh key info

* define AWS_SSH_KEY variable

* turn off StrictHostKeyChecking

* fix typo

* set ssh key file perms

* verbose ssh output

* change ssh file name and use the same name

* fix aws ssh key name

* change ssh key name

* try with root user

* change ssh user name back to 'ubuntu'

* set BatchMode to prevent passphrase query

* set owner to user 1000

* some debugging

* fix ssh key perms

* add new line to ssh key file

* turn off host key verification when connecting with micro-vm

* use dedicated pulumi image

* run init script

* fix micro-vm-init.sh script path when scp'ing

* wrap inner ssh command in quotes

* pass arch to microvm init

* give full path of dependencies

* copy all shared dependencies and strip-components when extracting

* use go version 1.19 and run tests

* cross compile

* fix dir path of system-probe-tests

* retrieve go deps

* tidy-all

* use new tenant

* print .aws/config file

* use string replacer

* use exact s3 bucket as new-e2e

* use k8 runners for env job

* remove security groups and subnets, and change ssh key param name

* set correct env name

* add copyright

* fix micro-vm-init.sh and move it in the dependencies

* fix yaml syntax

* make GOVERSION string

* add agent-qa amis

* set GOARCH as env var

* change ssh key name

* try without key pair name

* use pulumi image

* reintroduce key pair config value

* change docker image tag

* try agent-ci-sandbox as ssh key name

* remove printing of aws_config

* add default key pair name: datadog-agent-ci

* update amis

* update amis

* update vmconfig. Run only am64

* fix arm64 package name

* update arm64 ami id

* fix typo in yaml

* add go_deps dependency

* disable arm64 distros for now

* add dummy provider for copy file

* fix command provider init

* provider args fix package

* unique name for command provider

* add go_deps as dependency to test job

* set go_tools_dep, and switch to new profile in tests

* use pulumi docker image

* add debug logs of ssh keys

* retry if failed to dial libvirt

* add comment and log

* tidy all

* make it possible to easily run scenario in dev machine

* fix dependency fetching step

* easier to launch in dev machine and do not use sudo

* fix infra-env in env setup job

* lint python

* fix ssh key name

* incorrect image tag introduced

* reintroduce custom arm64 kernels

* exec perms on micro-vm-init.sh

* apply suggestions

* avoid printing out ssm read to debug output

* use script to setup dependencies before tests

* fix config

* fix function name

* remove shutdown command as it is moved in scenario, and cleanup path names

* use fallback of env vars

* simplify output streaming

* shutdown period must be int

* create and download junit and testjson files

* set up testjson and junit package in init script

* exit with system-probe test exit code

* fix shutdown option

* small fixes

* pass current environment to cmd

* fix junit and testjson download

* add env for DD_SYSTEM_PROBE_JAVA_DIR in new test spec

* fix delete path of junit and testjson package

* fix NewRunner after test-infra-definitions update

* pass arch to fetch_dependencies

* return error code of tests

* add '$' to resolve INSTANCE_IP

* Simplify micro-vm-init.sh IP address command

* Minor improvements

* Add timeouts

* Handle errors in outputsToFile

* Update AMIs

* Update to handle bundle-less test run

* Fix minor nits

* Update x86_64 AMI

* Update arm64 AMI

* add kvt to on_system_probe_changes_or_manual

* allow_failure for kvt jobs

* run all tests despite errors

* change stack name

* Fix json glob pattern

* update test-infra-definitions

* remove custom kernels and use distribution images

* remove custom kernels from vmconfig

* run tests only on x86_64

* update test-infra-definitions

* add arm64 tests back in

* fix variable replacement bash

* update ami

* Collect failed tests and output them at the end

* Output kernel release

* fix BTF_DIR path

* always color output if supported

* Use GO_VERSION from runner docker image

* force color

* only output FAIL if test has a name

* Do not return error if we only have failed tests

* do test json review at end to properly fail job

* Fix binary path

* specify binary name and main.go path

* Add missing end single quote

* Force color in review too

* add cleanup job

* provide secret sudo-password

* make kmt job manual

* revert back to running job on system-probe changes

* run dummy job on system-probe changes

* update test-infra-definitions

* fix grep

* fix grep for pattern '-instance-ip'

* set empty path when no private key path provided, to prevent pulumi from attempting to read ssh key file by guessing path

* change instance types to c6i.metal for intel and c6g.metal for arm

* split arm64 and x86 kmt tests

* fix parallel test runs

* fix dependencies in cleanup job

* fail if retrying tests

* run python linter

* use storage optimized instances for both intel and arm

* use compute optimized instance for arm

* allow all kernel_matrix_testing jobs to fail

* make kmt jobs manual

* change arm instance type to m6g and x86 to m5

---------

Co-authored-by: Bryce Kahle <[email protected]>

* Expose agent telemetry on system-probe UDS (#17652)

* [new-e2e] use standard-verbose format when verbose is True (#17660)

* add missing filter_tag envvars to config (#17653)

This commit adds the filter_tag envvars (DD_APM_FILTER_TAGS_REQUIRE
and DD_APM_FILTER_TAGS_REJECT) to the config_template.yaml file.
Previously, these envvars were not represented with the filter_tag 
parameter, so adding these in can improve documentation.

* [serverless] add `peer.service` to inferred spans (#17414)

* add `peerService` constant

* add `peerService` in `Span.Meta` tags

* remove logic from `span_enrichment.go`

* move logic to `lifecycle.go`

* Double Agent replicate counts (#17664)

* Double Agent replicate counts

We intend to automatically expand the number of replicates if a possible wobble
is detected. In order to approach the goal of doing this automatically we first
need to validate that (1) consistently running this number of replicates is
feasible and (2) this doubling gives us clearer results. There is work
required on our side to adjust the statistics, so in some sense this PR is a
change that will lead to a change that will lead to a change.

REF SMP-599
REF SMP-333

Signed-off-by: Brian L. Troutwine <[email protected]>

* empty commit to trigger CI

Signed-off-by: Brian L. Troutwine <[email protected]>

---------

Signed-off-by: Brian L. Troutwine <[email protected]>

* Remove dependency on github.com/iovisor/gobpf for single function (#17649)

* Remove dependency on github.com/iovisor/gobpf for single function

* Fix copyright check

* Add test

* More fixes from CWS module name change (#17650)

* Fix cyclical import

* Correctly handle empty error messages

* CWS: sync BTFhub constants (#17679)

Co-authored-by: paulcacheux <[email protected]>

* AP-2062 Change version of builders image and change kitchen cleanup task (#17529)

* Change version of builders image and change kitchen cleanup task

* remove manual trigger

* Update builder image

* [USM] Monitor & HTTP refactor (#17283)

* http: telemetry: remove unused error value

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add protocol registration

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add http protocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: handle excluded functions

Signed-off-by: Guillaume Pagnoux <[email protected]>

* ebpfProgram: remove now unused mapCleaner field

Signed-off-by: Guillaume Pagnoux <[email protected]>

* pkg/network: fix network state tests

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: remove ProtocolKind type

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: fix call to NewTelemetry

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: remove old go build constraints

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: remove nil checks in pointer receivers

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: rename loop variables

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: document types

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor_test: use getHttpStats

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: map protocols in monitor & remove use of init

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: stop process monitor first

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: fix log message

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: document initProtocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: document ConfigureOptions

Signed-off-by: Guilla…
nenadnoveljic pushed a commit that referenced this pull request Jul 3, 2023
nenadnoveljic added a commit that referenced this pull request Jul 3, 2023
* The skeleton of logging telemetry events from the cluster agent (#17397)

* The skeleton of logging telemetry events from the cluster agent

* Fix lint and unit test failures

* Address the first set of review comments

* Use ResetClient instead of plain HTTP client per comment

* Factor out getRemoteConfigPatchEvent into a separate function per review comment

* Refactor the way we get ClusterId per comment

* Send telemetry events when a cluster agent mutates a remote config (#17663)

* fix windows nanoserver crash on glog v1.1.x (#17340)

* fix windows nanoserver crash on glog v1.1.x

* remove unused go mod replaces

* [CWS] fix activity tree for busybox utils (#17415)

* [CWS] fix activity tree for busybox utils

* update comment

* Fix reporting of conflicting telemetry metrics (#17417)

Only use the limiter (and thus, send telemetry) from the core
agent. Instances of the demultiplexer in other agents do not receive
dogstatsd metrics.

* Update last stable version to 7.44.1 (#17438)

Signed-off-by: Nicolas Guerguadj <[email protected]>

* update packages to fix vulnerabilities in dependencies (#17418)

* do not use reflection for shallow copy (#17421)

This commit implements ShallowCopy for the pb.Span and pb.TraceChunk types.
The previous reflection-based implementation caused too much overhead in the
main processing loop, resulting in unacceptable performance loss.

This also adds tests to ensure that the ShallowCopy functions are correct.
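
A minimal sketch of the idea described above, assuming a trimmed-down, hypothetical `Span` type in place of the generated `pb.Span` (the real type has more fields, and the actual implementation may differ). Each field is copied by plain assignment, so no reflection is involved; reference-typed fields such as maps stay shared with the original.

```go
package main

import "fmt"

// Span is a hypothetical stand-in for pb.Span, reduced to a few fields
// for illustration only.
type Span struct {
	Service  string
	Name     string
	TraceID  uint64
	Duration int64
	Meta     map[string]string
}

// ShallowCopy returns a new Span with every field copied by assignment.
// Maps are not cloned: the copy and the original share the same Meta.
// Returning nil for a nil receiver is an assumption of this sketch.
func (s *Span) ShallowCopy() *Span {
	if s == nil {
		return nil
	}
	return &Span{
		Service:  s.Service,
		Name:     s.Name,
		TraceID:  s.TraceID,
		Duration: s.Duration,
		Meta:     s.Meta,
	}
}

func main() {
	orig := &Span{Service: "web", Name: "GET /", TraceID: 1, Meta: map[string]string{"env": "prod"}}
	cp := orig.ShallowCopy()
	cp.Name = "renamed"        // scalar field: the original is unaffected
	cp.Meta["env"] = "staging" // shared map: the change is visible through orig
	fmt.Println(orig.Name, orig.Meta["env"])
}
```

The explicit field list has to be kept in sync with the struct definition, which is presumably why the commit also adds tests for the copy functions.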

* fix auto multi-line integration config (#17447)

* fix auto multi-line integration config

* reno

* update tests

* Update release.json and Go modules for 6/7.46.0-rc.2 (#17452)

* [CWS] reset events_stats to a PERCPU_ARRAY instead of a HASHMAP (#17473)

* Bump ncurses to 6.4 to fix CVE-2023-29491 (#17493)

* Kacper murzyn/7.45.0 changelog backport (#17489)

* 7.45.0 changelog (#17394)

* Release date updated

* Update latest stable agent version to 7.45.0 (#17491)

* fix subscriptionId fetching on azure (#17495)

* [SBOM] Remove `DeleteBlobs` from the sbom cache (#17465)

* remove delete missing blobs

* remove test

* fix strconv

* change from code review

* fix typo

* [CWS] fix duration suffix parsing (#17476)

* convert remaining users of old `golang-lru` to new generics based version (#17467)

* convert dogstatsd mapper cache to lru/v2

* convert network process cache to lru/v2

* convert network conntracker to lru/v2

* convert trivy cache to lru/v2

* convert network gateway lookup to lru/v2

* cleanup dependencies

* fix licenses

* fix conntracker tests

* fix conntrack debug

* [CWS] pre-alloc msg tags (#17434)

* silence error log about `DD_API_KEY` in internal profiler (#17371)

* Bump golang.org/x/sys from 0.3.0 to 0.8.0 in /pkg/gohai (#17106)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.3.0 to 0.8.0.
- [Commits](https://github.com/golang/sys/compare/v0.3.0...v0.8.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Gohai] Add common elements of the future new API (#17221)

* chore(gohai): use go version 1.18 to have generics

* feat(gohai): implement Value type

* feat(gohai): implement AsJSON and Initialize

* fix(gohai): fix lint warnings

* docs(gohai): add copyright in new files

* feat(gohai): add NewValueFrom method in Value

* feat(gohai): display suffix field tag in AsJSON

* Fix typo in pkg/gohai/utils/common.go

Co-authored-by: Nicolas Guerguadj <[email protected]>

* fix(gohai): address review comments

* feat(gohai): simplify common, remove Initialize

* docs(gohai): address comments review feedback

* feat(gohai): simplify AsJSON logic

* feat(gohai): return warnings as list of strings in AsJSON

* fix(gohai): fix common tests

* docs(gohai): fix comments/naming related review feedback

* test(gohai): simplify tests following pr review

---------

Co-authored-by: Nicolas Guerguadj <[email protected]>

* CWS: sync BTFhub constants (#17498)

Co-authored-by: paulcacheux <[email protected]>

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl (#17479)

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl

Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/golang/tools/releases)
- [Commits](https://github.com/golang/tools/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: golang.org/x/tools
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 in /pkg/security/secl (#17478)

* Bump github.com/stretchr/testify in /pkg/security/secl

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.3 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.3...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.30.0 to 2.31.0 in /test/e2e/cws-tests (#17428)

Bumps [requests](https://github.com/psf/requests) from 2.30.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.30.0...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 6.1.2 to 6.1.3 in /test/e2e/cws-tests (#17427)

Bumps [docker](https://github.com/docker/docker-py) from 6.1.2 to 6.1.3.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/6.1.2...6.1.3)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump datadog-api-client from 2.12.0 to 2.13.1 in /test/e2e/cws-tests (#17429)

Bumps [datadog-api-client](https://github.com/DataDog/datadog-api-client-python) from 2.12.0 to 2.13.1.
- [Release notes](https://github.com/DataDog/datadog-api-client-python/releases)
- [Changelog](https://github.com/DataDog/datadog-api-client-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DataDog/datadog-api-client-python/compare/2.12.0...2.13.1)

---
updated-dependencies:
- dependency-name: datadog-api-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] only increment unregisters metric if delete actually occurs (#17402)

* only increment unregisters if delete actually occurs

* measure time from start of delete function, only increment if no err
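
The two steps above can be sketched as follows. This is only an illustration under assumptions: `registry`, `unregister`, and `errNotRegistered` are hypothetical names, not the system-probe code. The timer starts at the top of the function, and the counter moves only when an entry was actually deleted.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotRegistered is a hypothetical error for a missing entry.
var errNotRegistered = errors.New("path not registered")

// registry is a hypothetical stand-in for the component that tracks
// registered paths and exposes an "unregisters" metric.
type registry struct {
	entries            map[string]struct{}
	unregisters        int
	lastDeleteDuration time.Duration
}

// unregister measures time from the start of the function and increments
// the metric only if the delete happened without error.
func (r *registry) unregister(path string) error {
	start := time.Now()
	defer func() { r.lastDeleteDuration = time.Since(start) }()

	if _, ok := r.entries[path]; !ok {
		return errNotRegistered // nothing deleted, metric untouched
	}
	delete(r.entries, path)
	r.unregisters++
	return nil
}

func main() {
	r := &registry{entries: map[string]struct{}{"/usr/lib/libssl.so": {}}}
	fmt.Println(r.unregister("/usr/lib/libssl.so"), r.unregisters)
	fmt.Println(r.unregister("/usr/lib/libssl.so"), r.unregisters)
}
```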

* Bump github.com/prometheus/procfs from 0.10.0 to 0.10.1 (#17347)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix duplicate prebuilt module in use during tests (#17472)

Because this is a global object, and the tests load prebuilt modules a bunch, you can end up with a long list of the same string. Example output:

```
&{{[] 0xc0013ada10} map[] map[closed_conn_dropped:0 conn_dropped:0 conns_bpf_map_size:18 conns_closed:1 kprobes_missed:0 kprobes_triggered:2] map[conntrack:{true 10 2032871} oomKill:{false 0 0} runtimeSecurity:{false 0 0} tcpQueueLength:{false 0 0} tracer:{true 10 3072361} usm:{true 10 2701840}] 2 map[tracer:1 usm:1] [offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns usm tracer dns usm tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns 
dns dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns] map[] map[] map[] map[]}
```
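
One way to avoid this kind of growth is to guard the global list with a set, as in the sketch below. The `moduleRegistry` type and its methods are hypothetical and only illustrate the deduplication; the actual fix in the commit may take a different approach.

```go
package main

import (
	"fmt"
	"sync"
)

// moduleRegistry is a hypothetical registry of module names that records
// each name at most once, preserving first-seen order.
type moduleRegistry struct {
	mu    sync.Mutex
	seen  map[string]struct{}
	names []string
}

// Add records name unless it has already been recorded.
func (r *moduleRegistry) Add(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.seen == nil {
		r.seen = make(map[string]struct{})
	}
	if _, ok := r.seen[name]; ok {
		return
	}
	r.seen[name] = struct{}{}
	r.names = append(r.names, name)
}

// Names returns a copy of the recorded names.
func (r *moduleRegistry) Names() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]string(nil), r.names...)
}

func main() {
	var reg moduleRegistry
	for _, n := range []string{"offset-guess", "dns", "tracer", "offset-guess", "dns", "usm"} {
		reg.Add(n)
	}
	fmt.Println(reg.Names())
}
```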

* Add way to log trace_pipe from tests (#17339)

* Bump github.com/vektra/mockery/v2 from 2.26.1 to 2.28.1 in /internal/tools (#17424)

* Bump github.com/vektra/mockery/v2 in /internal/tools

Bumps [github.com/vektra/mockery/v2](https://github.com/vektra/mockery) from 2.26.1 to 2.28.1.
- [Release notes](https://github.com/vektra/mockery/releases)
- [Changelog](https://github.com/vektra/mockery/blob/master/docs/changelog.md)
- [Commits](https://github.com/vektra/mockery/compare/v2.26.1...v2.28.1)

---
updated-dependencies:
- dependency-name: github.com/vektra/mockery/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* `inv -e security-agent.gen-mocks`

* `inv -e process-agent.gen-mocks`

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Paul Cacheux <[email protected]>

* [CWS][SEC-3735] Check self tests results in e2e tests (#17387)

* Check self_test results in e2e tests

* Check self_test results in e2e tests

* Fix self_check tests

* fix python lint things

* fix python lint thing

* Changes after review

* fix python lint thing

* [CSPM] Resolve process env variables only if required (#17461)

* [system-probe] Handle/reduce stat cookie collisions (#17197)

* [system-probe] Add internal_profiling.delta_profiles option to system-probe (#17475)

* [CSPM] Fix flakiness of TestProcessInput/Sleeps (#17399)

* system-probe: Remove redundant call for IsAdjusted (#17345)

* npm: Remove connection entry from tcpStats map if the connection is TCP (#17353)

* deprecate usm configuration values (#17216)

* usm: Deprecated network_config.http_replace_rules in favor of service_monitoring_config.http_replace_rules

* usm: Deprecated network_config.max_tracked_http_connections in favor of service_monitoring_config.max_tracked_http_connections

* usm: Deprecated network_config.max_http_stats_buffered in favor of service_monitoring_config.max_http_stats_buffered

* usm: Fixed configuration test

* Added releasenotes

* Fixed CR

* Fixed kitchen tests

* Fixing CI

* Update releasenotes/notes/deprecating-usm-configuration-values-6c43a0181c2cc821.yaml

Co-authored-by: Ursula Chen <[email protected]>

* Remove test patches

* Fixed cr

---------

Co-authored-by: Ursula Chen <[email protected]>

* Cloud Service implementation for Azure App Service (#17483)

This PR is extending serverless Cloud Service support to web apps running in Azure App Service containers.

* [CWS] avoid exec bomb (#17435)

* [CWS] fix process schema (#17422)

* Bump github.com/open-policy-agent/opa from 0.53.0 to 0.53.1 (#17505)

Bumps [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) from 0.53.0 to 0.53.1.
- [Release notes](https://github.com/open-policy-agent/opa/releases)
- [Changelog](https://github.com/open-policy-agent/opa/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-policy-agent/opa/compare/v0.53.0...v0.53.1)

---
updated-dependencies:
- dependency-name: github.com/open-policy-agent/opa
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CSPM] Do not allow http.send and opa.runtime rego builtins (#17409)

* Bump github.com/hashicorp/golang-lru/v2 from 2.0.2 to 2.0.3 (#17503)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] Fix race in Stop() for tcp close consumer (#17511)

* [e2e] target agent-sandbox account by default with e2e tests (#17484)

* typo (#17430)

* process-monitor: Change owner (#17510)

* npm: Spare copying of active connection twice (#17351)

* process-monitor: Change loading order. (#17401)

The refactor forced every user of process-monitor to call initialize, and we ensured that initialize runs only once.
During the initialize phase we scanned all running processes and tried to trigger the callbacks. But since every user called
initialize on its own, there was a race between registering callbacks and scanning the process list.
Now we call initialize only once, at monitor initialization, which ensures no race exists because callback registration
happens before initialization is called.
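
The ordering described above can be sketched like this. The `monitor` type, `subscribeExec`, and `initialize` are hypothetical names for illustration, and the list of running PIDs is passed in rather than read from the system; the real process-monitor differs.

```go
package main

import (
	"fmt"
	"sync"
)

// monitor is a hypothetical, simplified process monitor.
type monitor struct {
	once          sync.Once
	mu            sync.Mutex
	execCallbacks []func(pid int)
}

// subscribeExec registers a callback. Callers are expected to do this
// before initialize runs.
func (m *monitor) subscribeExec(cb func(pid int)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.execCallbacks = append(m.execCallbacks, cb)
}

// initialize scans the already-running processes exactly once and fires
// every callback registered so far. Later calls are no-ops.
func (m *monitor) initialize(running []int) {
	m.once.Do(func() {
		m.mu.Lock()
		cbs := append([]func(int){}, m.execCallbacks...)
		m.mu.Unlock()
		for _, pid := range running {
			for _, cb := range cbs {
				cb(pid)
			}
		}
	})
}

func main() {
	m := &monitor{}
	seen := 0
	m.subscribeExec(func(pid int) { seen++ })

	// A single owner calls initialize after all subscriptions are in place.
	m.initialize([]int{1, 42, 4242})
	m.initialize([]int{1, 42, 4242}) // ignored by sync.Once
	fmt.Println("callbacks fired:", seen)
}
```

With a single owner calling `initialize` after registration, no subscriber can miss the initial scan.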

* [e2e] bump test-infra-definition to v0.0.0-20230607143804-fef23444c9da (#17517)

* npm: Remove redundant err return (#17520)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths (#17354)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths

* Wrapped more logs
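
A sketch of what wrapping a trace log can look like, assuming a hypothetical minimal `logger` type (this is not the Agent's logging API). In Go, arguments to a variadic log call are evaluated and boxed before the callee can check the level, so the guard has to sit at the call site.

```go
package main

import "fmt"

type level int

const (
	traceLevel level = iota
	infoLevel
)

// logger is a hypothetical minimal logger used only for this sketch.
type logger struct {
	min   level
	lines []string
}

func (l *logger) shouldLog(lv level) bool { return lv >= l.min }

func (l *logger) tracef(format string, args ...interface{}) {
	if !l.shouldLog(traceLevel) {
		return
	}
	l.lines = append(l.lines, fmt.Sprintf(format, args...))
}

type conn struct{ src, dst string }

// describeCalls counts how often the costly helper below runs.
var describeCalls int

func (c conn) describe() string {
	describeCalls++
	return c.src + " -> " + c.dst
}

// handle stands in for a hot code path. The level check keeps
// c.describe() and the variadic argument slice from being built
// when tracing is disabled.
func handle(l *logger, c conn) {
	if l.shouldLog(traceLevel) {
		l.tracef("handling connection %s", c.describe())
	}
}

func main() {
	l := &logger{min: infoLevel}
	for i := 0; i < 1000; i++ {
		handle(l, conn{"10.0.0.1:1234", "10.0.0.2:80"})
	}
	fmt.Println("describe calls with tracing disabled:", describeCalls)
}
```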

* npm: Changed dns resolution to get a set of IPs rather than a list. (#17358)

* npm: Changed dns resolution to get a set of IPs rather than a list.

* Reduce allocated space, for the average case

* Fix potential use of uninitialized memory (#17490)

This fixes potential use of uninitialized memory when PyList_GetItem
returns NULL.

This code path is impossible to hit in practice with the current
versions of Python, as long as the object is a list and index is in
bounds, which is ensured by the prior call to PyList_Size. These
functions do not use the Python sequence protocol, so evil python code
can not supply incorrect length or throw an unexpected exception
either.

* Bump github.com/stretchr/testify from 1.8.2 to 1.8.4 in /pkg/gohai (#17363)

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.2 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.2...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow snapshot to fail (#17386)

* add JSON decoder for activity dumps (#17444)

* add activity tree stats in activity dump list command (#17369)

* fix secprofile unstable guards (#17509)

* use the remote storage from a command line (#17525)

* Adding shared pool monitoring for Oracle databases (#17360)

* finished

* release notes

* relnotes correction

* review

* Adding more sysmetrics to Oracle monitoring  (#17466)

* both metrics

* send only 60s interval for sysmetrics

* latency & refactoring

* bg cpu usage

* indexes

* io

* more metrics

* new metrics

* metrics

* completed

* completed

* release notes

* disable cursor cache hit ratio

* Revert "[CWS][SEC-3735] Check self tests results in e2e tests (#17387)" (#17526)

This reverts commit e83efae53b0554a45557fa1b5f3e7e2f378502f4.

* MetricSecurityProfileAnomalyDetectionGenerated tracks the number of generated anomalies (#17462)

* [CWS] fix race when playing snapshot process data (#17527)

* AP-2099 Prevent jobs that trigger child pipelines to download artefacts (#17117)

* Fix broken loop (#17534)

* Report conntrack ebpf module loading telemetry (#17539)

* [fakeintake] add godoc (#17474)

* [fakeintake] add godoc

* [e2e] fix test example

* [fakeintake] add helpers to client to get payload names

* [fakeintake] move s in api.Payload inside doc link

* [e2e] bump test-infra to 20230607221957

* [e2e] add logs example

* [e2e] fix test-infra version

* [e2e] remove unused config file

* usm: process monitor: Call heavy operation only if needed (#17457)

* usm: process monitor: Call heavy operation only if needed

From now on, we scan already-running processes if and only if there are registered exec callbacks.
Furthermore, we maintain 2 atomic booleans that indicate whether any exec or exit callbacks are registered; if none
are, we skip acquiring the mutex.

* Added documentation

* Removed field
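
The fast path from the commit message above can be sketched as follows, showing only the exec side (an exit flag would mirror it). The `monitor` type and method names are hypothetical, and `atomic.Bool` assumes Go 1.19 or later; the actual code may use a different atomic wrapper.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// monitor is a hypothetical, simplified process monitor.
type monitor struct {
	hasExecCallbacks atomic.Bool // true once at least one exec callback exists

	mu            sync.RWMutex
	execCallbacks []func(pid uint32)
}

// subscribeExec registers a callback and flips the flag.
func (m *monitor) subscribeExec(cb func(pid uint32)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.execCallbacks = append(m.execCallbacks, cb)
	m.hasExecCallbacks.Store(true)
}

// handleExec reports whether any callback was invoked. When no callback
// is registered it returns after a single atomic load, without touching
// the mutex.
func (m *monitor) handleExec(pid uint32) bool {
	if !m.hasExecCallbacks.Load() {
		return false
	}
	m.mu.RLock()
	defer m.mu.RUnlock()
	for _, cb := range m.execCallbacks {
		cb(pid)
	}
	return true
}

func main() {
	m := &monitor{}
	fmt.Println("dispatched before subscribe:", m.handleExec(100))
	m.subscribeExec(func(pid uint32) { fmt.Println("exec event for pid", pid) })
	fmt.Println("dispatched after subscribe:", m.handleExec(200))
}
```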

* Update java integration tests to use latest layers. (#17194)

* Add workaround for database connection loss (#17486)

* implemented

* release notes

* Update releasenotes/notes/connection-loss-workaround-c457738d985fda2a.yaml

Co-authored-by: Austin Lai <[email protected]>

* Update pkg/collector/corechecks/oracle-dbm/oracle.go

Co-authored-by: Alexandre Normand <[email protected]>

* removed comments

* corrected syntax errors after merging

---------

Co-authored-by: Austin Lai <[email protected]>
Co-authored-by: Alexandre Normand <[email protected]>

* [CWS] remove load controller (#17220)

* [CWS] rework secprofile warmup tests (#17377)

* (rcm) simplify the RC thin client (#17468)

* (rcm) simplify the RC thin client

* simplify listeners as well

* fix apm and security agent

* fix cws profiles

* fix apm client

* CWS: sync BTFhub constants (#17550)

Co-authored-by: paulcacheux <[email protected]>

* https java tests use local https server (#17067)

https java tests use local https server

* [CWS] revert snapshot event playing  (#17553)

* [CWS] do not play snapshot for now

* remove test

* deprecate more usm values (#17342)

* Fixed bug in configuration

* usm: Deprecated system_probe_config.http_map_cleaner_interval_in_s in favor of service_monitoring_config.http_map_cleaner_interval_in_s

* usm: Deprecated system_probe_config.http_idle_connection_ttl_in_s in favor of service_monitoring_config.http_idle_connection_ttl_in_s

* usm: Deprecated network_config.http_notification_threshold in favor of service_monitoring_config.http_notification_threshold

* usm: Deprecated network_config.http_max_request_fragment in favor of service_monitoring_config.http_max_request_fragment

* usm: Added releasenotes

* Fixed file name linter

* Addressed CR comments

* usm: Use apply default

* Fixed test

* added missing import

* Fixed imports

* Adds DD_RESOURCE_GROUP and DD_SUBSCRIPTION_ID to env vars (#17558)

* rtloader: Use execinfo only on glibc (#15256)

Use execinfo only on glibc.
Functions in execinfo.h are GNU extensions and not available on other C libraries like musl.

We used to use the libexecinfo package (a quick-n-dirty BSD licensed clone of the GNU libc backtrace facility) from Alpine Linux to build datadog-agent on Alpine, but it has been removed since Alpine 3.17.
This PR allows building datadog-agent on Alpine Linux and other non-glibc environments.

* Remove a no longer used SBOM check config parameter (#17405)

* Adjust default value for Oracle check interval (#17551)

* adapted the default value

* reverted

* changed default in the factory

* remove init in config

* Add new invoke task to test buildimage update (#17241)

* Add new invoke task to test buildimage update

* Use new utils method in invoke task and more tests

* Bump emoji from 2.2.0 to 2.4.0 in /test/e2e/cws-tests (#17425)

Bumps [emoji](https://github.com/carpedm20/emoji) from 2.2.0 to 2.4.0.
- [Release notes](https://github.com/carpedm20/emoji/releases)
- [Changelog](https://github.com/carpedm20/emoji/blob/master/CHANGES.md)
- [Commits](https://github.com/carpedm20/emoji/compare/v2.2.0...v2.4.0)

---
updated-dependencies:
- dependency-name: emoji
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/itchyny/gojq from 0.12.12 to 0.12.13 (#17442)

Bumps [github.com/itchyny/gojq](https://github.com/itchyny/gojq) from 0.12.12 to 0.12.13.
- [Release notes](https://github.com/itchyny/gojq/releases)
- [Changelog](https://github.com/itchyny/gojq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/itchyny/gojq/compare/v0.12.12...v0.12.13)

---
updated-dependencies:
- dependency-name: github.com/itchyny/gojq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CWS] remove unused arg from `fill_exec_context` (#17579)

* chore(gohai): update gopsutil/v3 to 3.23.2 (#17500)

* mount docker socket to dev container (#17385)

* add semver to requirements.txt (#17384)

* [DCA][Autodiscovery] Add more context to error log (#17464)

* USMO-259 - Support Java Async frameworks (#16346)

* - added support for java Async frameworks ioctl messages

* changed java tls structs and maps to support async instrumentation's messages

* refactored names and cleaned debug logs

* files location and names refactoring

* more code documentation

* - refactored java tls to work with tail calls

* - initializing connection_by_peer_key on stack, after the split to tail calls, we don't reach the stack limit anymore

* fixed compilation and verifier errors on 4.14

* fixed compilation error

* added java tls tail-calls to undefined probs list

* - removed unused map

- fixed the check if java tls is working before adding the tail calls

* - fixed the check for enabling java tls program

* added java tls tail calls to exclude list for shared_libraries_test.go

* fixed error in a previous commit conflict merge

* [PROC-2913] Create protobuf definitions for process workload stream server (#17497)

* Create proto definitions

* Update workloadmeta.proto

* Build proto files

* Add `eventId` field

* Apply moises' suggestions

* [RCM] Fix rc config deletion (#17581)

* Fix rc config deletion

* Cleanup

* Add test

* bump `ebpf-manager` to latest (#17585)

* bump `ebpf-manager` to latest

* `inv -e generate-licenses`

* [Gohai][ASC-471] implement cpu collection using sysctl syscall (#17556)

* feat(gohai): implement cpu collection using sysctl syscall

* Update release note

* Update releasenotes/notes/gohai-darwin-cpu-native-a931acf4d9d543ae.yaml

Co-authored-by: Heston Hoffman <[email protected]>

---------

Co-authored-by: Heston Hoffman <[email protected]>

* Add tests to CI (#17541)

* [USM] don't flood logs when a process is not java (#17590)

[Debug] java pid 26055 attachment rejected

* Fix the formatting for debug log in SetAgentMetadata (#17382)

* [process-agent] Create WorkloadMetaExtractor v1 (#17448)

* Create wlm extractor

* Initial workloadmeta changes

* Add extractor tests

* Fix import cycle

* Add some tracing for QA

* Fixed an edge case where the map key != proc.pid

* Add tracing for QA

* Add release note

* Added caching and produce events

* Apply guy's suggestion and check in `grpc.go`

* Update pkg/languagedetection/languagemodels/types.go

Co-authored-by: Guy Arbitman <[email protected]>

* Update pkg/process/metadata/workloadmeta/grpc.go

Co-authored-by: Guy Arbitman <[email protected]>

* Fix linter errors

* Update create-wlm-extractor-e408e2826cc77be8.yaml

Removed comments

* Update pkg/process/metadata/workloadmeta/workloadmeta.go

Co-authored-by: Moisés Botarro <[email protected]>

* Update pkg/process/metadata/workloadmeta/extractor_test.go

Co-authored-by: Moisés Botarro <[email protected]>

* Address comments

* add debug log on instantiation

* apply suggestions

* Add benchmark for sprintf vs itoa

* Fix flaky test

* Add trace log and fix comment

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Moisés Botarro <[email protected]>

* [usm] Add ability to report payload telemetry (#17544)

* [usm] Add ability to report payload telemetry

* Require USM payload telemetry to be explicitly declared

* Rename `OptTelemetry` to `OptPayloadTelemetry`

* Add unit test

* Update the `test-infra-definitions` dependency in `test/new-e2e` (#17566)

* Revert "[usm] Improve `incompleteBuffer` (#17164)" (#17593)

This reverts commit a0481de26a4a68c1e6fb228294774e8916f37943.

* DD_SERVICE_MAPPING in extension (#17189)

* DD_SERVICE_MAPPING in extension

* lint

* release note

* edit release note

* make DD_SERVICE_MAPPING src code split up into smaller parts for easier testing, fix tests, leverage config pkg

* gofmt

* add serverless prefix

* Update releasenotes/notes/serverless-DD-SERVICE-MAPPING-594cc2cb7d090473.yaml

Co-authored-by: Ursula Chen <[email protected]>

* trigger ci

* cover same key and value, add more bad input tests

* add new test cases

* format

---------

Co-authored-by: Ursula Chen <[email protected]>

* Improves python check docs to use virtualenv and sort out PYTHONPATH when needed (#17569)

* Adds docs to use virtualenv and sort out PYTHONPATH when needed

* Adds feedback from PR comments

* Adds note about needing -p arg for virtualenv

* Bump github.com/hashicorp/golang-lru/v2 in /pkg/security/secl (#17599)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CWS: sync BTFhub constants (#17608)

Co-authored-by: paulcacheux <[email protected]>

* Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 (#17501)

Co-authored-by: Florent Clarret <[email protected]>

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl (#17600)

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.8.0 to 0.9.0.
- [Commits](https://github.com/golang/sys/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix username generation on windows (#17547)

* [USM] tests RunDockerServer/RunHostServer log pid (#17587)

* [pkg/netflow] Collect `flow_process_nf_errors_count` metric from goflow2 (#17460)

* Collect flow_process_nf_errors_count goflow2 metric

* Add release note

* [CWS] remove unused mount group id field (#17222)

* [CWS] remove unused mount group id field

* update docs

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs (#17487)

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs

* add missing increment

* [AP-2139] Add amazonlinux2023 to the kitchen tests (#17548)

* Add amazonlinux2023 to the kitchen tests

* use 2023 as default to test as branch build launches only default

* [workloadmeta][process] Bootstrap process entities in workloadmeta (#17327)

* [CSPM] Make sure we do not create zombie processes in our tests (#17609)

* [CSPM] Make sure we do not create zombie processes in our tests

* add more details in case of test failure

* feat: support provisioned concurrency and proactive initialization (#17014)

* feat: support provisioned concurrency and proactive initialization

* feat: Set init time at beginning of agent init, unless that would somehow overlap the lambda start span

* feat: Use real time object until we need to convert

* feat: Use platform.initStart to begin the cold start span

* feat: specs

* fix: Update logs_test

* Fix: Set init time on EC

* feat: No string interpolation if we don't need it

* feat: Fix ecs proactiveInit boolean so it turns off after first invocation

* feat: Fix logs collector tags so we don't over-tag here

* feat: fmt

* feat: Add logs test for Proactive Initialization

* Bump snowflake-connector-python to 3.0.4 (#17445)

* [CWS] remove use of internal pointer (#16731)

* Include AAS metadata in span tags (#17591)

Currently, customers need to manually set the environment variable DD_AZURE_APP_SERVICES=true in order to make the traces include AAS metadata in tags. This change will eliminate that step and include these tags by detecting if we're on AAS. The PR also removes the old logic which adds these tags based on the environment variable DD_AZURE_APP_SERVICES.

* [RCM] Add rc client in flare (#17094)

* Add rc client in flare

* Add rc listeners

* Change constructor

* Fix AGENT_TASK read

* Cleanup

* Fix lint

* Address reviews

* Add release note

* Fix CI

* Fix CI

* Add mutex

* Address review

* Address review

* [secrets][tests] properly reset secrets backend timeout after test (#17614)

* [CWS] fix prerm scripts error logs (#17383)

* Handle missing result json file (#17537)

* [CWS] move arithmetic secl test to the secl package (#17610)

* [CWS] decouple a bit AD/Profile from probe (#17131)

* [CWS] cleanup runner before running btfhub sync job (#17629)

* cleanup runner before running btfhub sync job

* bump setup-go and remove cache step (included in v4)

* CWS: sync BTFhub constants (#17633)

Co-authored-by: paulcacheux <[email protected]>

* [corechecks/snmp] Refactor Profile Config (#17618)

* [CWS] rework secprofile tryAutolearn (#17535)

* Rework secprofile autoLearn func (including 2 fixes), and add 44 unit tests around it

* Fix go lint

* Apply review suggestion

* [Fix] Agent version cache not correctly loaded in multiple CI jobs (#17606)

* http2: remove packed enum values (#17586)

Signed-off-by: Guillaume Pagnoux <[email protected]>

* Make `nettop` available (#17458)

* [CWS] support kernel with usernamespaces arguments for security functions (#17634)

* remove unused function

* remove unused function

* PoC support new userns arg

* constantify the argument position selection

* fix same name issue

* horrible hack to pass the verifier

* [CWS] add unknown source for process entry (#17636)

* [CWS] update fallback constants for recent kernels (#17639)

* fix bpf map id constant

* fix bpf map name offset

* fix bpf prog aux name offset

* move `kitchen_test_dummy_job_tmp` to k8s runners (#17641)

* [gitlab] Migration of unit tests CI jobs to k8s Gitlab runners (#17179)

Requires https://github.com/DataDog/datadog-agent-buildimages/pull/370 first.

This PR:
- updates the Linux build images used in the `datadog-agent` Gitlab CI pipelines to images that do not have an entrypoint script (required because our k8s Gitlab runner infrastructure overwrites the entrypoint of images, therefore we can't rely on it being run)
- updates all relevant CI scripts to run `source /root/.bashrc` at the very beginning, since this is not run in the entrypoint anymore
- updates all jobs in the `setup`, `deps_fetch`, `source_test`, `binary_build` stages to run on k8s runners instead of classic runners
- updates container-related unit tests to work when run in a k8s environment (thanks @L3n41c, cc @DataDog/container-integrations)
- skips a few gohai and gohai-related metadata unit tests that are failing on the arm64 rpm runner because `df` doesn't work in this specific setup, for reasons that remain to be investigated (cc @DataDog/agent-shared-components)
- adds a way to specify concurrency for `golangci-lint` invocations (see https://github.com/DataDog/datadog-agent/pull/15722 and https://github.com/DataDog/datadog-agent/pull/15762)
- fixes the `package_dependencies` jobs in the `kernel_matrix_testing` stage, which weren't using the correct `BUILDIMAGES_SUFFIX` variable

Co-authored-by: Lénaïc Huard <[email protected]>

* [gitlab] Migrate docker publish jobs to k8s runners (#17270)

Migrates docker publishing jobs to the new Kubernetes-based runners.

The docker build jobs were migrated in #15511, but the publishing jobs are still using old runners.

* Add mutex to runtime settings (#17640)

* Process BTF archive nightly (#17621)

* Minor fixes to system-probe (#17622)

* Use correct module name in restart command

* Proper PingTCP/PingUDP cleanup

* [CWS Agent] RC rules override local rules if IDs conflict (#17573)

* reverse order of policy loading

* adding PolicyProviderType consts

* move enforcement of policy provider loading into a testable func
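
The precedence described in this change (a remote-config rule wins when a local rule has the same ID) can be sketched roughly as follows. This is only an illustration: the `rule` type, its fields, and `mergeRules` are hypothetical names and do not come from the Agent's policy-provider code.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for a security rule.
type rule struct {
	ID         string
	Expression string
	Source     string // "rc" for remote config, "local" for on-disk policies
}

// mergeRules walks the providers in priority order (remote first, then local)
// and keeps the first rule seen for each ID, so remote rules override local
// rules whose IDs conflict.
func mergeRules(remote, local []rule) []rule {
	seen := make(map[string]bool)
	var merged []rule
	for _, group := range [][]rule{remote, local} {
		for _, r := range group {
			if seen[r.ID] {
				continue // a higher-priority provider already supplied this ID
			}
			seen[r.ID] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	remote := []rule{{ID: "r1", Expression: "remote expression", Source: "rc"}}
	local := []rule{
		{ID: "r1", Expression: "local expression", Source: "local"},
		{ID: "r2", Expression: "other expression", Source: "local"},
	}
	for _, r := range mergeRules(remote, local) {
		fmt.Printf("%s from %s\n", r.ID, r.Source)
	}
}
```

Keeping the merge in a small pure function like this is one way to make the ordering testable in isolation, which is in the spirit of the last bullet above.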

* pkg/flare: add missing APM variables to envvars (#17597)

This PR adds all APM environment variables currently used by
the agent to the flare. Previously, some variables were missing, so
their values were not represented when producing a flare.

* [Serverless] Use prebuilt opentelemetry lambda layers in integration tests. (#17568)

* Enable auto-instrumentation for python integration test.

* Update snapshot for otlp-python.

* Linting.

* Sort values of tag _dd.tags.container.

* Add encoding info to tailer info for the agent status verbose page (#17533)

* add encoding info to tailer info for the agent status verbose page

* push encoding information straight into tailer info

* moving adding tailer info to parser instead

* NIT

* [usm] Intern Kafka topic names (#17648)

* Add benchmark

* Intern topic name strings

* Fix data synchronization
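
As a rough illustration of what interning topic names means, the sketch below keeps one canonical string per distinct topic so repeated names share storage. The `interner` type and its methods are hypothetical and are not taken from the USM code; the mutex is included only on the assumption that lookups may come from several goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// interner is a hypothetical string pool: it returns the same string value
// for every byte slice with identical content.
type interner struct {
	mu   sync.Mutex
	seen map[string]string
}

func newInterner() *interner {
	return &interner{seen: make(map[string]string)}
}

// get returns the canonical string for b, storing a copy the first time a
// given topic name is seen.
func (i *interner) get(b []byte) string {
	i.mu.Lock()
	defer i.mu.Unlock()
	// A map lookup keyed by string(b) does not need to allocate a new string.
	if s, ok := i.seen[string(b)]; ok {
		return s
	}
	s := string(b)
	i.seen[s] = s
	return s
}

func main() {
	pool := newInterner()
	first := pool.get([]byte("orders"))
	second := pool.get([]byte("orders"))
	fmt.Println(first == second, len(pool.seen))
}
```

An unbounded map is the simplest possible pool; a real implementation would likely need some way to evict names that are no longer in use.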

* config/apm: fix parsing DD_APM_FEATURES (#17630)

Support either "," or " " as the separator when parsing the value of DD_APM_FEATURES. This fixes a regression introduced in #15904, which changed the separator from comma to space and was a breaking change. From 7.44 to 7.46, using a space as the separator was suggested as a workaround, so this PR accepts both separators to avoid breaking compatibility again.
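
A minimal sketch of accepting both separators, assuming the value is simply split into feature names. The helper name `splitFeatures` and the sample feature names are hypothetical; the Agent's actual parsing code may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFeatures splits a DD_APM_FEATURES-style value on commas or spaces and
// drops empty entries, so "a,b", "a b" and "a, b" all yield the same result.
func splitFeatures(raw string) []string {
	return strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == ' '
	})
}

func main() {
	for _, raw := range []string{"feat_a,feat_b", "feat_a feat_b", "feat_a, feat_b"} {
		fmt.Printf("%q -> %q\n", raw, splitFeatures(raw))
	}
}
```

`strings.FieldsFunc` treats runs of separators as one, which is why a value such as `"feat_a, feat_b"` does not produce an empty entry.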

* [CWS] do not handle broken lineage during snapshot (#17624)

* Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)

This reverts commit 970566077529b17147e7c74129344c7b35f766b3.

* [CWS] Improve tryAutolearn unit tests by making fake events to have a valid lineage (#17657)

* [CWS] fix overlayfs inode read on kernel 5.19 and higher (#17644)

* dbg output

* xfs hacky solution to go around the 300MB limit

* PoC test fix overlayfs

* pipe constant param to select the lower inode selection

* implement kernel version check for feature detection

* small fix

* implement function probing based detection

* apply suggested review changes

* Revert "Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)" (#17658)

This reverts commit 1ad88863157f73078068eab05bda4ea00ddadb58.

* Report config mutation events from the agent

* Create initial config for DDQA (#15675)

* Create initial config for DDQA

* Adding USM

* Adding NDM

* Update config.toml

* add ebpf-platform

* [tools] Adding ASC team to DDQA initial config

* Fixing USM jira project

* Add Agent Platform

* Add platform integrations team

* Add APM

* Add Remote Config

* Add Container-[Integrations-Ecosystems]

* Add Security And Compliance Agent

* Add Processes

* Change RCM issue type to QA

* Add Windows Agent

* [container-app] add qa team metadata

* Network Performance Monitoring

* Update .ddqa/config.toml - Database Monitoring

* Add Windows Kernel Integrations

* add final team Agent Integrations

* Use QA Task for Windows Agent

* [tools] Update task type for agent-shared-components QA issue generation

We now have a new task type for ASC QA operations that should be used

* final update

https://github.com/DataDog/ddqa/pull/13

* rename top-level option

* remove `changelog/no-changelog` as an ignored label

* [ddqa] exclude members for team ASC

* ddqa: AML exclude_members.

* Updating excluded devs for CI team

* Add CSPM Agent

* Exclude olivielpeau from Agent Platform QA

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Florian Veaux <[email protected]>
Co-authored-by: Alexander Nicholas Costas <[email protected]>
Co-authored-by: Bryce Kahle <[email protected]>
Co-authored-by: Srdjan Grubor <[email protected]>
Co-authored-by: Kylian Serrania <[email protected]>
Co-authored-by: Sarah Witt <[email protected]>
Co-authored-by: Katie Hockman <[email protected]>
Co-authored-by: Baptiste Foy <[email protected]>
Co-authored-by: Cedric Lamoriniere <[email protected]>
Co-authored-by: Paul Cacheux <[email protected]>
Co-authored-by: Moises Botarro <[email protected]>
Co-authored-by: Julien Lebot <[email protected]>
Co-authored-by: fisherevans <[email protected]>
Co-authored-by: Lee Avital <[email protected]>
Co-authored-by: Joel Marcotte <[email protected]>
Co-authored-by: Rich Lancia <[email protected]>
Co-authored-by: Pierre Gimalac <[email protected]>
Co-authored-by: Remy Mathieu <[email protected]>
Co-authored-by: Kacper <[email protected]>
Co-authored-by: David du Colombier <[email protected]>
Co-authored-by: Alexandre Menasria <[email protected]>

* dump silent workloads (#17412)

* [CWS] constantify `vm_flags` access in `vm_area_struct` (#17662)

* constantify `vm_flags` access in `vm_area_struct`

* re-gen constants

* regression detector: change baseline variant from latest main to merge base (#17449)

* regression detector: add compute merge base job

We need to compute the merge base of a non-`main` branch with respect
to `main` in order to establish the commit SHA of a baseline variant
in regression detection. This commit adds a job to compute that merge
base, echo the result to a file (sans newline), and then upload that
result to the Single-Machine Performance S3 bucket for Agent team.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: note build step we may need

The existing Single-Machine Performance (SMP) regression detector
setup doesn't build additional containers -- it just publishes
containers already built in Agent CI to SMP's ECR for Agent
images. I've written the new "merge-base" container assuming we can
continue with that strategy, but if we can't continue with that
strategy for some reason, then this commit includes a bunch of
comments sketching out a backup plan that would build the container we
would need and publish it to SMP's ECR for Agent images.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: sketch functional test tweaks

As in the spirit of the previous commit, this commit adds some
comments on how to transition the existing setup in
`functional_test/regression_detector.yml` from using the latest commit
on `main` to using a "merge base" baseline SHA.

There are a few parts of this commit that are actually functional, but
shouldn't functionally alter the existing regression detector output
-- it still uses a "latest `main`" baseline SHA -- but it does
introduce the idea of getting the new baseline SHA from an artifact in
a previous stage, rather than uploading it to S3 (although I also
upload the merge base baseline SHA to S3 as well, as a contingency
plan).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: use literal for image

I thought using a `!reference` tag would work for specifying an image
for the merge-base-computing job, but that mental model may be wrong,
so this commit replaces that tag with the literal it's supposed to
(de)reference.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: comment out merge base job

The merge base job isn't running because no runners are configured to
run it. This issue could be one that I can't resolve, so it could be
that I need to get permissions to run it. For now, I'll comment out
this job and instead try and get the information necessary to compute
a baseline as part of the regression detector job, but without
otherwise altering the regression detector behavior.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: compute merge base

As a test of whether I can compute the merge base using a `git fetch`
command, this commit adds that command (and other supporting commands)
to the regression detector job to see if these commands will work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: add container build notes

I've managed to figure out how to get the most important case to work
-- the one in which the base branch of the pull request (i.e., in
GitLab terms, the target branch of the merge request) is
`main`. Having managed to get this case to work, this commit updates
my implementation notes to reflect how to implement the case when the
base branch of the pull request *is not* `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: set baseline to merge base

It looks like creating a separate job (at least in
`.gitlab/container_build/docker_linux.yml`) will not work because
GitLab won't run it in a runner. This behavior may be a configuration
setting for security reasons, and may require repo admin privileges to
change. Instead, this commit implements setting the baseline SHA to
the merge base, along with some commands to abort the regression
detector job if the base branch of the pull request (target branch of
the merge request) is not `main`, and copies the merge base SHA to S3
for debugging/auditing purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove obsolete comments

Now that I'm pretty sure I've figured out how to change the baseline
SHA to the merge base of a pull request, this commit deletes most of
the commented-out lines I introduced into
`.gitlab/functional_test/regression_detector.yml` as notes to
myself. These comments are no longer necessary for record-keeping.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: correct line comment

This commit corrects a line comment in which I write "Use this line if
comparing `main` against itself makes sense" when I should have
written "Use this line if comparing `main` against itself does not
make sense".

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: make if block a one-liner

GitLab apparently doesn't echo multiline commands by default, so this
commit rewrites this multiline `if` block as a one line command for
debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove backticks

I'm pretty sure that the "command `main` not found" message I saw was
because the shell was likely interpreting those backticks as running a
command. This commit replaces the backticks with single quotes to
avoid having the shell attempt to run a command called `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: disable base branch check

This commit temporarily disables the check for the base branch because
it doesn't quite seem to work at the moment, and it adds enough
debugging output so I can get some idea of why that check does not
work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: add artifacts + cleanup

This commit cleans up the last bits of debugging comments and code in
`.gitlab/functional_test/regression_detector.yml`, plus it adds the
reporting outputs as artifacts to provide additional diagnostic output
for debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: remove comment cruft

This commit removes a large comment block I put in
`.gitlab/container_build/docker_linux.yml` because I no longer think
it's a good idea to compute the merge base commit in a job separate
from the regression detector job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: compute baseline via loop

The previous regression detector baseline SHA computation assumed that
an image exists in ECR for the commit returned by `$(git
merge-base HEAD main)`, *i.e.*, the merge base with `main`. However,
the container associated with that commit may not exist because that
container may have failed to build, or may have failed to upload to
ECR, so we need to check whether that container exists in ECR. If that
container exists, then the merge base with `main` is the baseline
SHA. If not, then we must iterate over predecessor commits in `main`
until we find a commit in `main` for which a container exists in
ECR. The first commit we find in this loop becomes the baseline SHA
for the regression detector.

This commit implements that check, along with a loop that iterates
over predecessor commits, if necessary, as described above. The
implementation is currently a rather verbose one-liner because
single-line statements are easier to debug in GitLab CI. Once this
implementation succeeds, a subsequent commit will clean up the
implementation by changing it to a multi-line statement, after which
this branch will be ready for review.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>
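
The selection logic described in this commit can be sketched as below. This is not the CI one-liner itself: `pickBaseline`, the candidate list, and the `imageExists` callback are hypothetical. In the real job the candidates would come from git (merge base first, then its predecessors on `main`) and the existence check would query ECR.

```go
package main

import (
	"errors"
	"fmt"
)

// pickBaseline returns the first candidate commit for which an image exists.
// candidates must be ordered: merge base first, then older commits on main.
func pickBaseline(candidates []string, imageExists func(sha string) bool) (string, error) {
	for _, sha := range candidates {
		if imageExists(sha) {
			return sha, nil
		}
	}
	return "", errors.New("no candidate commit has a published image")
}

func main() {
	// Placeholder SHAs; "mergebase" stands for the merge base with main.
	candidates := []string{"mergebase", "parent1", "parent2"}
	// Pretend the merge-base image failed to build or upload.
	published := map[string]bool{"parent1": true, "parent2": true}

	sha, err := pickBaseline(candidates, func(s string) bool { return published[s] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("baseline SHA:", sha)
}
```

Passing the existence check in as a function keeps the loop independent of how images are looked up, so it can be exercised without network access.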

* regression detector: reduce stderr noise

The initial implementation of the container existence check turns out
to be pretty noisy when a container doesn't exist because `awscli`
will output a bunch of error information on failure. While this
information is helpful for debugging purposes, it will be annoying in
CI, so this commit redirects the `stderr` of that command to
`/dev/null` to reduce noise in the CI output of the regression
detector CI job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: fix copy before write error

The regression detector job in the Agent CI pipeline currently fails
because it attempts to copy a file that doesn't exist to S3. This
commit fixes that error by moving the copy command to a point after
the file is generated.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: tweak ecr query for debugging

For some reason, the `aws ecr describe-images` command is not finding
images that I know exist. To aid in debugging at the expense of noise,
this commit removes redirection of `stderr` to `/dev/null`, adds the
`--registry-id` flag to be more explicit about ECR repo location, and
also moves the `--profile` flag and its argument to a more readable
location.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove "latest main" job

The baseline-SHA-computing steps in the
`single-machine-performance-regression_detector` job make computing
the commit of "latest `main`" unnecessary, so this commit deletes the
job that does that computation.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: remove loop cleanup comment

Due to falling behind on Agent performance investigations, I don't
think I'm going to get to clean up the long, one line loop statement,
so this commit deletes that aspirational comment -- it can be cleaned
up in a subsequent pull request.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: pin pr-commenter version

This commit attempts to triage the `pr-commenter` errors by pinning
the `pr-commenter` version in a fashion similar to that used by
other Datadog repositories (e.g., `dogweb`).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: delete more stale comments

For now, I don't plan to move the baseline SHA computation into a
separate job, although that may happen later. Given this change of
plans, this commit removes the stale comment regarding refactoring the
baseline SHA computation to a separate job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

---------

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* stop dumping workloads with a stable event type (#17536)

* [CWS] run functional tests on al2023 (#17612)

* run CWS functional tests on al2023

* skip the SELinux/sel_disable test on al2023

* fix sizeof_inode and tty_offset fallback constants on al2023 kernel 6.1

* fix import

* skip docker rc tests on al2023

* fix `vm_area_struct_flags_offset` on al2023

* Run system-probe tests using kernel matrix testing scenario (#16406)

* system-probe-test_spec.go added

* compile system-probe-test_spec and add to dependencies

* fix base image

* move test spec into test dir

* comment out wip

* use x64 images for packaging dependencies

* print outputs json

* save output ips to file

* print stack.outputs

* fix typo

* flush buffer to file

* move stack.outputs to CI_PROJECT_DIR

* connect to microvm

* fix gitlab yaml error

* use amis available in build-stable

* fix invoke command

* define AWS_REGION env var

* use kernel matrix testing image

* fix build tags

* use ssh private key to ssh

* explanatory comment

* fix path export

* save ssh key file to CI_PROJECT_DIR

* get ssh key info

* define AWS_SSH_KEY variable

* turn off StrictHostKeyChecking

* fix typo

* set ssh key file perms

* verbose ssh output

* change ssh file name and use the same name

* fix aws ssh key name

* change ssh key name

* try with root user

* change ssh user name back to 'ubuntu'

* set BatchMode to prevent passphrase query

* set owner to user 1000

* some debugging

* fix ssh key perms

* add new line to ssh key file

* turn off host key verification when connecting with micro-vm

* use dedicated pulumi image

* run init script

* fix micro-vm-init.sh script path when scp'ing

* wrap inner ssh command in quotes

* pass arch to microvm init

* give full path of dependencies

* copy all shared dependencies and strip-components when extracting

* use go version 1.19 and run tests

* cross compile

* fix dir path of system-probe-tests

* retrieve go deps

* tidy-all

* use new tenant

* print .aws/config file

* use string replacer

* use exact s3 bucket as new-e2e

* use k8 runners for env job

* remove security groups and subnets, and change ssh key param name

* set correct env name

* add copyright

* fix micro-vm-init.sh and move it in the dependencies

* fix yaml syntax

* make GOVERSION string

* add agent-qa amis

* set GOARCH as env var

* change ssh key name

* try without key pair name

* use pulumi image

* reintroduce key pair config value

* change docker image tag

* try agent-ci-sandbox as ssh key name

* remove printing of aws_config

* add default key pair name: datadog-agent-ci

* update amis

* update amis

* update vmconfig. Run only am64

* fix arm64 package name

* update arm64 ami id

* fix typo in yaml

* add go_deps dependency

* disable arm64 distros for now

* add dummy provider for copy file

* fix command provider init

* provider args fix package

* unique name for command provider

* add go_deps as dependency to test job

* set go_tools_dep, and switch to new profile in tests

* use pulumi docker image

* add debug logs of ssh keys

* retry if failed to dial libvirt

* add comment and log

* tidy all

* make it possible to easily run scenario in dev machine

* fix dependency fetching step

* easier to launch in dev machine and do not use sudo

* fix infra-env in env setup job

* lint python

* fix ssh key name

* incorrect image tag introduced

* reintroduce custom arm64 kernels

* exec perms on micro-vm-init.sh

* apply suggestions

* avoid printing out ssm read to debug output

* use script to setup dependencies before tests

* fix config

* fix function name

* remove shutdown command as it is moved in scenario, and cleanup path names

* use fallback of env vars

* simplify output streaming

* shutdown period must be int

* create and download junit and testjson files

* set up testjson and junit package in init script

* exit with system-probe test exit code

* fix shutdown option

* small fixes

* pass current environment to cmd

* fix junit and testjson download

* add env for DD_SYSTEM_PROBE_JAVA_DIR in new test spec

* fix delete path of junit and testjson package

* fix NewRunner after test-infra-definitions update

* pass arch to fetch_dependencies

* return error code of tests

* add '$' to resolve INSTANCE_IP

* Simplify micro-vm-init.sh IP address command

* Minor improvements

* Add timeouts

* Handle errors in outputsToFile

* Update AMIs

* Update to handle bundle-less test run

* Fix minor nits

* Update x86_64 AMI

* Update arm64 AMI

* add kvt to on_system_probe_changes_or_manual

* allow_failure for kvt jobs

* run all tests despite errors

* change stack name

* Fix json glob pattern

* update test-infra-definitions

* remove custom kernels and use distribution images

* remove custom kernels from vmconfig

* run tests only on x86_64

* update test-infra-definitions

* add arm64 tests back in

* fix variable replacement bash

* update ami

* Collect failed tests and output them at the end

* Output kernel release

* fix BTF_DIR path

* always color output if supported

* Use GO_VERSION from runner docker image

* force color

* only output FAIL if test has a name

* Do not return error if we only have failed tests

* do test json review at end to properly fail job

* Fix binary path

* specify binary name and main.go path

* Add missing end single quote

* Force color in review too

* add cleanup job

* provide secret sudo-password

* make kmt job manual

* revert back to running job on system-probe changes

* run dummy job on system-probe changes

* update test-infra-definitions

* fix grep

* fix grep for pattern '-instance-ip'

* set empty path when no private key path provided, to prevent pulumi from attempting to read ssh key file by guessing path

* change instance types to c6i.metal for intel and c6g.metal for arm

* split arm64 and x86 kmt tests

* fix parallel test runs

* fix dependencies in cleanup job

* fail if retrying tests

* run python linter

* use storage optimized instances for both intel and arm

* use compute optimized instance for arm

* allow all kernel_matrix_testing jobs to fail

* make kmt jobs manual

* change arm instance type to m6g and x86 to m5

---------

Co-authored-by: Bryce Kahle <[email protected]>

* Expose agent telemetry on system-probe UDS (#17652)

* [new-e2e] use standard-verbose format when verbose is True (#17660)

* add missing filter_tag envvars to config (#17653)

This commit adds the filter_tag envvars (DD_APM_FILTER_TAGS_REQUIRE
and DD_APM_FILTER_TAGS_REJECT) to the config_template.yaml file.
Previously, these envvars were not represented with the filter_tag 
parameter, so adding these in can improve documentation.

* [serverless] add `peer.service` to inferred spans (#17414)

* add `peerService` constant

* add `peerService` in `Span.Meta` tags

* remove logic from `span_enrichment.go`

* move logic to `lifecycle.go`

* Double Agent replicate counts (#17664)

* Double Agent replicate counts

We intend to automatically expand the number of replicates if a possible wobble
is detected. Before we can do this automatically, we first need to validate
that consistently running this number of replicates is (1) feasible and
(2) gives clearer results. There is work required on our side to adjust the
statistics, so in some sense this PR is a change that will lead to a change
that will lead to a change.

REF SMP-599
REF SMP-333

Signed-off-by: Brian L. Troutwine <[email protected]>

* empty commit to trigger CI

Signed-off-by: Brian L. Troutwine <[email protected]>

---------

Signed-off-by: Brian L. Troutwine <[email protected]>

* Remove dependency on github.com/iovisor/gobpf for single function (#17649)

* Remove dependency on github.com/iovisor/gobpf for single function

* Fix copyright check

* Add test

* More fixes from CWS module name change (#17650)

* Fix cyclical import

* Correctly handle empty error messages

* CWS: sync BTFhub constants (#17679)

Co-authored-by: paulcacheux <[email protected]>

* AP-2062 Change version of builders image and change kitchen cleanup task (#17529)

* Change version of builders image and change kitchen cleanup task

* remove manual trigger

* Update builder image

* [USM] Monitor & HTTP refactor (#17283)

* http: telemetry: remove unused error value

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add protocol registration

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: add http protocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: handle excluded functions

Signed-off-by: Guillaume Pagnoux <[email protected]>

* ebpfProgram: remove now unused mapCleaner field

Signed-off-by: Guillaume Pagnoux <[email protected]>

* pkg/network: fix network state tests

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: remove ProtocolKind type

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: fix call to NewTelemetry

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: remove old go build constraints

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: remove nil checks in pointer receivers

Signed-off-by: Guillaume Pagnoux <[email protected]>

* fix: rename loop variables

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: document types

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor_test: use getHttpStats

Signed-off-by: Guillaume Pagnoux <[email protected]>

* protocols: map protocols in monitor & remove use of init

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: stop process monitor first

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: fix log message

Signed-off-by: Guillaume Pagnoux <[email protected]>

* monitor: document initProtocol

Signed-off-by: Guillaume Pagnoux <[email protected]>

* http: document ConfigureOptions

Signed-off-by: Guilla…
zARODz11z pushed a commit that referenced this pull request Jul 14, 2023
…ndbox

add unit test assertions and edit eventsample

release note

lint

use Errorf

Split extract trace context into two helper functions, attach sampling priority, fix event sample, and fix tests

fmt

fix struct

usm: Replace kprobes with tracepoints (#17698)

* usm: Replace kprobes with tracepoints

Using tracepoints is encouraged because their ABI is stable and changes less over the years. Furthermore, unlike kprobes, tracepoints
do not suffer from misses.

* Fixed cr

* Fixed CI failures

* Fixed CI

Bump github.com/twmb/franz-go/pkg/kadm from 1.8.0 to 1.8.1 (#16934)

Bumps [github.com/twmb/franz-go/pkg/kadm](https://github.com/twmb/franz-go) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/twmb/franz-go/releases)
- [Changelog](https://github.com/twmb/franz-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/twmb/franz-go/compare/v1.8.0...pkg/kadm/v1.8.1)

---
updated-dependencies:
- dependency-name: github.com/twmb/franz-go/pkg/kadm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump github.com/uptrace/bun/driver/pgdriver from 1.1.12 to 1.1.14 (#17292)

Bumps [github.com/uptrace/bun/driver/pgdriver](https://github.com/uptrace/bun) from 1.1.12 to 1.1.14.
- [Release notes](https://github.com/uptrace/bun/releases)
- [Changelog](https://github.com/uptrace/bun/blob/master/CHANGELOG.md)
- [Commits](https://github.com/uptrace/bun/compare/v1.1.12...v1.1.14)

---
updated-dependencies:
- dependency-name: github.com/uptrace/bun/driver/pgdriver
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump go.mongodb.org/mongo-driver from 1.11.4 to 1.11.7 (#17507)

Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.11.4 to 1.11.7.
- [Release notes](https://github.com/mongodb/mongo-go-driver/releases)
- [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.11.4...v1.11.7)

---
updated-dependencies:
- dependency-name: go.mongodb.org/mongo-driver
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump github.com/uptrace/bun from 1.1.12 to 1.1.14 (#17332)

Bumps [github.com/uptrace/bun](https://github.com/uptrace/bun) from 1.1.12 to 1.1.14.
- [Release notes](https://github.com/uptrace/bun/releases)
- [Changelog](https://github.com/uptrace/bun/blob/master/CHANGELOG.md)
- [Commits](https://github.com/uptrace/bun/compare/v1.1.12...v1.1.14)

---
updated-dependencies:
- dependency-name: github.com/uptrace/bun
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump github.com/uptrace/bun/dialect/pgdialect from 1.1.12 to 1.1.14 (#17294)

Bumps [github.com/uptrace/bun/dialect/pgdialect](https://github.com/uptrace/bun) from 1.1.12 to 1.1.14.
- [Release notes](https://github.com/uptrace/bun/releases)
- [Changelog](https://github.com/uptrace/bun/blob/master/CHANGELOG.md)
- [Commits](https://github.com/uptrace/bun/compare/v1.1.12...v1.1.14)

---
updated-dependencies:
- dependency-name: github.com/uptrace/bun/dialect/pgdialect
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump go.mongodb.org/mongo-driver from 1.11.4 to 1.11.6 (#16970)

* Bump go.mongodb.org/mongo-driver from 1.11.4 to 1.11.6

Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.11.4 to 1.11.6.
- [Release notes](https://github.com/mongodb/mongo-go-driver/releases)
- [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.11.4...v1.11.6)

---
updated-dependencies:
- dependency-name: go.mongodb.org/mongo-driver
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Added licenses

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Guy Arbitman <[email protected]>

Bump github.com/go-sql-driver/mysql from 1.7.0 to 1.7.1 (#16767)

* Bump github.com/go-sql-driver/mysql from 1.7.0 to 1.7.1

Bumps [github.com/go-sql-driver/mysql](https://github.com/go-sql-driver/mysql) from 1.7.0 to 1.7.1.
- [Release notes](https://github.com/go-sql-driver/mysql/releases)
- [Changelog](https://github.com/go-sql-driver/mysql/blob/master/CHANGELOG.md)
- [Commits](https://github.com/go-sql-driver/mysql/compare/v1.7.0...v1.7.1)

---
updated-dependencies:
- dependency-name: github.com/go-sql-driver/mysql
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Added licenses

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Guy Arbitman <[email protected]>

Bump github.com/opencontainers/runtime-spec (#17576)

Bumps [github.com/opencontainers/runtime-spec](https://github.com/opencontainers/runtime-spec) from 1.1.0-rc.2 to 1.1.0-rc.3.
- [Release notes](https://github.com/opencontainers/runtime-spec/releases)
- [Changelog](https://github.com/opencontainers/runtime-spec/blob/main/ChangeLog)
- [Commits](https://github.com/opencontainers/runtime-spec/compare/v1.1.0-rc.2...v1.1.0-rc.3)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runtime-spec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump the version of PIP3 to 23.0.1 in omnibus (#17785)

Retry functional tests (#17752)

* add retry mechanism and include, exclude of packages

* fix package filtering

* fix incorrect initialization of packagesLs

* tar junit and testjson of each attempt

* add new line before attempt header

* reword attempt header

* build list of directories containing json

* review all attempts

* makes headers magenta

* download multiple junit and testjson archives

* have cleanup jobs for both success and failure

* pass retry count to micro-vm-init.sh

* add explanatory comments and set CIVisibility to full path

* make include packages parameter explicit

* change function name

* resolve suggestions

* remove debug log

Bump github.com/hashicorp/golang-lru/v2 from 2.0.3 to 2.0.4 (#17816)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.3 to 2.0.4.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.3...v2.0.4)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump github.com/prometheus/procfs from 0.10.1 to 0.11.0 (#17815)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.1 to 0.11.0.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.1...v0.11.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Fix log component line location

Since the log component is, for now, just a proxy to the pkg/log the
line shown in the output was wrong. It used to always show the log
component location.

Bump github.com/imdario/mergo from 0.3.15 to 0.3.16 (#17357)

Bumps [github.com/imdario/mergo](https://github.com/imdario/mergo) from 0.3.15 to 0.3.16.
- [Release notes](https://github.com/imdario/mergo/releases)
- [Commits](https://github.com/imdario/mergo/compare/v0.3.15...v0.3.16)

---
updated-dependencies:
- dependency-name: github.com/imdario/mergo
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[RCM] Add error log rate limit (#17802)

* Add error log rate limit

* Cleanup

[e2e] cleanup task (#17709)

* [task] add clean up invoke task for pulumi stacks

* [task] fix linter

* [task] fix import sort

* [task] remove stack prefix filter when removing stacks

* [task] fix flake8 linting

* [task] add option to remove stacks in e2e cleanup

* [task] remove stack instead of destroying it

add more feature flags from CWS into inventories (#17788)

[CWS] remove useless `defer os.RemoveAll` (#17819)

* remove useless `defer os.RemoveAll`

* fix lint

Bump github.com/opencontainers/image-spec from 1.1.0-rc2.0.20221005185240-3a7f492d3f1b to 1.1.0-rc.3 (#16830)

* Bump github.com/opencontainers/image-spec

Bumps [github.com/opencontainers/image-spec](https://github.com/opencontainers/image-spec) from 1.1.0-rc2.0.20221005185240-3a7f492d3f1b to 1.1.0-rc.3.
- [Release notes](https://github.com/opencontainers/image-spec/releases)
- [Changelog](https://github.com/opencontainers/image-spec/blob/main/RELEASES.md)
- [Commits](https://github.com/opencontainers/image-spec/commits/v1.1.0-rc.3)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/image-spec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Regenerate LICENSE-3rdparty.csv

* Bump github.com/opencontainers/image-spec

Bumps [github.com/opencontainers/image-spec](https://github.com/opencontainers/image-spec) from 1.1.0-rc2 to 1.1.0-rc.3.
- [Release notes](https://github.com/opencontainers/image-spec/releases)
- [Changelog](https://github.com/opencontainers/image-spec/blob/main/RELEASES.md)
- [Commits](https://github.com/opencontainers/image-spec/commits/v1.1.0-rc.3)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/image-spec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Lénaïc Huard <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Lénaïc Huard <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lénaïc Huard <[email protected]>
Co-authored-by: Lénaïc Huard <[email protected]>

[CWS] generate config in tests after opts check (#17818)

[CWS] restore profiles metrics (#17821)

Add configuration variables to the local run config files (#17812)

* add to process agent config path a path which includes config as well

Fix race condition in the test (#17825)

Occasionally, the test tcp connection would be closed while the test
body is still executing, causing a race condition between assert on
listener.tailers and stopTailer.

Since the test assumes that the connection stays open until the end of
the test, make sure it stays open by adding an explicit close at the
end of the test body.

[USM] add missing map dump (#17805)

* [USM] add missing map dumps

Signed-off-by: Guillaume Pagnoux <[email protected]>

* do not dump static maps

Signed-off-by: Guillaume Pagnoux <[email protected]>

---------

Signed-off-by: Guillaume Pagnoux <[email protected]>

[CWS] Fix activity dump local storage files eviction (#17828)

stop ticker (#17584)

[CWS] bump of test dependencies (#17823)

* bump `github.com/google/pprof` to `v0.0.0-20230602150820-91b7bce49751`

* bump `github.com/iceber/iouring-go` to `v0.0.0-20230403020409-002cfd2e2a90`

* `inv -e generate-licenses`

[RCM-918] Add flare source (#17772)

* Add flare source

* Fix CI

* Fix test

* Address review

[omnibus] Set rpmlog verbosity to RPMLOG_ERR in OpenSCAP (#17829)

This change sets rpmlog verbosity to RPMLOG_ERR in OpenSCAP's RPM probe.

This should prevent the RPM library from printing the following
warnings, which appear when opening the RPM database with the
Berkeley DB compatibility mode:

warning: Found bdb_ro Packages database while attempting sqlite backend: using bdb_ro backend.
warning: could not open /var/lib/rpm/Filetriggername: No such file or directory
warning: could not open /var/lib/rpm/Transfiletriggername: No such file or directory
warning: could not open /var/lib/rpm/Recommendname: No such file or directory
warning: could not open /var/lib/rpm/Suggestname: No such file or directory
warning: could not open /var/lib/rpm/Supplementname: No such file or directory
warning: could not open /var/lib/rpm/Enhancename: No such file or directory

Fix check on file existence in security agent upstart script (#17834)

[Windows][Installer] Update installed file permissions (#17374)

Use ACE inheritance in C:\ProgramData\Datadog

Add/remove an explicit ace for ddagentuser instead of replacing the whole DACL

[USM] fix http_allow_packet() (#17800)

Merge my temporary feature branch into main (#17771)

* The skeleton of logging telemetry events from the cluster agent (#17397)

* The skeleton of logging telemetry events from the cluster agent

* Fix lint and unit test failures

* Address the first set of review comments

* Use ResetClient instead of plain HTTP client per comment

* Factor out getRemoteConfigPatchEvent into a separate function per review comment

* Refactor the way we get ClusterId per comment

* Send telemetry events when a cluster agent mutates a remote config (#17663)

* fix windows nanoserver crash on glog v1.1.x (#17340)

* fix windows nanoserver crash on glog v1.1.x

* remove unused go mod replaces

* [CWS] fix activity tree for busybox utils (#17415)

* [CWS] fix activity tree for busybox utils

* update comment

* Fix reporting of conflicting telemetry metrics (#17417)

Only use the limiter (and thus, send telemetry) from the core
agent. Instances of the demultiplexer in other agents do not receive
dogstatsd metrics.

* Update last stable version to 7.44.1 (#17438)

Signed-off-by: Nicolas Guerguadj <[email protected]>

* update packages to fix vulnerabilities in dependencies (#17418)

* do not use reflection for shallow copy (#17421)

This commit implements ShallowCopy for the pb.Span and pb.TraceChunk types.
The previous reflection-based implementation caused too much overhead in the
main processing loop, resulting in unacceptable performance loss.

This also adds tests to ensure that the ShallowCopy functions are correct.
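
As a rough illustration of the idea (sketched in Python rather than the Agent's Go, using a hypothetical `Span` class whose fields are assumptions and not the real `pb.Span` definition), a hand-written shallow copy names each field explicitly instead of discovering the fields at runtime:

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for a trace span; the field names are illustrative only."""
    service: str = ""
    name: str = ""
    duration: int = 0
    meta: dict = field(default_factory=dict)

    def shallow_copy(self) -> "Span":
        # Copy each field by name. Nested containers such as `meta` are
        # shared with the original, which is what makes the copy shallow.
        return Span(
            service=self.service,
            name=self.name,
            duration=self.duration,
            meta=self.meta,
        )


original = Span(service="web", name="http.request", duration=42, meta={"env": "prod"})
clone = original.shallow_copy()
clone.name = "renamed"  # Top-level fields are independent of the original.
```

The trade-off in a sketch like this is maintenance: a new field must be added to the copy function by hand, which is presumably why the commit pairs the change with tests.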

* fix auto multi-line integration config (#17447)

* fix auto multi-line integration config

* reno

* update tests

* Update release.json and Go modules for 6/7.46.0-rc.2 (#17452)

* [CWS] reset events_stats to a PERCPU_ARRAY instead of a HASHMAP (#17473)

* Bump ncurses to 6.4 to fix CVE-2023-29491 (#17493)

* Kacper murzyn/7.45.0 changelog backport (#17489)

* 7.45.0 changelog (#17394)

* Release date updated

* Update latest stable agent version to 7.45.0 (#17491)

* fix subscriptionId fetching on azure (#17495)

* [SBOM] Remove `DeleteBlobs` from the sbom cache (#17465)

* remove delete missing blobs

* remove test

* fix strconv

* change from code review

* fix typo

* [CWS] fix duration suffix parsing (#17476)

* convert remaining users of old `golang-lru` to new generics based version (#17467)

* convert dogstatsd mapper cache to lru/v2

* convert network process cache to lru/v2

* convert network conntracker to lru/v2

* convert trivy cache to lru/v2

* convert network gateway lookup to lru/v2

* cleanup dependencies

* fix licenses

* fix conntracker tests

* fix conntrack debug

* [CWS] pre-alloc msg tags (#17434)

* silence error log about `DD_API_KEY` in internal profiler (#17371)

* Bump golang.org/x/sys from 0.3.0 to 0.8.0 in /pkg/gohai (#17106)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.3.0 to 0.8.0.
- [Commits](https://github.com/golang/sys/compare/v0.3.0...v0.8.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Gohai] Add common elements of the future new API (#17221)

* chore(gohai): use go version 1.18 to have generics

* feat(gohai): implement Value type

* feat(gohai): implement AsJSON and Initialize

* fix(gohai): fix lint warnings

* docs(gohai): add copyright in new files

* feat(gohai): add NewValueFrom method in Value

* feat(gohai): display suffix field tag in AsJSON

* Fix typo in pkg/gohai/utils/common.go

Co-authored-by: Nicolas Guerguadj <[email protected]>

* fix(gohai): address review comments

* feat(gohai): simplify common, remove Initialize

* docs(gohai): address comments review feedback

* feat(gohai): simplify AsJSON logic

* feat(gohai): return warnings as list of strings in AsJSON

* fix(gohai): fix common tests

* docs(gohai): fix comments/naming related review feedback

* test(gohai): simplify tests following pr review

---------

Co-authored-by: Nicolas Guerguadj <[email protected]>

* CWS: sync BTFhub constants (#17498)

Co-authored-by: paulcacheux <[email protected]>

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl (#17479)

* Bump golang.org/x/tools from 0.9.1 to 0.9.3 in /pkg/security/secl

Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/golang/tools/releases)
- [Commits](https://github.com/golang/tools/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: golang.org/x/tools
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump github.com/stretchr/testify from 1.8.3 to 1.8.4 in /pkg/security/secl (#17478)

* Bump github.com/stretchr/testify in /pkg/security/secl

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.3 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.3...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.30.0 to 2.31.0 in /test/e2e/cws-tests (#17428)

Bumps [requests](https://github.com/psf/requests) from 2.30.0 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.30.0...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 6.1.2 to 6.1.3 in /test/e2e/cws-tests (#17427)

Bumps [docker](https://github.com/docker/docker-py) from 6.1.2 to 6.1.3.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/6.1.2...6.1.3)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump datadog-api-client from 2.12.0 to 2.13.1 in /test/e2e/cws-tests (#17429)

Bumps [datadog-api-client](https://github.com/DataDog/datadog-api-client-python) from 2.12.0 to 2.13.1.
- [Release notes](https://github.com/DataDog/datadog-api-client-python/releases)
- [Changelog](https://github.com/DataDog/datadog-api-client-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DataDog/datadog-api-client-python/compare/2.12.0...2.13.1)

---
updated-dependencies:
- dependency-name: datadog-api-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] only increment unregisters metric if delete actually occurs (#17402)

* only increment unregisters if delete actually occurs

* measure time from start of delete function, only increment if no err

* Bump github.com/prometheus/procfs from 0.10.0 to 0.10.1 (#17347)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix duplicate prebuilt module in use during tests (#17472)

Because this is a global object, and the tests load prebuilt modules a bunch, you can end up with a long list of the same string. Example output:

```
&{{[] 0xc0013ada10} map[] map[closed_conn_dropped:0 conn_dropped:0 conns_bpf_map_size:18 conns_closed:1 kprobes_missed:0 kprobes_triggered:2] map[conntrack:{true 10 2032871} oomKill:{false 0 0} runtimeSecurity:{false 0 0} tcpQueueLength:{false 0 0} tracer:{true 10 3072361} usm:{true 10 2701840}] 2 map[tracer:1 usm:1] [offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess offset-guess tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns usm tracer dns usm tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns 
dns dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns offset-guess dns tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns usm tracer offset-guess dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns dns] map[] map[] map[] map[]}
```

* Add way to log trace_pipe from tests (#17339)

* Bump github.com/vektra/mockery/v2 from 2.26.1 to 2.28.1 in /internal/tools (#17424)

* Bump github.com/vektra/mockery/v2 in /internal/tools

Bumps [github.com/vektra/mockery/v2](https://github.com/vektra/mockery) from 2.26.1 to 2.28.1.
- [Release notes](https://github.com/vektra/mockery/releases)
- [Changelog](https://github.com/vektra/mockery/blob/master/docs/changelog.md)
- [Commits](https://github.com/vektra/mockery/compare/v2.26.1...v2.28.1)

---
updated-dependencies:
- dependency-name: github.com/vektra/mockery/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* `inv -e security-agent.gen-mocks`

* `inv -e process-agent.gen-mocks`

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Paul Cacheux <[email protected]>

* [CWS][SEC-3735] Check self tests results in e2e tests (#17387)

* Check self_test results in e2e tests

* Check self_test results in e2e tests

* Fix self_check tests

* fix python lint things

* fix python lint thing

* Changes after review

* fix python lint thing

* [CSPM] Resolve process env variables only if required (#17461)

* [system-probe] Handle/reduce stat cookie collisions (#17197)

* [system-probe] Add internal_profiling.delta_profiles option to system-probe (#17475)

* [CSPM] Fix flakyness of TestProcessInput/Sleeps (#17399)

* system-probe: Remove redundant call for IsAdjusted (#17345)

* npm: Remove connection entry from tcpStats map if the connection is TCP (#17353)

* deprecate usm configuration values (#17216)

* usm: Deprecated network_config.http_replace_rules in favor of service_monitoring_config.http_replace_rules

* usm: Deprecated network_config.max_tracked_http_connections in favor of service_monitoring_config.max_tracked_http_connections

* usm: Deprecated network_config.max_http_stats_buffered in favor of service_monitoring_config.max_http_stats_buffered

* usm: Fixed configuration test

* Added releasenotes

* Fixed CR

* Fixed kitchen tests

* Fixing CI

* Update releasenotes/notes/deprecating-usm-configuration-values-6c43a0181c2cc821.yaml

Co-authored-by: Ursula Chen <[email protected]>

* Remove test patches

* Fixed cr

---------

Co-authored-by: Ursula Chen <[email protected]>

* Cloud Service implementation for Azure App Service (#17483)

This PR is extending serverless Cloud Service support to web apps running in Azure App Service containers.

* [CWS] avoid exec bomb (#17435)

* [CWS] fix process schema (#17422)

* Bump github.com/open-policy-agent/opa from 0.53.0 to 0.53.1 (#17505)

Bumps [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) from 0.53.0 to 0.53.1.
- [Release notes](https://github.com/open-policy-agent/opa/releases)
- [Changelog](https://github.com/open-policy-agent/opa/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-policy-agent/opa/compare/v0.53.0...v0.53.1)

---
updated-dependencies:
- dependency-name: github.com/open-policy-agent/opa
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CSPM] Do not allow http.send and opa.runtime rego builtins (#17409)

* Bump github.com/hashicorp/golang-lru/v2 from 2.0.2 to 2.0.3 (#17503)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [system-probe] Fix race in Stop() for tcp close consumer (#17511)

* [e2e] target agent-sandbox account by default with e2e tests (#17484)

* typo (#17430)

* process-monitor: Change owner (#17510)

* npm: Spare copying of active connection twice (#17351)

* process-monitor: Change loading order. (#17401)

The refactor forced every user of process-monitor to call initialize. We ensured that initialize is called only once.
During the initialize phase we scanned all running processes and tried to trigger the callbacks. But since every user called
initialize by itself, we had a race between registering callbacks and scanning the process list.
Now we call initialize only once, at the monitor initialization, which ensures no race exists, because callback registration
happens before the initialization is called.
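
The ordering described above can be sketched as follows. This is a Python illustration under stated assumptions, not the Agent's Go implementation: the `ProcessMonitor` class, its method names, and the injected `list_running_pids` function are all hypothetical.

```python
import threading


class ProcessMonitor:
    """Hypothetical monitor; names and structure are assumptions for this sketch."""

    def __init__(self, list_running_pids):
        self._list_running_pids = list_running_pids
        self._exec_callbacks = []
        self._lock = threading.Lock()
        self._initialized = False

    def subscribe_exec(self, callback):
        with self._lock:
            self._exec_callbacks.append(callback)

    def initialize(self):
        # Runs at most once; later calls are no-ops.
        with self._lock:
            if self._initialized:
                return
            self._initialized = True
            callbacks = list(self._exec_callbacks)
        # Scan already-running processes after all callbacks were registered.
        for pid in self._list_running_pids():
            for callback in callbacks:
                callback(pid)


seen_a, seen_b = [], []
monitor = ProcessMonitor(lambda: [101, 102])
monitor.subscribe_exec(seen_a.append)
monitor.subscribe_exec(seen_b.append)
monitor.initialize()
monitor.initialize()  # Second call does nothing.
```

Because both subscribers register before the single `initialize` call, each one sees every already-running process exactly once.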

* [e2e] bump test-infra-definition to v0.0.0-20230607143804-fef23444c9da (#17517)

* npm: Remove redundant err return (#17520)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths (#17354)

* system-probe: Avoid unnecessary allocations for trace logs in hot-code-paths

* Wrapped more logs

* npm: Changed dns resolution to get a set of IPs rather than a list. (#17358)

* npm: Changed dns resolution to get a set of IPs rather than a list.

* Reduce allocated space, for the average case

* Fix potential use of uninitialized memory (#17490)

This fixes potential use of uninitialized memory when PyList_GetItem
returns NULL.

This code path is impossible to hit in practice with the current
versions of Python, as long as the object is a list and index is in
bounds, which is ensured by the prior call to PyList_Size. These
functions do not use the Python sequence protocol, so evil Python code
cannot supply an incorrect length or throw an unexpected exception
either.

* Bump github.com/stretchr/testify from 1.8.2 to 1.8.4 in /pkg/gohai (#17363)

Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.2 to 1.8.4.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.2...v1.8.4)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow snapshot to fail (#17386)

* add JSON decoder for activity dumps (#17444)

* add activity tree stats in activity dump list command (#17369)

* fix secprofile unstable guards (#17509)

* use the remote storage from a command line (#17525)

* Adding shared pool monitoring for Oracle databases (#17360)

* finished

* release notes

* relnotes correction

* review

* Adding more sysmetrics to Oracle monitoring  (#17466)

* both metrics

* send only 60s interval for sysmetrics

* latency & refactoring

* bg cpu usage

* indexes

* io

* more metrics

* new metrics

* metrics

* completed

* completed

* release notes

* disable cursor cache hit ratio

* Revert "[CWS][SEC-3735] Check self tests results in e2e tests (#17387)" (#17526)

This reverts commit e83efae53b0554a45557fa1b5f3e7e2f378502f4.

* MetricSecurityProfileAnomalyDetectionGenerated tracks the number of generated anomalies (#17462)

* [CWS] fix race when playing snapshot process data (#17527)

* AP-2099 Prevent jobs that trigger child pipelines to download artefacts (#17117)

* Fix broken loop (#17534)

* Report conntrack ebpf module loading telemetry (#17539)

* [fakeintake] add godoc (#17474)

* [fakeintake] add godoc

* [e2e] fix test example

* [fakeintake] add helpers to client to get payload names

* [fakeintake] move s in api.Payload inside doc link

* [e2e] bump test-infra to 20230607221957

* [e2e] add logs example

* [e2e] fix test-infra version

* [e2e] remove unused config file

* usm: process monitor: Call heavy operation only if needed (#17457)

* usm: process monitor: Call heavy operation only if needed

From now on, we scan already-running processes if and only if there are registered exec callbacks.
Furthermore, we maintain two atomic booleans that indicate whether any exec or exit callbacks are registered; if there
are none, we skip acquiring the mutex.
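
A minimal sketch of the fast path described above, assuming a simplified registry with only exec callbacks. The type and method names (`callbackRegistry`, `subscribeExec`, `notifyExec`) are hypothetical and are not the Agent's actual identifiers; the sketch also assumes the standard library's `atomic.Bool` (Go 1.19+), whereas the real code may use a different atomic package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// callbackRegistry is a hypothetical stand-in for the process monitor's
// callback bookkeeping. Only exec callbacks are modelled here.
type callbackRegistry struct {
	// hasExecCallbacks is true while at least one exec callback is registered.
	hasExecCallbacks atomic.Bool

	mu            sync.RWMutex
	execCallbacks map[int]func(pid int)
	nextID        int
}

func newCallbackRegistry() *callbackRegistry {
	return &callbackRegistry{execCallbacks: make(map[int]func(pid int))}
}

// subscribeExec registers cb and returns a function that removes it again.
func (r *callbackRegistry) subscribeExec(cb func(pid int)) (unsubscribe func()) {
	r.mu.Lock()
	defer r.mu.Unlock()
	id := r.nextID
	r.nextID++
	r.execCallbacks[id] = cb
	r.hasExecCallbacks.Store(true)
	return func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		delete(r.execCallbacks, id)
		r.hasExecCallbacks.Store(len(r.execCallbacks) > 0)
	}
}

// notifyExec reports whether any callback ran. When nothing is registered it
// returns after a single atomic load, without touching the mutex.
// Callbacks run under the read lock, so they must not call unsubscribe.
func (r *callbackRegistry) notifyExec(pid int) bool {
	if !r.hasExecCallbacks.Load() {
		return false
	}
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, cb := range r.execCallbacks {
		cb(pid)
	}
	return len(r.execCallbacks) > 0
}

func main() {
	r := newCallbackRegistry()
	fmt.Println("notified with no callbacks:", r.notifyExec(1))

	unsubscribe := r.subscribeExec(func(pid int) { fmt.Println("exec seen for pid", pid) })
	fmt.Println("notified with one callback:", r.notifyExec(42))

	unsubscribe()
	fmt.Println("notified after unsubscribe:", r.notifyExec(7))
}
```

The same flag could also gate the initial scan of running processes, so that the expensive work only happens once someone has subscribed.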

* Added documentation

* Removed field

* Update java integration tests to use latest layers. (#17194)

* Add workaround for database connection loss (#17486)

* implemented

* release notes

* Update releasenotes/notes/connection-loss-workaround-c457738d985fda2a.yaml

Co-authored-by: Austin Lai <[email protected]>

* Update pkg/collector/corechecks/oracle-dbm/oracle.go

Co-authored-by: Alexandre Normand <[email protected]>

* removed comments

* corrected syntax errors after merging

---------

Co-authored-by: Austin Lai <[email protected]>
Co-authored-by: Alexandre Normand <[email protected]>

* [CWS] remove load controller (#17220)

* [CWS] rework secprofile warmup tests (#17377)

* (rcm) simplify the RC thin client (#17468)

* (rcm) simplify the RC thin client

* simplify listeners as well

* fix apm and security agent

* fix cws profiles

* fix apm client

* CWS: sync BTFhub constants (#17550)

Co-authored-by: paulcacheux <[email protected]>

* https java tests use local https server (#17067)

https java tests use local https server

* [CWS] revert snapshot event playing  (#17553)

* [CWS] do not play snapshot for now

* remove test

* deprecate more usm values (#17342)

* Fixed bug in configuration

* usm: Deprecated system_probe_config.http_map_cleaner_interval_in_s in favor of service_monitoring_config.http_map_cleaner_interval_in_s

* usm: Deprecated system_probe_config.http_idle_connection_ttl_in_s in favor of service_monitoring_config.http_idle_connection_ttl_in_s

* usm: Deprecated network_config.http_notification_threshold in favor of service_monitoring_config.http_notification_threshold

* usm: Deprecated network_config.http_max_request_fragment in favor of service_monitoring_config.http_max_request_fragment

* usm: Added releasenotes

* Fixed file name linter

* Addressed CR comments

* usm: Use apply default

* Fixed test

* added missing import

* Fixed imports

* Adds DD_RESOURCE_GROUP and DD_SUBSCRIPTION_ID to env vars (#17558)

* rtloader: Use execinfo only on glibc (#15256)

Use execinfo only on glibc.
Functions in execinfo.h are GNU extensions and not available on other C libraries like musl.

We used to use Alpine Linux's libexecinfo package (a quick-n-dirty BSD-licensed clone of the GNU libc backtrace facility) to build datadog-agent on Alpine, but it has been unavailable since Alpine 3.17.
This PR allows building datadog-agent on Alpine Linux and other non-glibc environments.

* Remove a no-longer-used SBOM check config parameter (#17405)

* Adjust default value for Oracle check interval (#17551)

* adapted the default value

* reverted

* changed default in the factory

* remove init in config

* Add new invoke task to test buildimage update (#17241)

* Add new invoke task to test buildimage update

* Use new utils method in invoke task and more tests

* Bump emoji from 2.2.0 to 2.4.0 in /test/e2e/cws-tests (#17425)

Bumps [emoji](https://github.com/carpedm20/emoji) from 2.2.0 to 2.4.0.
- [Release notes](https://github.com/carpedm20/emoji/releases)
- [Changelog](https://github.com/carpedm20/emoji/blob/master/CHANGES.md)
- [Commits](https://github.com/carpedm20/emoji/compare/v2.2.0...v2.4.0)

---
updated-dependencies:
- dependency-name: emoji
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/itchyny/gojq from 0.12.12 to 0.12.13 (#17442)

Bumps [github.com/itchyny/gojq](https://github.com/itchyny/gojq) from 0.12.12 to 0.12.13.
- [Release notes](https://github.com/itchyny/gojq/releases)
- [Changelog](https://github.com/itchyny/gojq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/itchyny/gojq/compare/v0.12.12...v0.12.13)

---
updated-dependencies:
- dependency-name: github.com/itchyny/gojq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [CWS] remove unused arg from `fill_exec_context` (#17579)

* chore(gohai): update gopsutil/v3 to 3.23.2 (#17500)

* mount docker socket to dev container (#17385)

* add semver to requirements.txt (#17384)

* [DCA][Autodiscovery] Add more context to error log (#17464)

* USMO-259 - Support Java Async frameworks (#16346)

* - added support for java Async frameworks ioctl messages

* changed java tls structs and maps to support async instrumentation's messages

* refactored names and cleaned debug logs

* files location and names refactoring

* more code documentation

* - refactored java tls to work with tail calls

* - initializing connection_by_peer_key on stack, after the split to tail calls, we don't reach the stack limit anymore

* fixed compilation and verifier errors on 4.14

* fixed compilation error

* added java tls tail-calls to undefined probes list

* - removed unused map

- fixed the check if java tls is working before adding the tail calls

* - fixed the check for enabling java tls program

* added java tls tail calls to exclude list for shared_libraries_test.go

* fixed error in a previous commit conflict merge

* [PROC-2913] Create protobuf definitions for process workload stream server (#17497)

* Create proto definitions

* Update workloadmeta.proto

* Build proto files

* Add `eventId` field

* Apply moises' suggestions

* [RCM] Fix rc config deletion (#17581)

* Fix rc config deletion

* Cleanup

* Add test

* bump `ebpf-manager` to latest (#17585)

* bump `ebpf-manager` to latest

* `inv -e generate-licenses`

* [Gohai][ASC-471] implement cpu collection using sysctl syscall (#17556)

* feat(gohai): implement cpu collection using sysctl syscall

* Update release note

* Update releasenotes/notes/gohai-darwin-cpu-native-a931acf4d9d543ae.yaml

Co-authored-by: Heston Hoffman <[email protected]>

---------

Co-authored-by: Heston Hoffman <[email protected]>

* Add tests to CI (#17541)

* [USM] don't flood logs when a process is not java (#17590)

[Debug] java pid 26055 attachment rejected

* Fix the formatting for debug log in SetAgentMetadata (#17382)

* [process-agent] Create WorkloadMetaExtractor v1 (#17448)

* Create wlm extractor

* Initial workloadmeta changes

* Add extractor tests

* Fix import cycle

* Add some tracing for QA

* Fixed an edge case where the map key != proc.pid

* Add tracing for QA

* Add release note

* Added caching and produce events

* Apply guy's suggestion and check in `grpc.go`

* Update pkg/languagedetection/languagemodels/types.go

Co-authored-by: Guy Arbitman <[email protected]>

* Update pkg/process/metadata/workloadmeta/grpc.go

Co-authored-by: Guy Arbitman <[email protected]>

* Fix linter errors

* Update create-wlm-extractor-e408e2826cc77be8.yaml

Removed comments

* Update pkg/process/metadata/workloadmeta/workloadmeta.go

Co-authored-by: Moisés Botarro <[email protected]>

* Update pkg/process/metadata/workloadmeta/extractor_test.go

Co-authored-by: Moisés Botarro <[email protected]>

* Address comments

* add debug log on instantiation

* apply suggestions

* Add benchmark for sprintf vs itoa

* Fix flaky test

* Add trace log and fix comment

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Moisés Botarro <[email protected]>

* [usm] Add ability to report payload telemetry (#17544)

* [usm] Add ability to report payload telemetry

* Require USM payload telemetry to be explicitly declared

* Rename `OptTelemetry` to `OptPayloadTelemetry`

* Add unit test

* Update the `test-infra-definitions` dependency in `test/new-e2e` (#17566)

* Revert "[usm] Improve `incompleteBuffer` (#17164)" (#17593)

This reverts commit a0481de26a4a68c1e6fb228294774e8916f37943.

* DD_SERVICE_MAPPING in extension (#17189)

* DD_SERVICE_MAPPING in extension

* lint

* release note

* edit release note

* make DD_SERVICE_MAPPING src code split up into smaller parts for easier testing, fix tests, leverage config pkg

* gofmt

* add serverless prefix

* Update releasenotes/notes/serverless-DD-SERVICE-MAPPING-594cc2cb7d090473.yaml

Co-authored-by: Ursula Chen <[email protected]>

* trigger ci

* cover same key and value, add more bad input tests

* add new test cases

* format

---------

Co-authored-by: Ursula Chen <[email protected]>

* Improves python check docs to use virtualenv and sort out PYTHONPATH when needed (#17569)

* Adds docs to use virtualenv and sort out PYTHONPATH when needed

* Adds feedback from PR comments

* Adds note about needing -p arg for virtualenv

* Bump github.com/hashicorp/golang-lru/v2 in /pkg/security/secl (#17599)

Bumps [github.com/hashicorp/golang-lru/v2](https://github.com/hashicorp/golang-lru) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/hashicorp/golang-lru/releases)
- [Commits](https://github.com/hashicorp/golang-lru/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/golang-lru/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CWS: sync BTFhub constants (#17608)

Co-authored-by: paulcacheux <[email protected]>

* Upgrade to OpenSSL 3 in Agent 7, upgrade Python 3 to 3.9.17 (#17501)

Co-authored-by: Florent Clarret <[email protected]>

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl (#17600)

* Bump golang.org/x/sys from 0.8.0 to 0.9.0 in /pkg/security/secl

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.8.0 to 0.9.0.
- [Commits](https://github.com/golang/sys/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-generate go.sum and LICENSE-3rdparty.csv changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Fix username generation on windows (#17547)

* [USM] tests RunDockerServer/RunHostServer log pid (#17587)

* [pkg/netflow] Collect `flow_process_nf_errors_count` metric from goflow2 (#17460)

* Collect flow_process_nf_errors_count goflow2 metric

* Add release note

* [CWS] remove unused mount group id field (#17222)

* [CWS] remove unused mount group id field

* update docs

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs (#17487)

* [CWS] cgroup resolver: use hashmap instead of LRU to track workload PIDs

* add missing increment

* [AP-2139] Add amazonlinux2023 to the kitchen tests (#17548)

* Add amazonlinux2023 to the kitchen tests

* use 2023 as default to test as branch build launches only default

* [workloadmeta][process] Bootstrap process entities in workloadmeta (#17327)

* [CSPM] Make sure we do not create zombie processes in our tests (#17609)

* [CSPM] Make sure we do not create zombie processes in our tests

* add more details in case of test failure

* feat: support provisioned concurrency and proactive initialization (#17014)

* feat: support provisioned concurrency and proactive initialization

* feat: Set init time at beginning of agent init, unless that would somehow overlap the lambda start span

* feat: Use real time object until we need to convert

* feat: Use platform.initStart to begin the cold start span

* feat: specs

* fix: Update logs_test

* Fix: Set init time on EC

* feat: No string interpolation if we don't need it

* feat: Fix ecs proactiveInit boolean so it turns off after first invocation

* feat: Fix logs collector tags so we don't over-tag here

* feat: fmt

* feat: Add logs test for Proactive Initialization

* Bump snowflake-connector-python to 3.0.4 (#17445)

* [CWS] remove use of internal pointer (#16731)

* Include AAS metadata in span tags (#17591)

Currently, customers need to manually set the environment variable DD_AZURE_APP_SERVICES=true in order to make the traces include AAS metadata in tags. This change will eliminate that step and include these tags by detecting if we're on AAS. The PR also removes the old logic which adds these tags based on the environment variable DD_AZURE_APP_SERVICES.

* [RCM] Add rc client in flare (#17094)

* Add rc client in flare

* Add rc listeners

* Change constructor

* Fix AGENT_TASK read

* Cleanup

* Fix lint

* Address reviews

* Add release note

* Fix CI

* Fix CI

* Add mutex

* Address review

* Address review

* [secrets][tests] properly reset secrets backend timeout after test (#17614)

* [CWS] fix prerm scripts error logs (#17383)

* Handle missing result json file (#17537)

* [CWS] move arithmetic secl test to the secl package (#17610)

* [CWS] decouple a bit AD/Profile from probe (#17131)

* [CWS] cleanup runner before running btfhub sync job (#17629)

* cleanup runner before running btfhub sync job

* bump setup-go and remove cache step (included in v4)

* CWS: sync BTFhub constants (#17633)

Co-authored-by: paulcacheux <[email protected]>

* [corechecks/snmp] Refactor Profile Config (#17618)

* [CWS] rework secprofile tryAutolearn (#17535)

* Rework secprofile autoLearn func (including 2 fixes), and add 44 unit tests around it

* Fix go lint

* Apply review suggestion

* [Fix] Agent version cache not correctly loaded in multiple CI jobs (#17606)

* http2: remove packed enum values (#17586)

Signed-off-by: Guillaume Pagnoux <[email protected]>

* Make `nettop` available (#17458)

* [CWS] support kernel with usernamespaces arguments for security functions (#17634)

* remove unused function

* remove unused function

* PoC support new userns arg

* constantify the argument position selection

* fix same name issue

* horrible hack to pass the verifier

* [CWS] add unknown source for process entry (#17636)

* [CWS] update fallback constants for recent kernels (#17639)

* fix bpf map id constant

* fix bpf map name offset

* fix bpf prog aux name offset

* move `kitchen_test_dummy_job_tmp` to k8s runners (#17641)

* [gitlab] Migration of unit tests CI jobs to k8s Gitlab runners (#17179)

Requires https://github.com/DataDog/datadog-agent-buildimages/pull/370 first.

This PR:
- updates the Linux build images used in the `datadog-agent` Gitlab CI pipelines to images that do not have an entrypoint script (required because our k8s Gitlab runner infrastructure overwrites the entrypoint of images, therefore we can't rely on it being run)
- updates all relevant CI scripts to run `source /root/.bashrc` at the very beginning, since this is not run in the entrypoint anymore
- updates all jobs in the `setup`, `deps_fetch`, `source_test`, `binary_build` stages to run on k8s runners instead of classic runners
- updates container-related unit tests to work when run in a k8s environment (thanks @L3n41c, cc @DataDog/container-integrations)
- skips a few gohai and gohai-related metadata unit tests that are failing on the arm64 rpm runner because `df` doesn't work in this specific setup, for reasons that remain to be investigated (cc @DataDog/agent-shared-components)
- adds a way to specify concurrency for `golangci-lint` invocations (see https://github.com/DataDog/datadog-agent/pull/15722 and https://github.com/DataDog/datadog-agent/pull/15762)
- fixes the `package_dependencies` jobs in the `kernel_matrix_testing` stage, which weren't using the correct `BUILDIMAGES_SUFFIX` variable

Co-authored-by: Lénaïc Huard <[email protected]>

* [gitlab] Migrate docker publish jobs to k8s runners (#17270)

Migrates docker publishing jobs to the new Kubernetes-based runners.

The docker build jobs were migrated in #15511, but the publishing jobs are still using old runners.

* Add mutex to runtime settings (#17640)

* Process BTF archive nightly (#17621)

* Minor fixes to system-probe (#17622)

* Use correct module name in restart command

* Proper PingTCP/PingUDP cleanup

* [CWS Agent] RC rules override local rules if IDs conflict (#17573)

* reverse order of policy loading

* adding PolicyProviderType consts

* move enforcement of policy provider loading into a testable func
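
One way to get the "RC wins on ID conflict" behavior named in the title can be sketched as below. This is an illustration only: the `rule` type, its fields, the source labels, and `mergeRules` are all hypothetical, and the Agent's actual policy-loading code may resolve conflicts differently.

```go
package main

import "fmt"

// rule is a hypothetical, simplified rule definition.
type rule struct {
	ID         string
	Expression string
	Source     string // illustrative labels: "remote-config" or "local"
}

// mergeRules returns one rule per ID. Remote-config rules are visited first,
// so a local rule with an ID that is already taken is dropped.
func mergeRules(local, remote []rule) []rule {
	seen := make(map[string]bool)
	merged := make([]rule, 0, len(local)+len(remote))
	for _, group := range [][]rule{remote, local} {
		for _, r := range group {
			if seen[r.ID] {
				continue // an earlier (higher-priority) provider already defined this ID
			}
			seen[r.ID] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	local := []rule{
		{ID: "rule_a", Expression: "local expression a", Source: "local"},
		{ID: "rule_b", Expression: "local expression b", Source: "local"},
	}
	remote := []rule{
		{ID: "rule_a", Expression: "remote expression a", Source: "remote-config"},
	}
	for _, r := range mergeRules(local, remote) {
		fmt.Printf("%s from %s\n", r.ID, r.Source)
	}
}
```

Keeping the merge in a small pure function like this makes the priority order easy to unit-test without loading real policies.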

* pkg/flare: add missing APM variables to envvars (#17597)

This PR adds all APM environmental variables currently being used by
the agent to the flare. Previously, some variables were missing and so
their values would not be represented when producing a flare.

* [Serverless] Use prebuilt opentelemetry lambda layers in integration tests. (#17568)

* Enable auto-instrumentation for python integration test.

* Update snapshot for otlp-python.

* Linting.

* Sort values of tag _dd.tags.container.

* Add encoding info to tailer info for the agent status verbose page (#17533)

* add encoding info to tailer info for the agent status verbose page

* push encoding information straight into tailer info

* moving adding tailer info to parser instead

* NIT

* [usm] Intern Kafka topic names (#17648)

* Add benchmark

* Intern topic name strings

* Fix data synchronization

* config/apm: fix parsing DD_APM_FEATURES (#17630)

Support either "," or " " as the separator when parsing the value of DD_APM_FEATURES. This fixes a regression introduced in #15904, which changed the separator from comma to space and was therefore a breaking change. From 7.44 to 7.46, using a space as the separator was suggested as a workaround; this PR ensures we don't break compatibility again by supporting both space and comma.
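
A minimal sketch of parsing that accepts both separators, assuming the value is a flat list of feature names. The function name `parseFeatures` and the feature names in the example are hypothetical; the Agent's config code may implement this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFeatures splits a raw feature list on commas and spaces.
// strings.FieldsFunc drops empty fields, so "a, b" and "a,,b" both
// yield two entries.
func parseFeatures(raw string) []string {
	return strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == ' '
	})
}

func main() {
	for _, raw := range []string{"feat_a,feat_b", "feat_a feat_b", "feat_a, feat_b"} {
		fmt.Printf("%q -> %q\n", raw, parseFeatures(raw))
	}
}
```

Treating both characters as separators means values written for either the pre-regression (comma) or the workaround (space) format parse to the same list.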

* [CWS] do not handle broken lineage during snapshot (#17624)

* Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)

This reverts commit 970566077529b17147e7c74129344c7b35f766b3.

* [CWS] Improve tryAutolearn unit tests by making fake events to have a valid lineage (#17657)

* [CWS] fix overlayfs inode read on kernel 5.19 and higher (#17644)

* dbg output

* xfs hacky solution to go around the 300MB limit

* PoC test fix overlayfs

* pipe constant param to select the lower inode selection

* implement kernel version check for feature detection

* small fix

* implement function probing based detection

* apply suggested review changes

* Revert "Revert "[CWS] do not handle broken lineage during snapshot (#17624)" (#17656)" (#17658)

This reverts commit 1ad88863157f73078068eab05bda4ea00ddadb58.

* Report config mutation events from the agent

* Create initial config for DDQA (#15675)

* Create initial config for DDQA

* Adding USM

* Adding NDM

* Update config.toml

* add ebpf-platform

* [tools] Adding ASC team to DDQA initial config

* Fixing USM jira project

* Add Agent Platform

* Add platform integrations team

* Add APM

* Add Remote Config

* Add Container-[Integrations-Ecosystems]

* Add Security And Compliance Agent

* Add Processes

* Change RCM issue type to QA

* Add Windows Agent

* [container-app] add qa team metadata

* Network Performance Monitoring

* Update .ddqa/config.toml - Database Monitoring

* Add Windows Kernel Integrations

* add final team Agent Integrations

* Use QA Task for Windows Agent

* [tools] Update task type for agent-shared-components QA issue generation

We now have a new task type for ASC QA operations that should be used

* final update

https://github.com/DataDog/ddqa/pull/13

* rename top-level option

* remove `changelog/no-changelog` as an ignored label

* [ddqa] exclude members for team ASC

* ddqa: AML exclude_members.

* Updating excluded devs for CI team

* Add CSPM Agent

* Exclude olivielpeau from Agent Platform QA

---------

Co-authored-by: Guy Arbitman <[email protected]>
Co-authored-by: Florian Veaux <[email protected]>
Co-authored-by: Alexander Nicholas Costas <[email protected]>
Co-authored-by: Bryce Kahle <[email protected]>
Co-authored-by: Srdjan Grubor <[email protected]>
Co-authored-by: Kylian Serrania <[email protected]>
Co-authored-by: Sarah Witt <[email protected]>
Co-authored-by: Katie Hockman <[email protected]>
Co-authored-by: Baptiste Foy <[email protected]>
Co-authored-by: Cedric Lamoriniere <[email protected]>
Co-authored-by: Paul Cacheux <[email protected]>
Co-authored-by: Moises Botarro <[email protected]>
Co-authored-by: Julien Lebot <[email protected]>
Co-authored-by: fisherevans <[email protected]>
Co-authored-by: Lee Avital <[email protected]>
Co-authored-by: Joel Marcotte <[email protected]>
Co-authored-by: Rich Lancia <[email protected]>
Co-authored-by: Pierre Gimalac <[email protected]>
Co-authored-by: Remy Mathieu <[email protected]>
Co-authored-by: Kacper <[email protected]>
Co-authored-by: David du Colombier <[email protected]>
Co-authored-by: Alexandre Menasria <[email protected]>

* dump silent workloads (#17412)

* [CWS] constantify `vm_flags` access in `vm_area_struct` (#17662)

* constantify `vm_flags` access in `vm_area_struct`

* re-gen constants

* regression detector: change baseline variant from latest main to merge base (#17449)

* regression detector: add compute merge base job

We need to compute the merge base of a non-`main` branch with respect
to `main` in order to establish the commit SHA of a baseline variant
in regression detection. This commit adds a job to compute that merge
base, echo the result to a file (sans newline), and then upload that
result to the Single-Machine Performance S3 bucket for Agent team.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: note build step we may need

The existing Single-Machine Performance (SMP) regression detector
setup doesn't build additional containers -- it just publishes
containers already built in Agent CI to SMP's ECR for Agent
images. I've written the new "merge-base" container assuming we can
continue with that strategy, but if we can't continue with that
strategy for some reason, then this commit includes a bunch of
comments sketching out a backup plan that would build the container we
would need and publish it to SMP's ECR for Agent images.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: sketch functional test tweaks

As in the spirit of the previous commit, this commit adds some
comments on how to transition the existing setup in
`functional_test/regression_detector.yml` from using the latest commit
on `main` to using a "merge base" baseline SHA.

There are a few parts of this commit that are actually functional, but
shouldn't functionally alter the existing regression detector output
-- it still uses a "latest `main`" baseline SHA -- but it does
introduce the idea of getting the new baseline SHA from an artifact in
a previous stage, rather than uploading it to S3 (although I also
upload the merge base baseline SHA to S3 as well, as a contingency
plan).

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: use literal for image

I thought using a `!reference` tag would work for specifying an image
for the merge-base-computing job, but that mental model may be wrong,
so this commit replaces that tag with the literal it's supposed to
(de)reference.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: comment out merge base job

The merge base job isn't running because no runners are configured to
run it. This issue could be one that I can't resolve, so it could be
that I need to get permissions to run it. For now, I'll comment out
this job and instead try and get the information necessary to compute
a baseline as part of the regression detector job, but without
otherwise altering the regression detector behavior.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: compute merge base

As a test of whether I can compute the merge base using a `git fetch`
command, this commit adds that command (and other supporting commands)
to the regression detector job to see if these commands will work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* smp, docker_linux.yml: add container build notes

I've managed to figure out how to get the most important case to work
-- the one in which the base branch of the pull request (i.e., in
GitLab terms, the target branch of the merge request) is
`main`. Having managed to get this case to work, this commit updates
my implementation notes to reflect how to implement the case when the
base branch of the pull request *is not* `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: set baseline to merge base

It looks like creating a separate job (at least in
`.gitlab/container_build/docker_linux.yml`) will not work because
GitLab won't run it in a runner. This behavior may be a configuration
setting for security reasons, and may require repo admin privileges to
change. Instead, this commit implements setting the baseline SHA to
the merge base, along with some commands to abort the regression
detector job if the base branch of the pull request (target branch of
the merge request) is not `main`, and copies the merge base SHA to S3
for debugging/auditing purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove obsolete comments

Now that I'm pretty sure I've figured out how to change the baseline
SHA to the merge base of a pull request, this commit deletes most of
the commented-out lines I introduced into
`.gitlab/functional_test/regression_detector.yml` as notes to
myself. These comments are no longer necessary for record-keeping.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: correct line comment

This commit corrects a line comment in which I write "Use this line if
comparing `main` against itself makes sense" when I should have
written "Use this line if comparing `main` against itself does not
make sense".

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: make if block a one-liner

GitLab apparently doesn't echo multiline commands by default, so this
commit rewrites this multiline `if` block as a one line command for
debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: remove backticks

I'm pretty sure that the "command `main` not found" message I saw was
because the shell was likely interpreting those backticks as running a
command. This commit replaces the backticks with single quotes to
avoid having the shell attempt to run a command called `main`.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: disable base branch check

This commit temporarily disables the check for the base branch because
it doesn't quite seem to work at the moment, and it adds enough
debugging output so I can get some idea of why that check does not
work.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression_detector.yml: add artifacts + cleanup

This commit cleans up the last bits of debugging comments and code in
`.gitlab/functional_test/regression_detector.yml`, plus it adds the
reporting outputs as artifacts to provide additional diagnostic output
for debugging purposes.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* docker_linux.yml: remove comment cruft

This commit removes a large comment block I put in
`.gitlab/container_build/docker_linux.yml` because I no longer think
it's a good idea to compute the merge base commit in a job separate
from the regression detector job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: compute baseline via loop

The previous regression detector baseline SHA computation assumed that
an image exists in ECR for the commit returned by `$(git
merge-base HEAD main)`, *i.e.*, the merge base with `main`. However,
the container associated with that commit may not exist because that
container may have failed to build, or may have failed to upload to
ECR, so we need to check whether that container exists in ECR. If that
container exists, then the merge base with `main` is the baseline
SHA. If not, then we must iterate over predecessor commits in `main`
until we find a commit in `main` for which a container exists in
ECR. The first commit we find in this loop becomes the baseline SHA
for the regression detector.

This commit implements that check, along with a loop that iterates
over predecessor commits, if necessary, as described above. The
implementation is currently a rather verbose one-liner because
single-line statements are easier to debug in GitLab CI. Once this
implementation succeeds, a subsequent commit will clean up the
implementation by changing it to a multi-line statement, after which
this branch will be ready for review.
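
The search described above can be sketched as follows. This is not the CI implementation, which is a shell one-liner: `findBaseline` and `imageExists` are hypothetical names, the commit list is assumed to be ordered with the merge base first followed by its predecessors on `main`, and `imageExists` stands in for whatever query checks ECR for an image tagged with that SHA.

```go
package main

import (
	"errors"
	"fmt"
)

// findBaseline returns the first commit in commits for which an image exists.
// commits is assumed to start at the merge base and continue through its
// predecessors on main, newest first.
func findBaseline(commits []string, imageExists func(sha string) bool) (string, error) {
	for _, sha := range commits {
		if imageExists(sha) {
			return sha, nil
		}
	}
	return "", errors.New("no commit with a published image found")
}

func main() {
	// Hypothetical data: the merge base "c5" and "c4" have no image,
	// for example because their container build or upload failed.
	published := map[string]bool{"c3": true, "c1": true}
	commits := []string{"c5", "c4", "c3", "c2", "c1"}

	sha, err := findBaseline(commits, func(s string) bool { return published[s] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("baseline SHA:", sha)
}
```

Returning an error when the list is exhausted gives the job an explicit failure instead of silently running the regression detector against a missing image.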

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: reduce stderr noise

The initial implementation of the container existence check turns out
to be pretty noisy when a container doesn't exist because `awscli`
will output a bunch of error information on failure. While this
information is helpful for debugging purposes, it will be annoying in
CI, so this commit redirects the `stderr` of that command to
`/dev/null` to reduce noise in the CI output of the regression
detector CI job.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: fix copy before write error

The regression detector job in the Agent CI pipeline currently fails
because it attempts to copy a file that doesn't exist to S3. This
commit fixes that error by moving the copy command to a point after
the file is generated.

Signed-off-by: Geoffrey M. Oxberry <[email protected]>

* regression detector: tweak ecr query for debugging

For some reason, the `aws ecr describe-images` command is not finding
images that I know exist. To aid in debugging at the expense of noise,
this commit removes redirection of `stderr` to `/dev/null`, adds the
`--registry-id` flag to be more explicit about ECR repo location, and
also moves the `--profile` flag and its argument to a more readable
location.

Si…
guyarb pushed a commit that referenced this pull request Aug 14, 2023
Labels
[deprecated] team/agent-platform major_change Complex/large change, which significantly modifies agent behavior or could impact many agent teams team/agent-metrics-logs team/integrations