Releases: DataDog/datadog-agent

7.61.0

13 Jan 09:25
7.61.0
202f54b

Agent

Prelude

Release on: 2025-01-13

Upgrade Notes

  • Upgraded JMXFetch to 0.49.6, which fixes a NullPointerException on JBoss when user and password are not set. See 0.49.6 for more details.
  • Windows containers were updated to use OpenJDK 11.0.25+9.

New Features

  • Add metrics origins for Nvidia Nim integration.
  • APM: New configuration apm_config.obfuscation.credit_cards.keep_values (DD_APM_OBFUSCATION_CREDIT_CARDS_KEEP_VALUES) can be used to skip specific tag keys that are known to never contain credit card numbers. This is especially useful in cases where a span tag value is a number that triggers false positives from the credit card obfuscator.
  • Add new metric, container.restarts, which indicates the number of times a container has been restarted due to the restart policy. For more details: https://docs.docker.com/engine/containers/start-containers-automatically/.
  • APM: Introducing the Error Tracking Standalone config option. Only span chunks that contain errors or exception OpenTelemetry span events are taken into consideration by sampling.
  • Add new Windows images for LTSC 2019 and LTSC 2022:
    • datadog-agent:7-servercore-ltsc2019-amd64
    • datadog-agent:7-servercore-ltsc2022-amd64
    • datadog-agent:7-servercore-ltsc2019-jmx-amd64
    • datadog-agent:7-servercore-ltsc2022-jmx-amd64
    • datadog-agent:latest-servercore-ltsc2019-jmx
    • datadog-agent:latest-servercore-ltsc2022-jmx
    • datadog-agent:latest-servercore-ltsc2019
    • datadog-agent:latest-servercore-ltsc2022
    • datadog-agent:7.X.Y-ltsc2019
    • datadog-agent:7.X.Y-ltsc2022
    • datadog-agent:7.X.Y-ltsc2019-jmx
    • datadog-agent:7.X.Y-ltsc2022-jmx
    • datadog-agent:7.X.Y-servercore-ltsc2019
    • datadog-agent:7.X.Y-servercore-ltsc2022
    • datadog-agent:7.X.Y-servercore-ltsc2019-jmx
    • datadog-agent:7.X.Y-servercore-ltsc2022-jmx
    • datadog-agent:latest-ltsc2019
    • datadog-agent:latest-ltsc2022
  • [ha-agent] Add haagent component used for HA Agent feature.
  • The cluster-agent now can collect pod disruption budgets from the cluster.
  • Added support for collecting container image metadata when running on a CRI-O runtime.
  • USM now monitors TLS traffic encrypted with Go TLS by default. To disable this feature, set the service_monitoring_config.tls.go.enabled configuration option to false.
  • USM now monitors traffic encrypted with Istio mTLS by default. To disable this feature, set the service_monitoring_config.tls.istio.enabled configuration option to false.
  • Introduced a new configuration variable logs_config.http_protocol, allowing users to enforce HTTP/1.1 for outgoing HTTP connections in the Datadog Agent. This provides better control over transport protocols and improves compatibility with systems that do not support HTTP/2. By default, the log agent will now attempt to use HTTP/2 (unless a proxy is configured) and fall back to the best available protocol if HTTP/2 is not supported.
  • Added a new feature flag enable_operation_and_resource_name_logic_v2 in DD_APM_FEATURES. Enabling this flag modifies the logic for computing operation and resource names from OTLP spans to produce shorter, more readable names and improve alignment with OpenTelemetry specifications.
  • Add support for PHP Single Step Instrumentation in Kubernetes (not enabled by default)
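
To illustrate why apm_config.obfuscation.credit_cards.keep_values is useful, the sketch below shows a Luhn-style check flagging a numeric tag value and a keep list exempting a specific tag key. This is a simplified stand-in, not the Agent's obfuscator: the function names, the "?" placeholder, and the 12-digit minimum are assumptions made for the example.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 12:  # assumed minimum length for a card-like number
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def obfuscate_tags(tags: dict, keep_values: set) -> dict:
    """Replace card-like tag values with "?", skipping keys listed in keep_values."""
    out = {}
    for key, value in tags.items():
        if key not in keep_values and luhn_valid(str(value)):
            out[key] = "?"
        else:
            out[key] = value
    return out
```

With keep_values containing a hypothetical key such as "order.id", a numeric order ID that happens to pass the checksum is left untouched, while the same value under any other key is still masked.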

Enhancement Notes

  • [ha-agent] Run HA enabled integrations only on leader Agent
  • [ha-agent] Add agent_group tag to datadog.agent.running metric
  • Cluster Agent: DatadogAgent custom resource, cluster Agent deployment, and node Agent daemonset manifests are now added to the flare archive when the Cluster Agent is deployed with the Datadog Operator (version 1.11.0+).
  • Add new host tag provider_kind from the value of DD_PROVIDER_KIND for Agents running in GCE.
  • Add query_timeout to customize the timeout for queries in the Oracle check. Previously, this was fixed at 20,000 seconds.
  • Add the ability to show the Agent telemetry payloads that the Agent would send when telemetry is enabled. Run the following command: agent diagnose show-metadata agent-telemetry. See the docs (https://docs.datadoghq.com/data_security/agent/#telemetry-collection) for more details.
  • Convert Prometheus style Counters and Histograms used in Agent telemetry from monotonically increasing to non-monotonic values (reset on each scrape). In addition de-accumulate Prometheus Histogram bucket values on each scrape.
  • Added support for more than 100 Aurora clusters in a user's account when using database autodiscovery
  • Adds some information about the SNMP autodiscovery status in the Agent status.
  • Adds a dedicated CRI-O Workloadmeta collector, enabling metadata collection for containers running on a CRI-O runtime.
  • Enables a cache for SQL and MongoDB obfuscation. This cache is enabled by default but can be disabled by setting apm_config.obfuscation.cache.enabled to false.
  • Improved logging to add visibility for latency and transport protocol
  • Add a new configuration option log_level for commands where the logger is disabled by default.
  • Adds initial Windows support for TCP probes in Network Path.
  • Query Aurora instances per cluster to allow up to 100 instances per cluster rather than 100 instances total.
  • The AWS Lambda Extension is now able to read the full 128-bit trace ID from the headers of the end-invocation HTTP request made by dd-trace or the datadog-lambda-go library.
  • Standardized cluster check tagging across all environments, allowing DD_TAGS, DD_EXTRA_TAGS, DD_CLUSTER_CHECKS_EXTRA_TAGS, and DD_ORCHESTRATOR_EXPLORER_EXTRA_TAGS to apply to all cluster check data when operating on the Cluster Agent, Node Agent, or Cluster Checks Runner.
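
The Agent telemetry change above (counters reset on each scrape, histogram buckets de-accumulated) can be pictured with the following minimal sketch. It assumes cumulative Prometheus-style inputs and uses hypothetical helper names. It is not the Agent's code.

```python
def counter_delta(previous: float, current: float) -> float:
    """Value accrued since the previous scrape; a drop is treated as a counter reset."""
    return current if current < previous else current - previous


def deaccumulate_buckets(cumulative: list) -> list:
    """Turn cumulative (upper_bound, count) pairs into per-bucket counts."""
    result = []
    prev = 0
    for upper_bound, count in sorted(cumulative):
        result.append((upper_bound, count - prev))
        prev = count
    return result
```

For example, cumulative buckets [(0.1, 2), (0.5, 5), (inf, 6)] describe 2 observations up to 0.1, 3 more between 0.1 and 0.5, and 1 above 0.5, which is what the second helper returns.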

Deprecation Notes

  • Deprecates the apm_config.obfuscation.sql.cache option in favor of apm_config.obfuscation.cache.
  • Remove deprecated config otlp_config.metrics.instrumentation_library_metadata_as_tags. Use otlp_config.metrics.instrumentation_scope_metadata_as_tags instead.
  • The remote tagger will attempt to connect to the core agent indefinitely until it is successful. The remote_tagger_timeout_seconds configuration is removed, and the timeout is no longer configurable.
  • The remote tagger for the trace-agent and security-agent is now always enabled and cannot be disabled. The apm_config.remote_tagger, security_agent.remote_tagger, and event_monitoring_config.remote_tagger config entries are removed.

Security Notes

Bug Fixes

  • Cluster Agent: Don't overwrite the LD_PRELOAD environment variable if it's already set, append the path to Datadog's injection library instead.
  • Fix an issue where the remote workloadmeta was not receiving some unset events for ECS containers, causing incorrect billing in CWS, CSPM, CSM Pro, CSM Enterprise, and DevSecOps Enterprise Containers.
  • Corrects the method call for gauges to be Set instead of Add.
  • Fix Oracle execution plan collection failures caused by an out-of-range position column, which can occur if the execution plan is excessively large.
  • Fix excessive number of rows coming from active session history.
  • OTLP ingestion: Stop prefixing http_server_duration, http_server_request_size and http_server_response_size with otelcol.
  • Fixes the issue of disabled services producing an error message in the event log on start. Now produces an informational message.
  • Change kubernetes.memory.working_set and kubernetes.memory.usage metrics to be of type gauge instead of rate.

Other Notes

  • Add metric origins for Platform Integrations: Fly.io, Kepler, Octopus Deploy, and Scaphandre.
  • Extend Agent Telemetry to start reporting logs.sender_latency metric.
  • The enable_receive_resource_spans_v2 flag now defaults to true in Converged Agent. This enables the refactored version of the OTLP span receiver in trace agent, improves performance by 10%, and deprecates the following functionality:
    • No longer checks for information about the resource in HTTP headers (ContainerID, Lang, LangVersion, Interpreter, LangVendor).
    • No longer checks for resource-related values (contai...

7.60.1

19 Dec 16:16
6466186

Agent

7.60.1

Prelude

Release on: 2024-12-19

Security Notes

Datadog Cluster Agent

7.60.1

Prelude

Released on: 2024-12-19 Pinned to datadog-agent v7.60.1: CHANGELOG.

7.60.0

16 Dec 10:37
799e298

Datadog Agent

Release Notes

7.60.0

Prelude

Release on: 2024-12-16

Upgrade Notes

    • Parameter peer_tags_aggregation (a.k.a. environment variable DD_APM_PEER_TAGS_AGGREGATION) is now enabled by default. This means that aggregation of peer related tags (e.g., peer.service, db.instance, etc.) now happens in the Agent, which enables statistics for Inferred Entities. If you want to disable this feature, set peer_tags_aggregation to false in your Agent configuration.

    • Parameter compute_stats_by_span_kind (a.k.a. environment variable DD_APM_COMPUTE_STATS_BY_SPAN_KIND) is now enabled by default. This means spans with an eligible span.kind will have stats computed. If disabled, only top-level and measured spans will have stats computed. If you want to disable this feature, set compute_stats_by_span_kind to false in your Agent configuration.

      Note: When using peer_tags_aggregation and compute_stats_by_span_kind, a high cardinality of peer tags or APM resources can contribute to higher CPU and memory consumption. If enabling both causes the Agent to consume too many resources, try disabling compute_stats_by_span_kind first.

    It is recommended that you update your tracing libraries according to the instructions here and set DD_TRACE_REMOVE_INTEGRATION_SERVICE_NAMES_ENABLED (or dd.trace.remove.integration-service-names.enabled) to true.

  • Upgraded JMXFetch to 0.49.5 which adds support for UnloadedClassCount metric and IBM J9 gc metrics. See 0.49.5 for more details.

New Features

  • Inferred Service dependencies are now Generally Available (exiting Beta) and enabled by default. Inferred Services of all kinds now have trace metrics and are available in dependency maps. apm_config.peer_tags_aggregation and apm_config.compute_stats_by_span_kind both now default to true unless explicitly set to false.

  • Add the check_tag_cardinality parameter to the check configuration.

    By default, check_tag_cardinality is not set, which leaves the behavior of the checks unchanged. Once it is set in pod annotations, it overrides the cardinality value provided in the base Agent configuration. Example of usage:

ad.datadoghq.com/redis.checks: |
  {
    "redisdb": {
      "check_tag_cardinality": "high",
      "instances": [
        {
          "host": "%%host%%",
          "port": "6379"
        }
      ]
    }
  }
  • Added a new feature flag enable_receive_resource_spans_v2 in DD_APM_FEATURES that gates a refactored implementation of ReceiveResourceSpans for OTLP.

Enhancement Notes

  • Added information about where the Agent sourced BTF data for eBPF to the Agent flare. When applicable, this will appear in system-probe/ebpf_btf_loader.log.
  • The Agent flare now returns NAT debug information from conntrack in the system-probe directory.
  • The flare subcommand includes a --provider-timeout option to set a timeout for each file collection (default is 10s), useful for unblocking slow flare creation.
  • This change reduces the number of DNS queries made by Network Traffic based paths in Network Path. A cache of reverse DNS lookups is used to reduce the number of DNS queries. Additionally, reverse DNS lookups are now performed only for private IPs and not for public IPs.
  • Agent flare now includes system-probe telemetry data via system-probe/system_probe_telemetry.log.
  • The MSI installer uses 7zr.exe to decompress the embedded Python.
  • On Windows, the endpoint /windows_crash_detection/check has been modified to report crashes in an asynchronous manner, to allow processing of large crash dumps without blocking or timing out. The first check will return a busy status and continue to do so until the processing is completed.
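
The Network Path change above (cache reverse DNS results, and look up private IPs only) follows a common pattern that can be sketched as below. The class name, the injected resolver callable, and the use of Python's ipaddress is_private as the definition of "private" are all assumptions for illustration. The Agent's actual implementation may differ.

```python
import ipaddress


class ReverseDNSCache:
    """Caches reverse lookups and skips public addresses entirely."""

    def __init__(self, resolver):
        self._resolver = resolver  # hypothetical callable: IP string -> hostname
        self._cache = {}
        self.lookups = 0  # number of real resolver calls made

    def hostname(self, ip: str):
        # Public IPs are never resolved in this sketch.
        if not ipaddress.ip_address(ip).is_private:
            return None
        if ip not in self._cache:
            self.lookups += 1
            self._cache[ip] = self._resolver(ip)
        return self._cache[ip]
```

Repeated calls for the same private address reach the resolver once. Calls for a public address never reach it.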

Deprecation Notes

  • Prebuilt eBPF for the network tracer system-probe module has been deprecated in favor of CO-RE and runtime compilation variants on Linux kernel versions 6+ and RHEL kernel versions 5.14+. To continue to use the prebuilt eBPF network tracer, set system_probe_config.allow_prebuilt_fallback in the system-probe config file, or set the environment variable DD_ALLOW_PREBUILT_FALLBACK, to true on these platforms.
  • The feature flag service_monitoring_config.enable_http_stats_by_status_code was deprecated and removed. No impact on USM's behavior.

Bug Fixes

  • Fixes an issue added in 7.50 that causes the Windows event log tailer to drop events if it cannot open their publisher metadata.
  • Fix a bug in the config parser that prevented ignored_ip_addresses from working in NDM Autodiscovery.
  • Fixes host tags with a configurable duration so the metric's context hash doesn't change, preventing the aggregator from mistaking it as a new metric.
  • Fix could not parse voltage fields error in Nvidia Jetson integration when tegrastats output contains mW units.
  • Fix building of Python extension containing native code.
  • [oracle] Fix broken activity sampling with an external Oracle client.
  • Fix nil pointer error on Oracle DBM query when the check's connection is lost before SELECT statement executes.
  • Fix a regression that caused the Agent to not be able to run if its capabilities had been modified with the setcap command.
  • Fix a bug where truncated single-line logs ending with whitespace characters were not tagged as truncated. Fix an issue where the truncation message occasionally caused subsequent logs to be treated as truncated when they were not (single-line logs only).

Datadog Cluster Agent

Release Notes

7.60.0

Prelude

Released on: 2024-12-16 Pinned to datadog-agent v7.60.0: CHANGELOG.

Bug Fixes

  • Fixes bug where incorrect timestamp would be used for unbundled Kubernetes events.
  • Fixed an issue in the KSM check when it's configured with the option pod_collection_mode set to node_kubelet. Previously, the check could fail to start if there was a timeout while contacting the API server. This issue has now been resolved.

7.59.1

03 Dec 13:53
3638fcd

Prelude

Release on: 2024-12-02

Enhancement Notes

7.59.0

07 Nov 11:29
b97c906

Agent

Prelude

Release on: 2024-11-07

Upgrade Notes

  • Removed the deprecated config option otlp_config.debug.loglevel in favor of otlp_config.debug.verbosity:
    • loglevel: debug maps to verbosity: detailed
    • loglevel: info maps to verbosity: normal
    • loglevel: warn/error maps to verbosity: basic
    • loglevel: disabled maps to verbosity: none
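
For configurations that still carry the removed option, the mapping above can be applied mechanically. The helper below is a small sketch that operates on a plain dictionary standing in for the otlp_config.debug section. The function name is hypothetical, and the sketch assumes an explicit verbosity value should win if both keys are present.

```python
LOGLEVEL_TO_VERBOSITY = {
    "debug": "detailed",
    "info": "normal",
    "warn": "basic",
    "error": "basic",
    "disabled": "none",
}


def migrate_debug_config(debug: dict) -> dict:
    """Return a copy of the debug section with loglevel replaced by verbosity."""
    migrated = {k: v for k, v in debug.items() if k != "loglevel"}
    # Assumption: keep an existing verbosity value rather than overwrite it.
    if "loglevel" in debug and "verbosity" not in debug:
        migrated["verbosity"] = LOGLEVEL_TO_VERBOSITY[debug["loglevel"]]
    return migrated
```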

New Features

  • Add ability to run process/container collection on the core Agent (Linux only). This is controlled by the process_config.run_in_core_agent.enabled option in datadog.yaml.
  • DBM: Add configuration options to SQL obfuscator to customize the obfuscation of SQL statements:
    • KeepJSONPath - option to control whether JSON paths following JSON operators in SQL statements should be obfuscated. This option is only valid when ObfuscationMode is obfuscate_and_normalize.
  • APM: Add new 'sqllexer' feature flag for the Trace Agent, which enables the sqllexer implementation of the SQL Obfuscator.
  • Introduce new Kubernetes tag gpu_vendor for the GPU resource requested by a container.

Enhancement Notes

  • Added additional Agent telemetry metrics for the log tailer code flow: logs.bytes_sent, logs.encoded_bytes_sent, and logs.bytes_missed

  • Datadog may collect environmental, performance, and feature usage information about the Datadog Agent. This may include diagnostic logs and crash dumps of the Datadog Agent with obfuscated stack traces to support and further improve the Datadog Agent.

    More details can be found in the docs.

  • APM: Updates peer tags for peer.db.system.

  • Agents are now built with Go 1.22.8.

  • While using the AWS Lambda Extension, when a Lambda Function is invoked by a properly instrumented Step Function (see "Install Serverless Monitoring for AWS Step Functions": https://docs.datadoghq.com/serverless/step_functions/installation/?tab=custom), the Lambda Function will create its Trace and Parent IDs deterministically based on the Step Function's execution context.

  • Updates default .NET library used for auto-instrumentation from v2 to v3

  • The system-probe selinux policy is now installed on Oracle Linux

  • Increases the default input channel, processing channel, and context store sizes for network traffic paths.

  • Adds support for file log collection from Podman rootless containers when logs_config.use_podman_logs is set to true and podman_db_path is configured.

  • Allow Python integrations to emit Agent telemetry data.

Security Notes

  • Update OpenSSL to 3.3.2 (on Linux & macOS) in order to mitigate CVE-2024-6119.

Bug Fixes

  • Fixes the default configuration template to include the Cloud Security Management configuration options.
  • Fixes a bug introduced in 7.55 where, in some specific scenarios, checks associated with a deleted container or pod would keep running until the Agent was restarted.
  • Fix the forwarder health check so that it reports unhealthy when the API key is invalid.
  • Fix the removal of 'non-core' integrations during Agent upgrades.
  • Fix Process Agent argument scrubbing to allow scrubbing of quoted arguments.
  • Fix Orchestrator argument scrubbing to allow scrubbing of quoted arguments.
  • Fixes an issue where TCP traceroute latency was not being calculated correctly.
  • Fixes the telemetry type for Oracle metrics.
  • APM: Fix obfuscation of SQL queries containing non-numeric prepared statement variables.

Other Notes

  • Adds Postgres integration metrics to cross-org telemetry whitelist.
  • The Agent is now built with a custom toolchain that targets our minimally supported glibc version (2.17 on x86_64 and 2.23 on aarch64)
  • On Windows, the TCP socket transport mechanism for system probe communications has been replaced with a named pipe. This deprecates the system_probe_config.sysprobe_socket configuration entry for Windows. The new fixed named pipe path is \\.\pipe\dd_system_probe.

7.58.2

04 Nov 14:01
4ad1243

Prelude

Release on: 2024-11-04

Bug Fixes

  • Use of cloud-provided hostname as default when running the Agent in AKS introduced in 7.56.0 is reverted due to cases where the hostname returned is non-unique. This feature will be fixed and added again in a future release.

7.58.1

24 Oct 14:47
6f52a65

Agent

Prelude

Release on: 2024-10-24

Enhancement Notes

  • Removes a log statement which was causing a lot of noise in the Network Path logs.

Bug Fixes

  • [CWS] Fixes an issue where the cws-instrumentation trace command could panic before launching the traced executable when running on AWS Fargate.
  • [CWS] Fixes an issue where ECS Fargate tags would not be resolved correctly on CWS events.
  • Fixes an error in system-probe triggered by packet capture in environments with multiple VLANs.
  • Fix USM's GO-TLS support for Golang 1.23

7.58.0

21 Oct 09:25
cf39839

Agent

Prelude

Release on: 2024-10-21

Upgrade Notes

  • Changes behavior of the timeout for Network Path. Previously, the timeout signified the total time to wait for a full traceroute to complete. Now, the timeout signifies the time to wait for each hop in the traceroute. Additionally, the default timeout has been changed to 1000ms.
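
A rough way to reason about the per-hop timeout: if hops are probed one after another and every hop times out, the total wait grows with the hop count. The sequential-probing model and the max_hops value of 30 are assumptions for illustration, not documented Agent defaults. Only the 1000ms per-hop default comes from the note above.

```python
def worst_case_traceroute_ms(per_hop_timeout_ms: int = 1000, max_hops: int = 30) -> int:
    """Upper bound on traceroute duration when every hop waits for the full timeout."""
    return per_hop_timeout_ms * max_hops
```

Under these assumptions, a traceroute that previously had a single overall deadline could now take up to per_hop_timeout_ms * max_hops in the worst case.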

New Features

  • Added capability to tag any Kubernetes resource based on labels and annotations. This feature can be configured with kubernetes_resources_annotations_as_tags and kubernetes_resources_labels_as_tags. These configurations associate group resources with an annotations-to-tags (or labels-to-tags) map. For example, pods can be associated with an annotations-to-tags map to configure annotations as tags for pods. Example: {`pods`: {`annotationKey1`: tag1, `annotationKey2`: tag2}}
  • The Kubernetes State Metrics (KSM) check can now be configured to collect pods from the Kubelet in node agents instead of collecting them from the API Server in the Cluster Agent or the Cluster check runners. This is useful in clusters with a large number of pods where emitting pod metrics from a single check instance can cause performance issues due to the large number of metrics emitted.
  • NPM - adds UDP "Packets Sent" and "Packets Received" to the network telemetry in Linux.
  • [oracle] Add the active_session_history configuration parameter to optionally ingest Oracle active session history samples instead of query sampling.
  • Added config option logs_config.tag_truncated_logs. When enabled, file logs will come with a tag truncated:true if they were truncated by the Agent.
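
The annotations-to-tags map described for kubernetes_resources_annotations_as_tags can be read as a two-level lookup: first by group resource, then by annotation key. The sketch below shows that lookup in isolation. The function name and the "tag:value" output format are assumptions for the example and may not match how the Agent emits tags.

```python
def annotations_to_tags(config: dict, resource: str, annotations: dict) -> list:
    """Build tags for one resource from its annotations using the configured map."""
    mapping = config.get(resource, {})
    return sorted(
        f"{tag}:{annotations[key]}"
        for key, tag in mapping.items()
        if key in annotations
    )
```

With the configuration {"pods": {"annotationKey1": "tag1", "annotationKey2": "tag2"}}, a pod annotated with annotationKey1=a yields the single tag tag1:a. Resources without an entry in the map yield no tags.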

Enhancement Notes

  • [DBM] Bump go-sqllexer to 0.0.14 to skip collecting CTE tables as SQL metadata.
  • Agents are now built with Go 1.22.7.
  • Add the ability to tag cisco-sdwan device and interface metrics with user-defined tags.
  • Add support for setting a custom log source from resource attribute or log attribute datadog.log.source.
  • The default UDP port for traceroute (port 33434) is now used for Network Traffic based paths, instead of the port detected by NPM.
  • [oracle] Add oracle_client_lib_dir config parameter.
  • [oracle] Increase tablespace check interval from 1 to 10 minutes.
  • [oracle] Don't try to fetch execution plans where plan_hash_value is 0
  • The OTLP ingest endpoint now maps the new OTel semantic convention deployment.environment.name to env
  • Prevents the use of the process_config.run_in_core_agent.enabled configuration option in unsupported environments.
  • APM: Trace payloads are now compressed with zstd by default.

Security Notes

Bug Fixes

  • Adds missing support for the logs config key to work with AD annotations V2.
  • Fix agent jmx [command] subcommands for container environments with annotations-based configs.
  • Fixed issue with openSUSE 15 RC 6 where the eBPF tracer wouldn't start due to a failed validation of the tcp_sendpage probe.
  • Fixed a rare issue where short-lived containers could cause logs to be sent with the wrong container ID.
  • Fix Windows Process Agent argument stripping to account for spaces in the executable path.
  • Fixes issue with the kubelet corecheck where kubernetes.kubelet.volume.* metrics were not properly being reported if any matching namespace exclusion filter was present.
  • OOM Kill Check now reports the cgroup name of the victim process rather than the triggering process.
  • The process agent will no longer exit prematurely when language detection is enabled or when there is a misconfiguration stemming from process_config.run_in_core_agent.enabled's default enablement in Kubernetes.
  • Change the datadog-security-agent Windows service display name from Datadog Security Service to Datadog Security Agent for consistency with other Agent services.
  • Fix a bug preventing SNMP V3 reconnection.

Other Notes

  • Add metric origins for the Kubeflow integration.
  • Add functional tests to Oracle using a Docker service to host the database instance.
  • Adds Agent telemetry for Oracle collector.

Datadog Cluster Agent

Prelude

Released on: 2024-10-21 Pinned to datadog-agent v7.58.0: CHANGELOG.

New Features

  • Added capability to tag any Kubernetes resource based on labels and annotations. This feature can be configured with kubernetes_resources_annotations_as_tags and kubernetes_resources_labels_as_tags. These configurations associate group resources with an annotations-to-tags (or labels-to-tags) map. For example, deployments.apps can be associated with an annotations-to-tags map to configure annotations as tags for deployments. Example: {`deployments.apps`: {`annotationKey1`: tag1, `annotationKey2`: tag2}}
  • The Kubernetes State Metrics (KSM) check can now be configured to collect pods from the Kubelet in node agents instead of collecting them from the API Server in the Cluster Agent or the Cluster check runners. This is useful in clusters with a large number of pods where emitting pod metrics from a single check instance can cause performance issues due to the large number of metrics emitted.

Enhancement Notes

  • Added a new option for the Cluster Agent ("admission_controller.inject_config.type_socket_volumes") to specify that injected volumes should be of type "Socket". This option is disabled by default. When set to true, injected pods will not start until the Agent creates the DogstatsD and trace-agent sockets. This ensures no traces or DogstatsD metrics are lost, but it can cause the pod to wait if the Agent has issues creating the sockets.

Bug Fixes

  • Fixed an issue that prevented the Kubernetes autoscaler from evicting pods injected by the Admission Controller.

7.57.2

24 Sep 13:31
38ba0c7

Prelude

Release on: 2024-09-24

Enhancement Notes

  • Agents are now built with Go 1.22.7.

Bug Fixes

  • Fix OOM error with cluster agent auto instrumentation by increasing default memory request from 20Mi to 100Mi.
  • Fixes a panic caused by running the Agent on readonly filesystems. The Agent returns integration launchers and handles memory gracefully.

7.57.1

18 Sep 11:35
7.57.1
50bedc2

Agent

7.57.1

Prelude

Release on: 2024-09-17

Bug Fixes

  • APM: When the UDS listener cannot be created on the trace-agent, the process will log the error, instead of crashing.
  • Fixes memory leak caused by container check.

Datadog Cluster Agent

7.57.1

Prelude

Released on: 2024-09-17 Pinned to datadog-agent v7.57.1: CHANGELOG.