Releases: DataDog/datadog-agent
7.22.0
Release Notes
7.22.0
Prelude
Release on: 2020-08-25
- Please refer to the 7.22.0 tag on integrations-core for the list of changes on the Core Checks
New Features
- Implements agent-side compliance rule evaluation in security agent using expressions.
- Add IO operations monitoring for Docker check (docker.io.read/write_operations)
- Track TCP connection churn on system-probe
- The new Runtime Security Agent collects file integrity monitoring events. It is disabled by default and only available for Linux for now.
- Make security-agent part of automatically started agents in RPM/DEB/etc. packages (will do nothing and exit 0 by default)
- Add support for receiving and processing SNMP traps, and forwarding them as logs to Datadog.
- APM: A new trace ingestion endpoint was introduced at /v0.5/traces which supports a more compact payload format, greatly improving resource usage. The spec for the new wire format can be viewed here. Tracers supporting this change will automatically use the new endpoint.
Enhancement Notes
- Adds a gauge for system.mem.slab_reclaimable. This is part of slab memory that might be reclaimed (i.e. caches). Datadog 7.x adds SReclaimable memory, if available on the system, to the system.mem.cached gauge by default. This may lead to inconsistent metrics for clients migrating from Datadog 5.x, where system.mem.cached didn't include SReclaimable memory. Adding a gauge for system.mem.slab_reclaimable allows inverse calculation to remove this value from the system.mem.cached gauge.
- Expand GCR pause container image filter
- Kubernetes events for pods, replicasets and deployments now have tags that match the metrics metadata. Namely, pod_name, kube_deployment, kube_replicas_set.
- Enabled the collection of the kubernetes resource requirements (requests and limits) by bumping the agent-payload dep. and collecting the resource requirements.
- Implements resource fallbacks for complex compliance check assertions.
- Add system.cpu.num_cores metric with the number of CPU cores (windows/linux)
- compliance: Add support for Go custom compliance checks and implement two for CIS Kubernetes
- Make DSD Mapper also map metrics that already contain tags.
- If the retrieval of the AWS EC2 instance ID or hostname fails, previously-retrieved values are now sent, which should mitigate host aliases flapping issues in-app.
- Increase default timeout on AWS EC2 metadata endpoints, and make it configurable with ec2_metadata_timeout
- Add container incl./excl. lists support for ECS Fargate (process-agent)
- Adds support for a heap profile and cpu profile (of configurable length) to be created and included in the flare.
- Upgrade embedded Python 3 to 3.8.5. Link to Python 3.8 changelog: https://docs.python.org/3/whatsnew/3.8.html Note that the Python 2 version shipped in Agent v6 continues to be version 2.7.18 (unchanged).
- Upgrade pip to v20.1.1. Link to pip 20.1.1 changelog: https://pip.pypa.io/en/stable/news/#id54
- Upgrade pip-tools to v5.3.1. Link to pip-tools 5.3.1 changelog: https://github.com/jazzband/pip-tools/blob/master/CHANGELOG.md
- Introduces support for resolving pathFrom in File and Audit checks.
- On Windows, always add the user to the required groups during installation.
- APM: A series of changes to internal algorithms were made which reduced CPU usage between 20-40% based on throughput.
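The inverse calculation mentioned for system.mem.slab_reclaimable can be sketched as a simple subtraction. This is a minimal illustration, assuming both gauges are reported in the same unit; the function name and the sample values are hypothetical and not part of the Agent.

```python
def cached_without_sreclaimable(cached: float, slab_reclaimable: float) -> float:
    """Remove SReclaimable from an Agent 7.x system.mem.cached value.

    Both arguments are assumed to be in the same unit. The result is
    clamped at zero in case the two gauges were sampled at slightly
    different times.
    """
    return max(cached - slab_reclaimable, 0.0)


# Hypothetical sample values for the two gauges.
agent7_cached = 2048.0
slab_reclaimable = 256.0
print(cached_without_sreclaimable(agent7_cached, slab_reclaimable))
```

In practice the same subtraction would usually be written as a query over the two metrics rather than computed in code.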
Bug Fixes
- Allow integration commands to work for pre-release versions.
- [Windows] Ensure PYTHONPATH variable is ignored correctly when initializing the Python runtime.
- Enable listening for conntrack info from all namespaces in system probe
- Fix cases where the resolution of secrets in integration configs would not be performed for autodiscovered containers.
- Fixes submission of containers blkio metrics that may modify array after being already used by aggregator. Can cause missing tags on containerd.* metrics
- Restore support of JSON-formatted lists for configuration options passed as environment variables.
- Don't allow pressing the disable button on checks twice.
- Fix container_include_metrics support for all container checks
- Fix a bug where the Agent disables collecting tags when the cluster checks advanced dispatching is enabled in the Daemonset Agent.
- Fixes a bug where the ECS metadata endpoint V2 would get queried even though it was not configured with the configuration option cloud_provider_metadata.
- Fix a bug where the tagger did not update a kubernetes job that exited after some time, even though its state had changed.
- Fixes the Agent failing to start on sysvinit on systems with dpkg >= 1.19.3
- The agent was collecting docker container logs (metrics) even if they were matching DD_CONTAINER_EXCLUDE_LOGS (resp. DD_CONTAINER_EXCLUDE_METRICS) if they were started before the agent. This is now fixed.
- Fix a bug where the Agent would not remove tags for pods that no longer exist, potentially causing unbounded memory growth.
- Fix pidfile support on security-agent
- Fixed system-probe not working on CentOS/RHEL 8 due to our custom SELinux policy. We now install the custom policy only on CentOS/RHEL 7, where the system-probe is known not to work with the default. On other platforms the default will be used.
- Stop sending payload for Cloud Foundry applications containers that have no container_name tag attached to avoid them showing up in the UI with empty name.
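To illustrate what accepting JSON-formatted lists from environment variables can look like, here is a rough sketch. It is not the Agent's parser: the function name is hypothetical, and the fallback to whitespace-separated values is an assumption made for the example.

```python
import json
from typing import List


def parse_list_option(raw: str) -> List[str]:
    """Parse a list-valued option passed as an environment variable.

    Assumption for this sketch: a value starting with '[' is tried as a
    JSON array first; anything else is split on whitespace.
    """
    raw = raw.strip()
    if raw.startswith("["):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    return raw.split()


print(parse_list_option('["env:prod", "team:core"]'))
print(parse_list_option("env:prod team:core"))
```

Both calls yield the same two-element list, which is the behavior a user would expect when switching between the two notations.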
Other Notes
- APM: datadog.trace_agent.receiver.* metrics are now also tagged by endpoint_version
6.22.0
7.21.1
6.21.1
7.21.0
Prelude
Release on: 2020-07-16
- Please refer to the 7.21.0 tag on integrations-core for the list of changes on the Core Checks
Upgrade Notes
- APM: The maximum allowed payload size by the agent was increased from 10MB to 50MB. This could result in traffic increases for users which were affected by this issue.
- APM: The maximum connection limit over a 30s period was removed. This can result in an increase of tracing data for users that were affected by this limitation.
New Features
- Add support of new DatadogMetric CRD in DCA. Allows to autoscale based on any valid Datadog query.
- Add packages scripts for dogstatsd that have the same features as the agent: create symlink for binary, create dd-agent user and group, setup the service and cleanup those when uninstalling.
- Adds OOM Kill probe to ebpf package and corresponding corecheck to the agent.
- The Datadog IoT Agent is now available for 32 bit ARM architecture (armv7l/armhf).
- Add Compliance agent in Cluster Agent to monitor Kubernetes objects
- Add docker.cpu.limit and containerd.cpu.limit metrics, reporting maximum cpu time (hz or ns) available for each container based on their limits. (Only supported on Linux)
- Addition of a gRPC server and a hostname resolution endpoint, including a grpc-gateway that exposes said endpoint as a REST service.
- Adding a 'log_format_rfc3339' option to use the RFC3339 format for the log time.
- Compliance Agent implementing scheduling of compliance checks for Docker and Kubernetes benchmarks.
- Expose agent's sql obfuscation to python checks via new datadog_agent.obfuscate_sql method
- Support installing non-core integrations with the integration command, such as those located in the integrations-extras repository.
Enhancement Notes
- The Agent status command now includes the flavor of the Agent that is running.
- The Agent GUI now includes the flavor of the Agent that is running.
- Adds Tagger information to Datadog Agent flare for support investigations.
- Add a static collector in the tagger package for tags that do not change after pod start (such as those from an environment variable).
- Add autodiscovery_subnet to available SNMP template extra configs
- When enabling collect_ec2_tags or collect_gce_tags option, EC2/GCE tags are now cached to avoid missing tags when users exceed their AWS/GCE quotas.
- Chocolatey package can be installed on Domain Controller
- The Agent now collects the Availability Zone a Fargate Task (using platform version 1.4 or later) is running in as an "availability_zone" tag.
- Enabled the collection of the init-containers by bumping the agent-payload dep. and collecting the init-containers.
- The Agent now collects recommended "app.kubernetes.io" Kubernetes labels as tags by default, and exposes them under a "kube_app" prefix.
- Docker and Containerd checks now support filtering containers by kube_namespace.
- Add support for sampling to distribution metrics
- Flare now includes the permission information for parents of config and log file directories.
- Collect processes namespaced PID.
- You can now enable or disable the dogstatsd-stats troubleshooting feature at runtime using the config set dogstatsd_stats command of the Agent.
- API Keys are now sanitized for logs_config and additional_endpoints.
- Upgrade gosnmp to support more authentication and privacy protocols for v3 connections.
- Use the standard tag 'service' as a log collection attribute for container's logs collected from both kubernetes and docker log sources.
- agent check returns non zero exit code when trace malloc is enabled (tracemalloc_debug: true) when using python 2
- Added the checksum type to the checksum key itself, as it is deprecated to have a separate checksum_type key.
- Add lowercase_device_tag option to the system io core check on Windows. When enabled, sends metrics with a lowercased device tag, which is consistent with the system.io.* metrics of Agent v5 and the system.disk.* metrics of all Agent versions.
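As an illustration of sampling on distribution metrics, the sketch below builds a DogStatsD-style datagram with a sample rate. The release note does not spell out the wire format; this assumes the usual `<name>:<value>|<type>|@<rate>|#<tags>` layout with `d` as the distribution type. The helper name and metric name are hypothetical.

```python
from typing import List, Optional


def distribution_datagram(
    name: str,
    value: float,
    sample_rate: float = 1.0,
    tags: Optional[List[str]] = None,
) -> str:
    """Build a distribution datagram, adding '@rate' only when sampling."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    parts = [f"{name}:{value}", "d"]  # 'd' marks a distribution metric
    if sample_rate < 1:
        parts.append(f"@{sample_rate}")
    if tags:
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


print(distribution_datagram("request.latency", 12.5, 0.5, ["env:prod"]))
# prints "request.latency:12.5|d|@0.5|#env:prod"
```

A client that sends only half of its samples would pass 0.5 so that the receiving side can account for the dropped values.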
Bug Fixes
- Fix missing values from cluster-agent status command.
- Add missing device_name tag in iostats_pdh
- Fixes an issue where DD_TAGS were not applied to EKS Fargate pods and containers.
- Add freetds linux dep needed for SQL Server to run in Docker Agent.
- APM: Fix parsing of non-ASCII numerals in the SQL obfuscator. Previously unicode characters for which unicode.IsDigit returns true could cause a hang in the SQL obfuscator
- APM: correctly obfuscate AUTH command.
- Dogstatsd standalone: when running on a systemd-based system, do not stop Dogstatsd when journald is stopped or restarted.
- Fix missing logs and metrics for docker-labels based autodiscovery configs after container restart.
- Fix bugs introduced in 7.20.0/6.20.0 in the Agent 5 configuration import command: the command would not import some Agent config settings, including api_key, and would write some Docker & Kubernetes config settings to wrongly-located files.
- Fixes tag extraction from Kubernetes pod labels when using patterns on certain non-alphanumeric label names (e.g. app.kubernetes.io/managed-by).
- Fixes the /ready health endpoint on the cluster-agent. The /ready health endpoint was reporting 200 at the cluster-agent startup and was then permanently reporting 500 even though the cluster-agent was experiencing no problem. In the body of the response, we could see that a healthcheck component was failing. This change fixes this issue.
- This fix aims to cover the case when the agent is running inside GKE with workload identity enabled. If workload identity is enabled, access to /instance/name is forbidden, resulting in an empty host alias.
- Fix hostname resolution issue preventing the Process and APM agents from picking up a valid hostname on some containerized environments
- Fix a bug which causes certain configuration options to be ignored by the process-agent in the presence of a system-probe.yaml.
- Process agent and system probe now correctly accept multiple API keys per endpoint.
- The device_name tag is not used anymore to populate the Device field of a series. Only the device tag is considered.
- Fixes problem on Windows where ddagentuser home directory is left behind.
- Revert upgrade of GoSNMP and addition of extra authentication protocols.
- Add support for examining processes inside Docker containers running under systemd cgroups. This also reduces agent logging volume as it's able to capture those statistics going forward.
- APM: The agent now exits with code 0 when the API key is not specified. This is so to prevent the Windows SCM from restarting the process.
Other Notes
- All Agents binaries are now compiled with Go 1.13.11.
- In Debug mode, DogStatsD logs a warning message when a burst of metrics is detected.
- JMXFetch upgraded to 0.38.0
- JQuery, used in the web-based agent GUI, has been upgraded to 3.5.1
6.21.0
6.21.0 ships the same features as 7.21.0 except for the Python versions it supports.
Please refer to the 7.21.0 changelog.
7.20.2
Prelude
Release on: 2020-06-16
- Please refer to the 7.20.2 tag on integrations-core for the list of changes on the Core Checks
7.20.1
Prelude
Release on: 2020-06-11
- Please refer to the 7.20.1 tag on integrations-core for the list of changes on the Core Checks
7.20.0
Prelude
Release on: 2020-06-09
- Please refer to the 7.20.0 tag on integrations-core for the list of changes on the Core Checks
New Features
- Pod and container tags autodiscovered via pod annotations now support multiple values for the same key.
- Install script creates install_info report
- Agent detects install_info report and sends it through Host metadata
- Adding logic to get standard service tag from Pod Metadata Labels.
- APM: A new endpoint was added which helps augment and forward profiles to Datadog's intake.
- APM: Information about APM is now included in the agent's status output (both in the GUI and in the 'agent status' command).
- Introducing the 'cloud_provider_metadata' option in the Agent configuration to restrict which cloud provider metadata endpoints will be queried.
- Add collector for Garden containers running applications in CloudFoundry environment to view them in the live container list and container map.
- JMXFetch (helper for JMX checks) is now restarted if it crashes on Windows.
- Add scaffold for security/compliance agent CLI.
- container_exclude_metrics and container_include_metrics can be used to filter metrics collection for autodiscovered containers. container_exclude_logs and container_include_logs can be used to filter logs collection for autodiscovered containers.
- Support SNMP autodiscovery via a new configuration listener, with new template variables.
- Support Tencent Cloud provider.
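The multi-value tag support for pod annotations can be pictured with a small sketch. It assumes, for illustration only, that the annotation value is a JSON object whose values are either a single string or a list of strings; the function name and the sample annotation are hypothetical.

```python
import json
from typing import List


def expand_tag_annotation(annotation_value: str) -> List[str]:
    """Expand a JSON tag annotation into 'key:value' tags.

    A key mapped to a list produces one tag per list element; a key
    mapped to a single string produces exactly one tag.
    """
    tags: List[str] = []
    for key, value in json.loads(annotation_value).items():
        values = value if isinstance(value, list) else [value]
        tags.extend(f"{key}:{item}" for item in values)
    return tags


print(expand_tag_annotation('{"team": ["checkout", "payments"], "env": "prod"}'))
```

Here the team key contributes two tags and the env key contributes one.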
Enhancement Notes
- When installing the Agent using Chocolatey, information about the installation is saved for diagnostic and telemetry purposes.
- The Agent's flare now includes information about the method used to install the Agent.
- Ignore AKS pause containers hosted in the aksrepos.azurecr.io container registry.
- On Linux and MacOS, add a new device_name tag on IOstats and disk checks.
- Windows installer can use the command line key HOSTNAME_FQDN_ENABLED to set the config value of hostname_fqdn.
- Add missing device_name tags on docker, containerd and network checks. Make series manage device_name tag if device is missing.
- Support custom tagging of docker container data via an autodiscovery "tags" label key.
- Improved performance in metric aggregation logic. Use 64-bit context keys instead of 128-bit keys: they are faster as map keys (fast path methods) and faster to hash thanks to inlining.
- Count of DNS responses with error codes are tracked for each connection.
- Latency of successful and failed DNS queries are tracked for each connection. Queries that time out are also tracked separately.
- Enrich dogstatsd metrics with task_arn tag if DD_DOGSTATSD_TAG_CARDINALITY=orchestrator.
- More pause containers from ecr, gcr and mcr are excluded automatically by the Agent.
- Improve cluster name auto-detection on Azure AKS.
- APM: Improve connection reuse with HTTP keep-alive in trace agent.
- Increase default timeout to collect metadata from GCE endpoint.
- Use insertion sort in the aggregator context keys generator as it provides better performance than the selection sort. In cases where the insertion sort was already used, improved the threshold that selects between it and the Go stdlib sort.
- Expose distinct endpoints for liveness and readiness probes.
  - The liveness probe (/live) fails in case of unrecoverable errors that deserve an agent restart. (Ex.: goroutine deadlock or premature exit)
  - The readiness probe (/ready) fails in case of recoverable errors or errors for which an agent restart would do more harm than good (Ex.: the forwarder fails to connect to DataDog)
- Exclude automatically pause containers for OpenShift, EKS and AKS Windows
- Introduce kube_cluster_name and ecs_cluster_name tags in addition to cluster_name. Add the possibility to stop sending the cluster_name tag using the parameter disable_cluster_name_tag_key in Agent config. The Agent keeps sending kube_cluster_name and ecs_cluster_name tags regardless of disable_cluster_name_tag_key.
- Configure additional process and orchestrator endpoints by environment variable.
- The process agent can be configured to collect containers from multiple sources (e.g. kubelet and docker simultaneously).
- Upgrading the embedded Python 2 to the latest, and final, 2.7.18 release.
- Improve performance of system-probe conntracker.
- Throttle netlink socket on workloads with high connection churn.
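The note about choosing between insertion sort and the standard library sort describes a common hybrid pattern: short inputs go through insertion sort, longer ones through the general-purpose sort. The sketch below shows that pattern in Python rather than the Agent's Go code; the threshold value and function names are hypothetical and not taken from the Agent.

```python
from typing import List

# Hypothetical cutoff; the Agent's actual threshold is not stated in the notes.
INSERTION_SORT_THRESHOLD = 16


def insertion_sort(items: List[str]) -> None:
    """Sort items in place by shifting larger elements one slot right."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current


def sort_tags(tags: List[str]) -> None:
    """Sort tags in place, picking the algorithm by input length."""
    if len(tags) <= INSERTION_SORT_THRESHOLD:
        insertion_sort(tags)
    else:
        tags.sort()


tags = ["env:prod", "az:us-east-1a", "service:web"]
sort_tags(tags)
print(tags)
```

Insertion sort has low constant overhead on short lists, which is why such a cutoff can pay off when most inputs are small tag sets.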
Deprecation Notes
- container_exclude replaces ac_exclude. container_include replaces ac_include. ac_exclude and ac_include will keep being supported but the Agent ignores them in favor of container_exclude and container_include if they're defined.
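The precedence rule between the new and the deprecated options can be sketched as follows. This is an illustration, not the Agent's implementation: the function name is hypothetical, the configuration is modeled as a plain dictionary, and a non-empty value is assumed to count as "defined".

```python
from typing import Dict, List


def effective_filter(config: Dict[str, List[str]], new_key: str, legacy_key: str) -> List[str]:
    """Return the new option's value when defined, else the legacy one.

    Assumption for this sketch: an option is "defined" when it is
    present and non-empty.
    """
    new_value = config.get(new_key)
    if new_value:
        return list(new_value)
    return list(config.get(legacy_key) or [])


config = {
    "ac_exclude": ["image:legacy-image"],
    "container_exclude": ["image:new-image", "name:sidecar.*"],
}
print(effective_filter(config, "container_exclude", "ac_exclude"))
```

With both options set, only the container_exclude entries are returned, matching the note that the ac_* values are ignored once their replacements are defined.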
Bug Fixes
- APM: Fix a small programming error causing the "superfluous response.WriteHeader call" warning.
- Fixes missing container stats in ECS Fargate 1.4.0.
- Ensure Python checks are always garbage-collected after they're unscheduled by AutoDiscovery.
- Fix for autodiscovered checks not being rescheduled after container restart.
- On Windows, fix calculation of the system.swap.pct_free metric.
- Fix a bug in the file tailer on Windows where the log-agent would keep a lock on the file preventing users from renaming it.
Other Notes
- Upgrade embedded ntplib to 0.3.4
- JMXFetch upgraded to 0.36.2
- Rebranded puppy agent as iot-agent.
6.20.0
6.20.0 ships the same features as 7.20.0 except for the Python versions it supports.
Please refer to the 7.20.0 changelog.