
Releases: DataDog/datadog-agent

6.13.0

24 Jul 16:28
df8e880

Prelude

Released on: 2019-07-24

Upgrade Notes

  • The port option in the NTP check configuration is now parsed as an integer instead of a string.

New Features

  • APM: add support for Unix Domain Sockets by means of the apm_config.receiver_socket configuration. It is off by default. When set, it must point to a valid sock file.
  • APM: API emitted metrics now have a lang_vendor tag when the Datadog-Meta-Lang-Vendor HTTP header is sent by clients.
  • APM: Resource-based rate limiting in the API can now be completely disabled by setting apm_config.max_memory and/or apm_config.max_cpu_percent to the value 0.
  • Add support for environment variables in checks' config files using the format "%%env_XXXX%%".
  • Add new systemd integration to monitor systemd itself and the units managed by systemd.
  • The total number of bytes received by dogstatsd is now reported by the dogstatsd-udp/Bytes and dogstatsd-uds/Bytes expvar.
  • Adds the ability to use DD_TAGS to set global tags in Fargate.
  • Added support for the new pod log directory pattern introduced in Kubernetes 1.14, so that the agent keeps collecting logs after a Kubernetes cluster is upgraded.
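
The "%%env_XXXX%%" format mentioned in the list above can be pictured as a simple text substitution. The following is a minimal sketch, not the Agent's implementation: the function name `resolve_env_templates` and the choice to replace unset variables with an empty string are assumptions made for illustration.

```python
import os
import re

# Matches placeholders such as %%env_DB_PASSWORD%%; the group captures the variable name.
_ENV_TEMPLATE = re.compile(r"%%env_([A-Za-z0-9_]+)%%")


def resolve_env_templates(text, environ=None):
    """Replace each %%env_NAME%% placeholder with the value of NAME.

    Unset variables become an empty string here; that is an assumption of
    this sketch, not documented Agent behavior.
    """
    env = os.environ if environ is None else environ
    return _ENV_TEMPLATE.sub(lambda m: env.get(m.group(1), ""), text)


# Example with an explicit mapping instead of the real process environment.
config_line = "password: %%env_DB_PASSWORD%%"
resolved = resolve_env_templates(config_line, {"DB_PASSWORD": "s3cret"})
print(resolved)
```

Passing the environment as a parameter keeps the sketch easy to try without touching real environment variables.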

Enhancement Notes

  • Add a kube_cronjob tag in the tagger. It applies to container metrics, autodiscovery metrics and logs.
  • Change the prefix of entity IDs to make it easier to query the tagger without knowing what the container runtime is.
  • APM: reduce memory usage in high traffic by up to 10x.
  • APM: Services are no longer aggregated in the agent, nor written to the Datadog API. Instead, they are now automatically extracted on the backend based on the received traces.
  • APM: The default interval at which the agent watches its resource usage has been reduced from 20s to 10s.
  • APM: Improved processing concurrency and as a result, CPU usage decreased by 20% in some scenarios.
  • APM: Queued sender was rewritten to improve performance around scenarios where network problems are present.
  • APM: Code clean up around configuration and writer.
  • The datadog-agent version command now prints the version of Golang the agent was compiled with.
  • Display Go version in output of status command
  • Upgraded JMXFetch to 0.30.0. See https://github.com/DataDog/jmxfetch/releases/tag/0.30.0
  • APM: the trace agent now lets through a wider variety of traces, automatically correcting some malformed traces instead of dropping them. The following fields are now replaced with reasonable defaults if invalid or empty, and truncated if they exceed the maximum length: span.service, span.name, span.resource, span.type. span.duration=0 is now allowed. A missing span start date now defaults to now - duration. The datadog.trace_agent.receiver.traces_dropped metric is now tagged with a reason tag explaining why the trace was dropped. There is a new datadog.trace_agent.receiver.spans_malformed metric, also tagged by reason, explaining how the span was malformed.
  • Refactored permissions check in the integration command.
  • Support Python 3 for the integration command.
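
The "correct instead of drop" behavior described for malformed traces in the list above could look roughly like the sketch below. It is only an illustration: the default values, the `MAX_LEN` limit and the dict-based span are hypothetical, and the sketch assumes a missing start is back-dated to the current time minus the duration.

```python
import time

# Hypothetical defaults and length limit, chosen only for illustration.
DEFAULTS = {
    "service": "unnamed-service",
    "name": "unnamed-operation",
    "resource": "unnamed-resource",
    "type": "custom",
}
MAX_LEN = 100


def normalize_span(span, now_ns=None):
    """Return a corrected copy of a span dict instead of rejecting it."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    fixed = dict(span)
    for field, default in DEFAULTS.items():
        value = fixed.get(field)
        if not isinstance(value, str) or not value:
            fixed[field] = default          # invalid or empty: use a default
        else:
            fixed[field] = value[:MAX_LEN]  # too long: truncate
    duration = fixed.get("duration", 0)     # a duration of 0 is accepted
    fixed["duration"] = duration
    if not fixed.get("start"):
        fixed["start"] = now_ns - duration  # missing start: now minus duration
    return fixed


span = normalize_span({"service": "", "name": "x" * 200, "duration": 5}, now_ns=1000)
print(span["service"], len(span["name"]), span["start"])
```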

Deprecation Notes

  • APM: The presampler has been rebranded as a "rate limiter" to avoid confusing it with other sampling mechanisms.
  • APM: The datadog.trace_agent.presampler_rate metric has been deprecated in favor of datadog.trace_agent.receiver.ratelimit.

Security Issues

  • On Windows, quote the service name when registering service. Mitigates CVE-2014-5455. Note that since the Agent is not running as admin, even a successful attack would not give admin rights as specified in the CVE.

Bug Fixes

  • Fix the tagger behavior returning None when no tags are present for the kubelet and fargate integration.
  • APM: metrics generated by the processing function (such as *.traces_priority) now contain language specific tags.
  • APM: Memory spikes when retry queue grows have been fixed.
  • Fix the 'vcruntime140.dll is being held in use by the following process' error.
  • System-probe s6 services: ensure that the system-probe binary is bundled before trying to run it / stop it. This is to ensure that the s6-services definitions will be backward compatible with older builds that didn't have the system-probe yet.
  • Fix a bug in the log scanning logic of the JMXFetch wrapper that would make JMXFetch hang if it logged a very large log entry.
  • Fixed an issue where logs collected from Kubernetes using '/var/log/pods' would show up in the wrong format '{"log":"x","stream":"y","time":"z"}' in the logs explorer when using Docker as the container runtime.
  • Fix a TLS connection handshake that could hang forever, blocking the whole logs pipeline so that logs were no longer tailed and file descriptors were not closed.
  • On Windows, fixes a bug in which the Agent can't start if the Go runtime can't determine the ddagentuser's profile directory. This information isn't used, so it shouldn't cause a failure.
  • The External Metrics Setter no longer stops trying to get metrics after 3 failed attempts. Instead, it will retry indefinitely.
  • Removes an unused duplicate copy of the system-probe binary from the Linux packages
  • The NTP check now properly uses the port configuration option.

Other Notes

  • Logs informing about check runs and payload submission are now displayed once every 500 events instead of every 20 events.
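
Logging once every N events, as in the note above, is commonly done with a counter and a modulo test. The sketch below illustrates that idea only; the `SampledLogger` class is hypothetical, and whether the Agent also logs the very first events is not stated here.

```python
class SampledLogger:
    """Emit one message out of every `every` events."""

    def __init__(self, every=500, sink=print):
        self.every = every
        self.sink = sink    # callable that receives the formatted message
        self.count = 0

    def event(self, message):
        self.count += 1
        if self.count % self.every == 0:
            self.sink(f"{message} (event #{self.count})")


# Collect the emitted lines in a list instead of printing them.
lines = []
logger = SampledLogger(every=500, sink=lines.append)
for _ in range(1200):
    logger.event("check ran")
print(lines)
```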

6.12.2

04 Jul 11:35
e6fdde2

Prelude

Release on: 2019-07-03

This release is only available on Windows and contains all the changes introduced in 6.12.0 and 6.12.1.

6.12.1

28 Jun 22:41

Prelude

Release on: 2019-06-28

This release is not available on Windows.

Bug Fixes

  • Fixed a bug in the kubelet and fargate integrations preventing the collection of the kubernetes.cpu.* and kubernetes.memory.* metrics.

6.12.0

27 Jun 21:41
84d39e6

Known Issues

Some metrics from the kubernetes and kubelet integrations (kubernetes.cpu.* and kubernetes.memory.*) are missing for certain configurations.
A fix will be released in v6.12.1. Meanwhile if downgrading to 6.11.3 is not an option we recommend using the runtime metrics (ex: docker.cpu.*, docker.mem.*, containerd.cpu.*, ...).

Prelude

Release on: 2019-06-26

This release is not available on Windows.

  • Please refer to the 6.12.0 tag on integrations-core for the list of changes on the Core Checks.

Upgrade Notes

  • APM: Log throttling is now automatically enabled by default when
    log_level differs from debug. A maximum of 10 error messages
    every 10 seconds will be displayed. If you had it enabled before,
    it can now be removed from the config file.

  • On Windows, the path of the embedded python.exe binary has changed from %ProgramFiles%\Datadog\Datadog Agent\embedded\python.exe to %ProgramFiles%\Datadog\Datadog Agent\embedded2\python.exe. If you use this path from your provisioning scripts, please update it accordingly.
    Note: on Windows, to call the embedded pip directly, please use %ProgramFiles%\Datadog\Datadog Agent\embedded2\python.exe -m pip.

  • Logs: Breaking change for Kubernetes log collection. Version 6.11.2 added logic in the Agent to look first for K8s container log files in /var/log/pods and to fall back to the Docker socket when they were not available.
    This created permission issues because /var/log/pods can be a symlink in some configurations, in which case the Agent also needs access to the directory the symlink points to.

    This logic is reverted to its prior behaviour, which prioritises the Docker socket for container log collection.
    It is still possible to force the Agent to use the K8s log files even if the Docker socket is mounted, by setting the logs_config.k8s_container_use_file parameter or the DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE environment variable.
    This is recommended when more than 10 containers are running on the same pod.
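
The APM log throttling described in the first note of this list (10 error messages every 10 seconds) can be approximated with a fixed-window counter. This is a sketch under stated assumptions, not the trace-agent's code: the `ErrorThrottle` class, its parameters and the fixed-window strategy are all hypothetical.

```python
import time


class ErrorThrottle:
    """Allow at most `limit` messages per `interval` seconds."""

    def __init__(self, limit=10, interval=10.0, clock=time.monotonic):
        self.limit = limit
        self.interval = interval
        self.clock = clock              # injectable so the sketch can be tested
        self.window_start = clock()
        self.emitted = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.interval:
            # A new window begins: reset the counter.
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.limit:
            self.emitted += 1
            return True
        return False


throttle = ErrorThrottle()
shown = sum(1 for _ in range(25) if throttle.allow())
print(f"{shown} of 25 messages shown")
```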

New Features

  • A count named datadog.agent.started is now sent with a value of 1 when the agent starts.

  • APM: Maximum allowed CPU percentage usage is now
    configurable via DD_APM_MAX_CPU_PERCENT.

  • Node Agent can now perform checks on kubernetes service endpoints.
    It consumes the check configs from the Cluster Agent API via the
    endpointschecks config provider.
    Versions 1.3.0+ of the Cluster Agent are required for this feature.

  • Logs can now be collected from init and stopped containers (possibly short-lived).

  • Allow tracking pod labels and annotations value change to update labels/annotations_as_tags.
    Make the explicit tagging feature dynamic (introduced in #3024).

Enhancement Notes

  • APM: the writer will now flush based on an estimated number of bytes
    in accumulated buffer size, as opposed to a maximum number of spans.

  • APM: traces are no longer dropped because of rate limiting due to
    performance issues. Instead, the trace is kept in a queue, waiting to
    be processed.

  • Log the Docker container ID at DEBUG level when parsing an invalid Docker log.

  • Set the User-Agent string to include the agent name and version string.

  • Adds host tags in the Hostname section of the
    agent status command and the status tab of the GUI.

  • Expose the number of logs processed and sent to the agent status

  • Added a warning message on agent status command and status gui
    tab when ntp offset is too large and may result in metrics
    ignored by Datadog.

  • APM: minor improvements to CPU performance.

  • APM: improved trace writer performance by introducing concurrent writing.

  • APM: the stats writer now writes concurrently to the Datadog API, improving resource usage and processing speed of the trace-agent.

  • Extends the docker check to accommodate the kernel memory usage metric.
    This metric shows the cgroup current kernel memory allocation.

  • Ask confirmation before overwriting the output file while using
    the dogstatsd-stats command.

  • Do not ship autotools within the Agent package.

  • The datadog-agent integration subcommand is now capable of installing prereleases of official integration wheels

  • Upgraded JMXFetch to 0.29.1. See https://github.com/DataDog/jmxfetch/releases/tag/0.28.0,
    https://github.com/DataDog/jmxfetch/releases/tag/0.29.0 and
    https://github.com/DataDog/jmxfetch/releases/tag/0.29.1

  • Added validity checks to NTP responses

  • Allow the '--check_period' flag of jmxfetch to be overridden by the
    DD_JMX_CHECK_PERIOD environment variable.

  • Ship integrations and their dependencies on Python 3 in Omnibus.

  • Added a warning about unknown keys in datadog.yaml.
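
Flushing on an estimated byte size rather than on a span count, as described for the APM writer in the list above, can be sketched as follows. The `ByteBoundedBuffer` class and the JSON-based size estimate are assumptions made for illustration; the writer's real estimation method is not described here.

```python
import json


class ByteBoundedBuffer:
    """Accumulate items and flush when their estimated size would exceed a limit."""

    def __init__(self, max_bytes, flush):
        self.max_bytes = max_bytes
        self.flush = flush      # callback receiving the list of buffered items
        self.items = []
        self.size = 0

    def add(self, item):
        # Estimate the item's size from its JSON encoding (an assumption of this sketch).
        estimate = len(json.dumps(item).encode("utf-8"))
        if self.items and self.size + estimate > self.max_bytes:
            self._flush()
        self.items.append(item)
        self.size += estimate

    def _flush(self):
        self.flush(self.items)
        self.items = []
        self.size = 0


payloads = []
buffer = ByteBoundedBuffer(max_bytes=40, flush=payloads.append)
for i in range(5):
    buffer.add({"span": i})
print(len(payloads), "payload(s) flushed so far")
```

The number of items per payload now varies with item size, which is the point of the change: large spans produce smaller batches and small spans larger ones.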

Deprecation Notes

  • APM: the yaml setting apm_config.trace_writer.max_spans_per_payload
    is no longer in use; writes are now based solely on accumulated byte
    size.

Bug Fixes

  • Updated the DataDog/gopsutil library to include changes related to excessive DEBUG logging in the process agent

  • The computeMem is only called in the check when we ensure that it does not get passed with an empty pointer.
    But if someone was to reuse it without checking for the nil pointer it could cause a segfault.
    This PR moves the nil checking logic inside the function to ensure it is safe.

  • APM: Fixed a bug where normalize tag would not truncate tags correctly
    in some situations.

  • APM: Fixed a small issue with normalizing tags that contained the
    unicode replacement character.

  • APM: fixed a bug where modulo operators caused SQL obfuscation to fail.

  • Fix issue on process agent for DD_PROCESS_AGENT_ENABLED where 'false' did not turn off process/container collection.

  • Fix an error when adding a custom check config through the GUI
    when the folder where the config will reside does not
    exist yet.

  • APM: on macOS, trace-agent is now enabled by default, and, similarly to other
    platforms, can be enabled/disabled with the apm_config.enabled config setting
    or the DD_APM_ENABLED env var

  • Fix a bug where a misconfigured log agent would temporarily hog resources after being killed.

  • Fix a potential crash when doing a configcheck while the agent was not properly initialized yet.

  • Fix a crash that could occur when having trouble connecting to the Kubelet.

  • Fix nil pointer access for container without memory cgroups.

  • Improved credentials scrubbing logic.

  • The datadog-agent integration show subcommand now properly accepts only Datadog integrations as argument

  • Fix incorrectly reported IO metrics when OS counters wrap in Linux.

  • Fixed JMXFetch process not being terminated on Windows in certain cases.

  • Empty logs could appear when collecting Docker logs in addition
    to the actual container logs. This was due to the way the Agent
    handles the header Docker adds to the logs. The process has been
    changed to make sure that no empty logs are generated.

  • Fix a bug where, when a Docker container terminates, its last logs
    are missing and only partially recovered after a restart.

  • Properly move configuration files for wheels installed locally via the integration command.

  • Reduced memory usage of the flare command

  • Use a custom patch for a costly regex in PyYAML,
    see yaml/pyyaml#301.

  • On Windows, restore the system.mem.pagefile.pct_free metric
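
The fix for IO metrics when OS counters wrap, listed above, concerns a classic problem: a naive `current - previous` goes negative once a fixed-width counter overflows. A minimal sketch of wrap-aware subtraction follows; the function name and the 32-bit default width are assumptions, and it handles a single wrap between two readings.

```python
def counter_delta(previous, current, width_bits=32):
    """Difference between two readings of a counter that wraps at 2**width_bits."""
    modulus = 1 << width_bits
    # Modular subtraction yields the right delta across a single wrap.
    return (current - previous) % modulus


print(counter_delta(100, 150))          # no wrap between the readings
print(counter_delta(2**32 - 10, 5))     # the counter wrapped between the readings
```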

Other Notes

  • The 'integration freeze' cli subcommand now only
    displays datadog packages instead of the complete
    result of the 'pip freeze' command.

6.11.3 / 2019-06-04

04 Jun 09:13

6.11.3

Prelude

Release on: 2019-06-04

  • Please refer to the 6.11.3 tag on process-agent (https://github.com/DataDog/datadog-process-agent/releases/tag/6.11.3) for the list of changes on the Process Agent.

Upgrade Notes

  • Upgrade JMXFetch to 0.27.1

Bug Fixes

  • APM: fixed a bug where secrets in environment variables were ignored.

6.11.2 / 2019-05-23

23 May 10:19

6.11.2

Prelude

Release on: 2019-05-23

Enhancement Notes

  • Add option cf_os_hostname_aliasing to send the OS hostname as an alias when using the BOSH agent on Cloud Foundry.

Bug Fixes

  • Fixes problem in which Windows Agent wouldn't install on non-English machines due to assumption that "Performance Monitor Users" didn't need to be localized.
  • Windows Installer is now more resilient to missing domain controller.

6.11.1 / 2019-05-06

06 May 09:14

6.11.1

Release on: 2019-05-06

Upgrade Notes

  • Change the prioritization between the two mechanisms used to collect logs on Kubernetes.
    The Agent now first attempts to collect logs from '/var/log/pods' and falls back to using the Docker socket if that initialization fails.

Bug Fixes

  • Fix a bug where the short image name wouldn't be properly set on old Docker versions.
  • Properly handle Docker container logs in multiline mode in case of infrequent log messages, log file rotations or agent restarts.

6.11.0

17 Apr 14:39
6985f35

Important: 6.11.0 is not marked as latest for Windows: we are investigating some cases where 6.11.0 is not installing correctly on Windows. Downloading datadog-agent-6-latest.amd64.msi will give you version 6.10.1.

Prelude

Release on: 2019-04-17

Upgrade Notes

  • APM: move flush notifications from level "INFO" to "DEBUG"

  • APM: logging format has been changed to match the format of the core agent.

  • Metrics coming through dogstatsd with the following internal prefixes: activemq, activemq_58, cassandra, jvm, presto, solr, tomcat, kafka, datadog.trace_agent, datadog.process, datadog.agent, datadog.dogstatsd are no longer affected by the statsd_metric_namespace option.

  • Removed the internal ability to send logs to a specific logset at agent level.

  • On Windows, the Datadog Agent now runs as a non-privileged user (ddagentuser by default) rather than LOCAL_SYSTEM. Please refer to our dedicated docs for more information

  • The Windows installer will no longer allow direct downgrades; if a downgrade is required, the user must uninstall the newer version and install the older version.

New Features

  • The Secrets beta feature is now available on Windows, allowing users to pull secrets from secret management services.

  • APM: JSON logging is now supported using the log_format_json: true setting.

  • Collect container thread count and thread limit

  • JMXFetch upgraded to 0.27.0. See https://github.com/DataDog/jmxfetch/releases/tag/0.27.0 for more details.

  • The agent now ignores pods that exited more than 15 minutes ago to reduce its resource footprint when pods are not garbage-collected.
    This is configurable with the kubernetes_pod_expiration_duration option.

  • Now support CRI-O container runtime for log collection on Kubernetes.

  • Automatically add a "dirname" tag representing the directory of logs tailed from a wildcard path.

Enhancement Notes

  • AutoDiscovery can now monitor unready pods.
    It looks for a new pod annotation "ad.datadoghq.com/tolerate-unready" which, if set to true will make AutoDiscovery monitor that pod regardless of its readiness state.

  • Add the ability for the datadog-agent check command to have Python checks start an interactive debugging session.

  • Change the logging format to include the name of the logging agent instead of appending it in the agent container logs.

  • Add /metrics to the bare endpoints the agent can access.
    This is required to support querying endpoints protected by RBAC, by kube-rbac-proxy for instance.

  • APM: errors reported by the receiver's HTTP server are now shown in the logs.

  • APM: slightly improved normalization error logs.

  • On Windows, allows Agent to be installed to nonstandard directories.
    Uses APPLICATIONDATADIRECTORY to set the root of the configuration file tree, and PROJECTLOCATION to set the root of the binary tree. Please refer to the docs for more details

  • To decrease the number of requests to the DCA API, the Agent now uses a different API endpoint that lets it retrieve the Pods metadata with a single call.

  • Host metadata payloads are now zlib-compressed

  • Log file size and number of rotations are now configurable.

  • Add a command dogstatsd-stats to the agent to get basic stats about the processed metrics.

  • Support JSON arrays within environment variables, in addition to space separated values.

  • On Google Compute Engine, the Agent now reports <instance_name>.<project_id> as a host alias instead of <hostname_prefix>.<prefix_id>, which improves the uniqueness and relevance of the host alias when the GCE instance has a custom hostname.

  • The import command doesn't stop anymore when there is no conf.d or auto_conf directory.

  • Kubernetes event collection timeout can now be configured.

  • Improve status page by splitting errors and warnings from the Logs agent

  • Secrets are no longer decrypted in agent commands that don't need them (commands like hostname, launchgui, configuration ...). This reduces the number of times the 'secret_command_backend' executable is called.

  • Improved memory efficiency on hosts sending very high numbers of metrics.

  • Resolve the DNS name given by Docker once and try the associated IP to reach the kubelet.
    Prioritize HTTPS over HTTP to connect to the kubelet.
    Prioritize communication using IPs over hostnames to spare DNS servers across the cluster.
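
Accepting either a JSON array or space-separated values in an environment variable, as noted in the list above, amounts to trying one syntax before the other. The sketch below shows one plausible way to do so; the `parse_list_env` helper and its detection rule (a leading "[") are assumptions, not the Agent's parser.

```python
import json


def parse_list_env(value):
    """Parse a list from either a JSON array or a space-separated string."""
    text = value.strip()
    if text.startswith("["):
        # JSON syntax lets individual items contain spaces.
        return [str(item) for item in json.loads(text)]
    return text.split()


print(parse_list_env('["env:prod", "team:a b"]'))
print(parse_list_env("env:prod team:core"))
```

The JSON form is what makes values with embedded spaces expressible, which the space-separated form cannot do.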

Deprecation Notes

  • Removal of largely unused go SNMP check. SNMP support still provided by the python variant.

Bug Fixes

  • Fix an auto-discovery annotation value parsing limitation in version 6 compared to version 5.
    Now, ad.datadoghq.com/*.instances annotation key supports value like [[{"foo":"bar1"}, {"foo":"bar2"}], {"name":"bar3"}]

  • The agent container will now output valid JSON when using JSON log format.

  • APM: Multiple value "Content-Type" headers are now parsed correctly for media type in the HTTP receiver.

  • APM: always reply with correct Content-Type in API responses.

  • APM: when a span's resource is empty, the error "Resource can not be empty" will be returned instead of the wrong "Resource is invalid UTF-8".

  • APM: sensitive information is now scrubbed from logs.

  • APM: Fix issue with --version flag when API key is unset.

  • APM: Ensure UTF-8 characters are not cut mid-way when truncating
    span fields.

  • Metrics coming through dogstatsd with the following internal prefixes: activemq, activemq_58, cassandra, jvm, presto, solr, tomcat, kafka, datadog.trace_agent, datadog.process, datadog.agent, datadog.dogstatsd are no longer affected by the statsd_metric_namespace option.

  • Fixes ec2 tags collection when datadog agent is deployed into a kubernetes cluster along with kube2iam.

  • Fixes bug in which upgrading from agent5 doesn't correctly import the configuration

  • Fix a race condition in gohai that could make the Agent crash while collecting the host's filesystem metadata

  • Hostnames containing characters that are invalid for a filename no longer prevent the agent from generating a flare.

  • Allow macOS users to invoke the datadog-agent integration command as root since the installation directory is owned by root.

  • Change to a randomized exponential backoff in case of connection failure

  • Ignore empty logs_dd_url to fall back on default config for logs agent.

  • Detect and handle Docker logs with only header and empty content

  • To mitigate issues with the hostname detection on AKS, hostnames gathered from the metadata endpoints of AWS, GCE, Azure, and Alibaba cloud are no longer considered valid if their length exceeds 255 characters.
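
The fix above about not cutting UTF-8 characters mid-way when truncating span fields can be illustrated with a byte-limited truncation that discards a trailing partial character. This is a sketch of the general technique only; the function name `truncate_utf8` is hypothetical and the trace-agent itself is not written in Python.

```python
def truncate_utf8(text, max_bytes):
    """Truncate text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete trailing byte sequence, if any.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# "\u00e9" (e with acute accent) takes two bytes in UTF-8.
print(truncate_utf8("h\u00e9llo", 2))
print(truncate_utf8("h\u00e9llo", 3))
```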

Other Notes

  • Bump embedded Python to 2.7.16

6.10.2

20 Mar 15:52

Prelude

Release on: 2019-03-20

Bug Fixes

  • Fix a race condition in Autodiscovery leading to some checks not
    being unscheduled on container exit

6.10.1

07 Mar 18:27
5e1bec3

Prelude

Release on: 2019-03-07

Bug Fixes

  • APM: Mixing cases in apm_config.analyzed_spans and apm_config.analyzed_rate_by_service
    entries is now allowed. Service names and operation names will be treated as case insensitive.

  • Refactor the ContainerdUtil so that each call to the containerd api has a dedicated timeout.
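
Case-insensitive matching of analyzed span entries, as in the first fix above, is typically achieved by lower-casing keys on both the configuration side and the lookup side. The sketch below assumes entries keyed as "service|operation" mapped to a rate; that key shape and both helper names are assumptions made for illustration.

```python
def normalize_analyzed_spans(entries):
    """Lower-case the "service|operation" keys so lookups ignore case."""
    return {key.lower(): rate for key, rate in entries.items()}


def analysis_rate(rates, service, operation):
    """Return the configured rate for a span, or None when there is no entry."""
    return rates.get(f"{service}|{operation}".lower())


rates = normalize_analyzed_spans({"Web-API|HTTP.Request": 1.0})
print(analysis_rate(rates, "web-api", "http.request"))
```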