Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use VM hostname for logs, traces and metrics #129

Merged
merged 15 commits into from
Jun 8, 2022

Conversation

NouemanKHAL
Copy link
Member

@NouemanKHAL NouemanKHAL commented Jun 7, 2022

Send traces, logs and metrics using the Host VM's hostname and tagging them accordingly with their application_name, instance_index...

@NouemanKHAL NouemanKHAL added the changelog/Changed Changed features results into a major version bump label Jun 7, 2022
@NouemanKHAL NouemanKHAL requested a review from a team as a code owner June 7, 2022 16:19
lib/run-datadog.sh Show resolved Hide resolved
lib/dist/datadog.yaml Outdated Show resolved Hide resolved
lib/dist/datadog.yaml Outdated Show resolved Hide resolved
lib/scripts/create_logs_config.py Outdated Show resolved Hide resolved
@NouemanKHAL NouemanKHAL force-pushed the noueman/minimal-changes branch from c1cf122 to b1bb120 Compare June 8, 2022 13:24
lib/run-datadog.sh Outdated Show resolved Hide resolved
@NouemanKHAL NouemanKHAL requested a review from therve June 8, 2022 14:09
@github-actions github-actions bot added the documentation Documentation related changes label Jun 8, 2022
@NouemanKHAL NouemanKHAL merged commit eb887f0 into master Jun 8, 2022
@NouemanKHAL NouemanKHAL deleted the noueman/minimal-changes branch June 8, 2022 14:48
@dominikmueller
Copy link

dominikmueller commented Jun 15, 2022

Hi @NouemanKHAL,

We came across this merge request since it might be related to an issue that we are currently debugging regarding the newest update of the Datadog buildpack in our cloud foundry installation.

As a background:

  • We're running a Cloud Foundry installation with the Datadog Cluster Monitoring and Datadog Application Monitoring Tile (/buildpack).
  • We have a unique tag across our IT landscape that gets used to assign costs to products (e. g. Datadog costs for metrics get charged to the product teams/customers that are producing them). Let's name this tag product for this explanation.
  • We've set this tag in the Datadog Cluster Agent configuration to assign all infrastructure monitoring costs of our Cloud Foundry installation to the platform team (e. g. product:platform) .
  • The product teams/customers set the same tag using DD_TAGS for their containers/applications (e.g. product:myapp).

Now to the issue:
Before updating the buildpack, all apps/containers had their own tags (only those configured using DD_TAGS and some cloud foundry specific tags).

After updating the buildpack (which included this change), the logs and metrics of those containers don't only get tagged with the application tags but additionally with tags configured for the Cluster Monitoring.

We assume this happened since this MR sets the hostname of the logs to the hostname of our Diego Cells. The container/application logs are assigned to be logs of the infrastructure host instead of the container instance.

Since the tags of our cloud foundry containers and the tags configured for the cluster agent configuration get joined, we don't have unique tags anymore (e. g. the billing assignment tag gets assigned twice: once coming from the infrastructure host mentioning the platform team -> product:platform; and once coming from the container configuration coming from the devs -> product:myapp)

This issue affects our billing and some more logic that relies on the uniqueness of tags.

Is there a specific reason for this change or is it possible to add a switch to enable/disable this hostname identification?
Another solution might be that our cloud foundry containers get identified as containers running on the infrastructure host, but I think that this is out of scope (Diego/garden as container runtime/scheduler vs. k8s).
We've been happy with the container hostnames before and might need to change our whole tagging concept when the implementation stays like this.

We will also open a Datadog ticket that links to this MR.

Best regards
Dominik

@therve
Copy link

therve commented Jun 15, 2022

Hi,

We did consider containers being seen as host a bug, which is the reason that change. Presumably you don't have DD_ENABLE_CHECKS sets to true in your configuration?

We can introduce a flag to change the buildpack behavior if that solves your issue. Thanks.

@dominikmueller
Copy link

Hi,

We understand that containers showing up as hosts is a bug. We would also prefer to have our cloud foundry containers show up as "containers running on a host" (which currently is not the case - neither with this change nor without).

We did not set DD_ENABLE_CHECKS to true since this would give us host metrics (such as CPU usage, load average, etc.) for our Cloud Foundry containers. Since these metrics get collected from the host system (in our case Diego Cells), they do not match the container resources and therefore are not useful for us (it shows 32 GB usable memory instead of 500 MB assigned memory).

It would be great if the Cloud Foundry containers would get recognized as containers running on a host. However, unlike k8s, this does not seem to be supported for Cloud Foundry.

The best solution would be to fix the handling of containers for cloud foundry.
Until then, a flag to change the buildpack behavior to restore the previous behavior would be a workaround that would work for us.

@therve
Copy link

therve commented Jun 20, 2022

It would be great if the Cloud Foundry containers would get recognized as containers running on a host. However, unlike k8s, this does not seem to be supported for Cloud Foundry.

It is supported, but it's possible there is an issue in your setup. Do you mind opening a ticket so we can track this down? Thanks.

@dominikmueller
Copy link

Thanks for the update! We've updated the ticket referencing this PR to figure out potential issues in our setup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/Changed Changed features results into a major version bump documentation Documentation related changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants