Agent log output in JSON? #1109

Open

dpratt opened this issue Nov 22, 2019 · 26 comments

Comments
@dpratt
Contributor

dpratt commented Nov 22, 2019

We run all our services in docker containers in a k8s cluster, as I suspect quite a few Datadog users do, and as such we have our standard boilerplate set to emit all logs from all sources (channeled through slf4j) as properly formatted single line JSON. This allows for the Datadog agent to pick up and properly index logs and deal with things like stack traces in a sane fashion.

For the most part, this works quite well, except for log messages printed to standard out by the Datadog Java agent. Since the slf4j infrastructure at that level is loaded and configured in a different classloader, we don't really have any way to influence its output - it appears to just be using the stock simple slf4j logger implementation, which only really allows for coarse-grained configuration of log levels.

Is there a mechanism by which we could have the java agent use logback or some other slf4j binding that outputs JSON instead? We have people getting woken up at all hours of the night due to errors generating alerts because the DD log infrastructure interprets anything that can't be parsed as JSON at the ERROR level. We've mitigated a bunch of this by adding -Ddatadog.slf4j.simpleLogger.defaultLogLevel=WARN to our JVM launch parameters, but there's still a bunch of stuff that occasionally gets through.
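The failure mode described here can be sketched in a few lines. This is purely illustrative pseudologic, not Datadog's actual ingestion code: lines arriving on stderr, and lines that fail to parse as JSON, default to ERROR status.

```python
import json

def classify_log_line(line: str, stream: str = "stdout") -> str:
    """Illustrative sketch (NOT Datadog's actual code) of the behavior
    described above: stderr lines and non-JSON lines default to ERROR."""
    if stream == "stderr":
        return "ERROR"
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return "ERROR"
    if not isinstance(record, dict):
        return "ERROR"
    return str(record.get("level", "INFO")).upper()
```

Under this model, a plain-text WARN line from the tracer comes back as ERROR, which is exactly the alert-noise problem described above.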

@jeremy-lq
Member

Apologies for the delay @dpratt. Can you open a support ticket so we can properly track this feature request? Thanks.

@svetozar02

I'm interested in this as well.
May I point out that the DD docs recommend logging in logstash JSON format, but the same can't be accomplished easily with the DD agent...

@matt-canty-dragon

Is there an update that you might be able to share, @dpratt? I have the same issue.

@raj-saxena

@jeremy-lq has this been implemented? Is there another ticket/issue where this is being tracked?

@philippneugebauer

@svetozar02 couldn't agree more with your statement. DD makes a statement in the docs and then completely contradicts it with its own software.

@asharma3007

Supporting development of this feature (which I see as a bug) - for the reasons others have described. Our monitoring team has to spend time looking at every alert, and these are fruitless ones. We are increasing the risk of alert fatigue as well.

@robinglen

robinglen commented Jan 28, 2022

This has become a huge issue for us enabling Datadog application security. Enabling it means you get potential logs like:

WARN com.datadog.appsec.gateway.AppSecRequestContext - xxx

You can clearly see in the log that it's a warning, in this case about a potential security issue, but I quote from the support case: "It looks like these logs may be going to stderr, is that right? If so, we assign the ERROR status to all stderr logs by default (noted in the Important Notes here: https://docs.datadoghq.com/agent/docker/log/?tab=hostagent#installation )."

This means our applications are now flooded with error logs that we can't really do anything with.

@ValentinZakharov
Contributor

ValentinZakharov commented Jan 28, 2022

@thearegee I've seen the support case you're talking about. I'm sorry for that bad experience. There was an AppSec issue which caused it to spam a WARN message for every request. It's now fixed and will be available in the coming release (today).
In addition, we are going to rework the logging system to make it neater in future releases.

@robinglen

@ValentinZakharov thank you, that's great news, and thanks for getting this in so quickly. We will upgrade today and see how it looks. If we can help with the logging work, please let me know.

@franciscosantana

Hi @ValentinZakharov! Any update on the rework of the logging system?

@jon-p

jon-p commented Jul 25, 2022

Any update about the rework on logging system?

Any movement on this? 3 years later and we're still having to configure datadog agent to log to System.out so as not to spam fake errors to the datadog log... which is kinda funny/annoying.

@ValentinZakharov
Contributor

I'm sorry for the delay. Reworking the current logging system has been postponed due to its critical impact on existing customers. However, we are trying to address customers' issues with minor fixes.
@jon-p If I understood the problem correctly: because the Java tracer sends messages to stderr, they are detected as errors in the Datadog dashboard, even if they are just informational messages. If I'm correct, then you can try configuring the Java tracer logs to be written to a dedicated log file with the option:

-Ddatadog.slf4j.simpleLogger.logFile=<NEW_LOG_FILE_PATH>

Ref: documentation
In this case, I suppose, the Agent should ignore messages from the Java tracer.

Please let me know if that doesn't solve the problem.

@dpratt
Contributor Author

dpratt commented Aug 2, 2022

The solution above doesn't really solve the problem, in that if you redirect the log messages to a file, they'll likely just be ignored since they're not going to stderr/stdout. Additionally, if you're running in a container, this is likely an anti-pattern since you'll be writing to an ephemeral file that will eventually just be discarded when your container exits. The real solution here would be to do one of two things -

  • Have dd-java-agent accept a configuration option (perhaps datadog.slf4j.logFormat?) that would allow you to change the default output to well-formatted JSON instead of raw text, which would make it output logs in the way that Datadog itself recommends as a best practice.
  • Add a default configuration to the datadog log pipelines that would detect messages of this specific format and have a default glob pattern that extracts the level, message and timestamp so that things 'just work' in the default out-of-the-box configuration. Note: I do not like this solution either.
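For concreteness, the first option might emit lines shaped like this. The field names ("timestamp", "status", "logger.name", "message") follow Datadog's reserved log attributes; the formatter itself is hypothetical, not an existing dd-trace-java feature:

```python
import json
from datetime import datetime, timezone

# Hypothetical single-line JSON formatter -- a sketch of the requested
# output shape, not part of dd-trace-java. Field names follow Datadog's
# reserved log attributes ("timestamp", "status", "message").
def format_json_log(level: str, logger: str, message: str) -> str:
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": level.lower(),
        "logger.name": logger,
        "message": message,
    })
```

Because the output is a single line of valid JSON, an intake that falls back to ERROR for unparseable lines would instead pick up the declared status.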

@ValentinZakharov
Contributor

@dpratt thanks for sharing the details of your proposed solution.
I've discussed the details of the feature request with colleagues. Unfortunately, due to limited resources, new features are implemented in accordance with the priorities set by the product management team. Priorities mostly depend on the number and size of customers. At the moment, logger improvements are in the backlog with low priority, so I cannot provide any time estimate for this feature.

@bkmjaris

Just in case it helps anyone else -- Our office has also filed a feature request ticket with DD Support (892588) providing a link to this GitHub issue. No new timing or announcements are available, but we've added our voice in support of a solution.

@mrzor

mrzor commented Nov 18, 2022

I'm adding our account to the list of affected customers who would like to see a resolution.

Given how hard it is to build dd-trace-java on our side, and the fact that the agent's classloader is fully locked, short of monkeypatching classfiles we have no way of getting properly structured logs out of the APM agent.

@der-eismann

We also stumbled over this in our trial phase. Datadog recommends JSON logging in its own docs, but then the agent logs get marked as errors in the Datadog web UI.

@briceburg

briceburg commented Mar 7, 2023

Does setting the DD_LOG_FORMAT_JSON environment variable or applying the log_format_json datadog.yml configuration produce the desired structured logs? These would be agent logs only (e.g. https://hub.docker.com/r/datadog/agent) -- it's up to individual services/applications to format their own logs. AFAIK there's no "Java"-only agent that the OP is referring to?

https://github.com/DataDog/datadog-agent/blob/a3d970f0eb8359f7a4676768a91f0165e8203887/pkg/config/config_template.yaml#L2050-L2054

@greg-p-slingshot

@briceburg, I assume the above posters are talking about the dd-java-agent.jar that the documentation instructs you to download and run with your java app:
https://docs.datadoghq.com/tracing/trace_collection/dd_libraries/java/#add-the-java-tracer-to-the-jvm
Calling it the "Java agent" seems reasonable to me, although it might not be the correct term. And yes, I've tried setting the DD_LOG_FORMAT_JSON=true environment variable, the yml config, and a few other things, all to no avail.

@briceburg

OK yes, I struggle with this too across the various ddtrace implementations.

In an effort to "have all logs in JSON", I reduced the output of ddtrace to WARNING by setting the DD_TRACE_LOG_FILE_LEVEL=WARNING environment variable wherever it gets included (e.g. the dd-java-agent.jar [why that is not called dd-java-trace.jar is another question ;)]). Datadog then seems to do a good job in its log processors of properly tagging the message; but yeah, it's super annoying that it's not in JSON.

@jakubkalicki-tink

jakubkalicki-tink commented Jun 6, 2023

When can we expect this problem to be fixed? 😞

I have tried the DD_TRACE_LOG_FILE_LEVEL=WARNING setting mentioned by @briceburg, but I still see info logs.

@gabrielhof

gabrielhof commented Jun 13, 2023

I was able to configure the java agent log messages to go to stdout and also get the messages correctly parsed by Datadog.

TL;DR

If you just want Datadog to stop identifying the agent log messages as errors, add the following when running your application via java -jar:

-Ddatadog.slf4j.simpleLogger.logFile=System.out

If you also want these logs to get parsed by datadog, add the following:

-Ddatadog.slf4j.simpleLogger.dateTimeFormat="yyyy-MM-dd HH:mm:ss.SSS"

Explanation:

About logs going to stderr:

The default value for datadog.slf4j.simpleLogger.logFile is System.err, which causes logs to go to stderr.

Changing this property to use System.out fixes it.

About getting Datadog to parse the logs

The default for datadog.slf4j.simpleLogger.dateTimeFormat is '[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'.

The default date format causes the agent log messages not to be accepted by any of Datadog's default formats for parsing Java logs.


Simply changing the log date format to yyyy-MM-dd HH:mm:ss.SSS is enough for one of the default formats to parse the message.

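To make the millisecond-separator difference concrete, here's a small illustration (a fixed instant, chosen for the example; it just renders the two patterns discussed above):

```python
from datetime import datetime, timezone

# The tracer's default pattern uses a colon before the milliseconds
# ("HH:mm:ss:SSS"), while Datadog's stock Java parsing rules expect a dot
# ("HH:mm:ss.SSS"). Render both from the same instant to compare.
ts = datetime(2023, 6, 13, 10, 5, 27, 123000, tzinfo=timezone.utc)
millis = ts.microsecond // 1000

tracer_default = ts.strftime(f"%Y-%m-%d %H:%M:%S:{millis:03d} %z")
suggested      = ts.strftime(f"%Y-%m-%d %H:%M:%S.{millis:03d}")

print(tracer_default)  # 2023-06-13 10:05:27:123 +0000
print(suggested)       # 2023-06-13 10:05:27.123
```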

@ls-rein-martha

Another workaround (only for users of the Datadog Log Explorer feature) is to create a new Grok processor in your Datadog log pipeline. You can also add new rules to the existing default Java processors.

I added the same rules as the default Java processor and updated the date pattern.

Rule:

irregular_dd_format \[dd\.trace %{_irregular_date}\]\s+\[%{_thread_name}\]\s+%{_status}\s+%{_logger_name}\s*(%{_context}\s*)?- (?>%{word:dd.trace_id} %{word:dd.span_id} - )?%{data:message}((\n|\t)%{data:error.stack})?

Helper rule:

_irregular_date %{date("yyyy-MM-dd HH:mm:ss:SSS Z"):timestamp}
_duration %{integer:duration}
_thread_name %{notSpace:logger.thread_name}
_status %{word:level}
_logger_name %{notSpace:logger.name}
_context %{notSpace:logger.context}
_line %{integer:line}
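As a rough cross-check of the _irregular_date helper rule, here's a Python regex analogue (my own approximation, not Datadog's Grok engine) showing that the tracer's default colon-millisecond timestamp matches while the dot variant does not:

```python
import re

# Python analogue of the _irregular_date helper rule above: the tracer's
# default "yyyy-MM-dd HH:mm:ss:SSS Z" timestamp, with a colon (not a dot)
# before the milliseconds.
IRREGULAR_DATE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3} [+-]\d{4}"
)

def has_irregular_date(line: str) -> bool:
    return IRREGULAR_DATE.search(line) is not None
```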

@SJrX

SJrX commented Oct 13, 2023

Thank you @gabrielhof. I think the system property value should not have quotes, or if it should, then you should quote the whole argument. At least for me, in a Dockerfile, the quotes just cause quotes to appear around the timestamp.

i.e., I think changing this:

-Ddatadog.slf4j.simpleLogger.dateTimeFormat="yyyy-MM-dd HH:mm:ss.SSS"

to

"-Ddatadog.slf4j.simpleLogger.dateTimeFormat=yyyy-MM-dd HH:mm:ss.SSS"

@nasilrgb

nasilrgb commented Jul 17, 2024

I got this error from the Java agent. Why is it showing as an error?

Screenshot 2024-07-17 at 10 05 27 PM

@sausti

sausti commented Nov 13, 2024

I got this error from javaagent, why it is showing as error ?

Screenshot 2024-07-17 at 10 05 27 PM

@nasilrgb Were you able to prevent that log from appearing at the error level? We updated our Dockerfile with the suggestions above, i.e.:

ENTRYPOINT ["java", "-Ddatadog.slf4j.simpleLogger.logFile=System.out", "-Ddatadog.slf4j.simpleLogger.dateTimeFormat=yyyy-MM-dd HH:mm:ss.SSS", "-jar", "/application.jar"]

That ensures all other Datadog agent logs have the correct log level, but this one, for some reason, still logs at the error level.
