Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quarkus-logging-gelf: mapper_parsing_exception when include-log-message-parameters is true (default) #15370

Closed
Shohou opened this issue Mar 1, 2021 · 13 comments · Fixed by #18069
Labels
kind/enhancement New feature or request
Milestone

Comments

@Shohou
Copy link

Shohou commented Mar 1, 2021

Describe the bug
When using quarkus-logging-gelf with graylog I get

ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [MessageParam3] of type [long] in document with id '1448dac1-75c8-11eb-8964-00505681de37'. Preview of field's value: '1.10.5.Final']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "1.10.5.Final"]];

This happens when first appearance of message parameter is long, Graylog (or elasticsearch) remembers the type and next messages with string parameters fail indexing process.

Expected behavior
Message indexing does not fail

Actual behavior
I get indexing errors and messages are lost

To Reproduce
try to log message like
log.info("My long types {}, {}, {}, {}, {}", 1, 2, 3, 4, 5);
and then message with strings
log.info("My string types {}, {}, {}, {}, {}", "a", "b", "c", "d", "e");

Setup logging to graylog and check it indexing errors

Environment (please complete the following information):

  • Quarkus version or git rev: 1.12.0.Final

Additional context
Disabling message parameters fixes the issue
quarkus.log.handler.gelf.include-log-message-parameters=false

I first filled an issue in logstash-gelf - mp911de/logstash-gelf#264
They expect users of their library to set parameter types. Bad decision IMHO. This parameters are generated automatically and type also should be set automatically...
Quarkus also doesn't allow to set types only. If I add quarkus.log.handler.gelf.additional-field.MessageParam0.type quarkus also requires to set quarkus.log.handler.gelf.additional-field.MessageParam0.value

@Shohou Shohou added the kind/bug Something isn't working label Mar 1, 2021
@gsmet
Copy link
Member

gsmet commented Mar 1, 2021

/cc @loicmathieu

@loicmathieu
Copy link
Contributor

This happens when first appearance of message parameter is long, Graylog (or elasticsearch) remembers the type and next messages with string parameters fail indexing process.

This is not an issue but a limitation of Elasticsearch, when it first encounters a field, if it has no mapping for it, it will create a mapping for this field based on the filed type (by automatically detecting the type). The type of the field cannot be changed in the mapping.

This is not something we can fix or workaround, this is how Elasticsearch works.

As I understand it, you already ask the upstream library for a way to fix this, and the only way is to tell Elasticsearch which type is your field to avoid the automatic mapping, you can do it in two ways:

  • Specifyin the type of the parameter via additionalFieldTypes
  • Manually creating the mapping on Elasticsearch prior to send any log

I cannot see anything we can do on our side.

@Shohou
Copy link
Author

Shohou commented Mar 2, 2021

@loicmathieu So how do I specify the type of parameters via additionalFieldTypes using quarkus-logging-gelf extension?

@Shohou
Copy link
Author

Shohou commented Mar 2, 2021

extension states that it supports graylog, but literally every person who wants to use it with default settings will face the issue and will have to set this parameters. I believe this deserves a mention in the docs at least

@gsmet
Copy link
Member

gsmet commented Mar 2, 2021

@loicmathieu @Shohou is saying that he can't define the types because the additional field config requires a value and from what I can see, he's right.

So if it's possible with Gelf to only define the type for additional fields (I have no idea if it does but looks like that's what Mark is saying in the other issue), we should also allow this and make the value optional.

@loicmathieu
Copy link
Contributor

I'm not sure configuring additionalFieldTypes will do the trick, it is for static fields added via configuration (for example, you can add a 'component' fileld with the name of your application). And it is for all messages so you will configure a type 'message0' for all logs and not only for the one that needs a long.

I would suggest to disable includeLogMessageParameters, this property will include each message parameter on a separate field, disabling it will do the trick while still having the value of the parameter inside Graylog (as it will be inside the message).

Another solution would be to only use String on your log message parameter, so using String.valueOf().

There is still the possibility to create manually the Elasticsearcg mapping, this would be the prefered way to do it from an Elasticssearch perspective.

@gsmet
Copy link
Member

gsmet commented Mar 2, 2021

But it looks like Mark is saying you can do it with additionalFieldTypes here: mp911de/logstash-gelf#264 (comment) so it might be worth a try. At least, our config should allow that if it's a supported feature of Gelf.

And it is for all messages so you will configure a type 'message0' for all logs and not only for the one that needs a long.

Not sure I understand this one. Your types need to be consistent across all messages given you need the schema to accept all log messages.

@Shohou
Copy link
Author

Shohou commented Mar 2, 2021

And it is for all messages so you will configure a type 'message0' for all logs and not only for the one that needs a long.

I definitely don't want to configure type for each message separately (is it even possible?). I only want all my messages getting to graylog. So String type for messageParameterX would be absolutely ok for me. Even if it's long I'm absolutely fine if it will have String type in graylog, because I understand that this parameters type is completely unpredictable.

Setting quarkus.log.handler.gelf.include-log-message-parameters to false is what I do right now.

Making all logging parameters String, it's not possible... Will you also change types of parameters everywhere in quarkus? In 3rd party libraries...

I found how to create elasticsearch mapping - https://docs.graylog.org/en/3.2/pages/configuration/elasticsearch.html#custom-index-mappings
If you think this is the right way, maybe you are right. But it needs to be in the documentation in section about graylog. I will stick with include-log-message-parameters=false though

@gsmet gsmet added kind/enhancement New feature or request and removed kind/bug Something isn't working labels Mar 2, 2021
@loicmathieu
Copy link
Contributor

Not sure I understand this one. Your types need to be consistent across all messages given you need the schema to accept all log messages.

Imagine the following snippet:

LOG.info("some {} message {} with {} param", 1, 2, 3);
LOG.info("other {} message {} with {} param", true, true, true);

These two logs will need to have the types of there messages configured.
As pointed out by @Shohou if you configure uarkus.log.handler.gelf.additional-field.MessageParam0.type will you configure it as int or boolean ? Because it will be used by the two log message.
So you will only be able to configure it for String?

We may need to ask more clarification to Mark.

Not sure I understand this one. Your types need to be consistent across all messages given you need the schema to accept all log messages.

True and this is the more complex part, that's why I usggest disabling individual message parameter inclusion inside the generated elasticsearch document, because you would always want String in this case as utilmately everything can be converted to a String.

@Shohou
Copy link
Author

Shohou commented Jun 16, 2021

A note - you shouldn't enable quarkus.log.handler.gelf.include-full-mdc for the same reason

@loicmathieu
Copy link
Contributor

@Shohou I created #18069 with some explaination of what's happening and how to fix it.

I don't think we can really do something more about it, it's the developer's (or the ops) responsibilty to configure the Elasticsearch index when using a tool that send data to Elasticsearch so a documentation section should be enought to help.
So I propose to close this issue now, are you OK with that ?

@Shohou
Copy link
Author

Shohou commented Jun 22, 2021

Yes, I read the docs and it explain the situation and points to further reading, so I think it's OK

@loicmathieu
Copy link
Contributor

@Shohou thanks for your feedback.

loicmathieu added a commit to loicmathieu/quarkus that referenced this issue Jun 22, 2021
Fixes quarkusio#15370
WIP Update docs/src/main/asciidoc/centralized-log-management.adoc
@quarkus-bot quarkus-bot bot added this to the 2.1 - main milestone Jun 22, 2021
@gsmet gsmet modified the milestones: 2.1 - main, 2.0.0.Final Jun 22, 2021
gsmet pushed a commit to gsmet/quarkus that referenced this issue Jun 22, 2021
Fixes quarkusio#15370
WIP Update docs/src/main/asciidoc/centralized-log-management.adoc

(cherry picked from commit 3e89e28)
patrox pushed a commit to patrox/quarkus that referenced this issue Jun 23, 2021
Fixes quarkusio#15370
WIP Update docs/src/main/asciidoc/centralized-log-management.adoc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants