Fluent-bit 3.0.4 throwing repeated warnings ( [ warn] [record accessor] translation failed, root key=hec_token) #8859

Closed
Nithinrao9696 opened this issue May 22, 2024 · 10 comments


Bug Report

Describe the bug
After upgrading Fluent Bit from 3.0.3 to 3.0.4, we noticed that the fluent-bit pods started logging the same warning repeatedly.

To Reproduce
Upgrade Fluent Bit to version 3.0.4.

  • Rubular link if applicable:
  • Example log message if applicable:

[2024/05/22 13:12:09] [ warn] [record accessor] translation failed, root key=hec_token
[2024/05/22 13:12:09] [ warn] [record accessor] translation failed, root key=hec_token
[2024/05/22 13:12:09] [ warn] [record accessor] translation failed, root key=hec_token
[2024/05/22 13:12:09] [ warn] [record accessor] translation failed, root key=hec_token
[2024/05/22 13:12:09] [ warn] [record accessor] translation failed, root key=hec_token


digarok commented May 22, 2024

I think there's a problem with this PR.
https://github.com/fluent/fluent-bit/pull/8793/files

We're getting spammed with the same message as the user above, millions of times per minute:
[2024/05/22 16:08:04] [ warn] [record accessor] translation failed, root key=hec_token


olfway commented May 22, 2024

We have the same error.


yuliyan-valchev-ft commented May 23, 2024

Same error
[2024/05/22 15:06:30] [ warn] [record accessor] translation failed, root key=hec_token
[2024/05/22 15:06:30] [debug] [output:splunk:splunk.0] Could not find hec_token in metadata

It started appearing right after we moved to the latest Helm chart, 0.46.7, which ships app version 3.0.4.

Member

edsiper commented May 23, 2024

we are looking into this.

edsiper added a commit that referenced this issue May 23, 2024
The following patch makes two changes that fix the problems found with
Splunk HEC token handling:

1. PR #8793 used the record accessor API flb_ra_translate_check() to validate
   whether the hec_token field exists. That function logs a warning whenever the
   field is not found, and most users do not rely on a hec_token set by the
   Splunk input plugin, so their logs became noisy.

   This patch replaces that call with flb_ra_translate(), which fixes the problem.

2. If hec_token was set in the record metadata, it was stored in the main
   context of the plugin. However, the flush callbacks that format and deliver
   the data run in separate/parallel threads, which could lead to a race
   condition if more than one thread tries to manipulate the value.

   This patch adds protection to the context value so it becomes thread safe.

Signed-off-by: Eduardo Silva <[email protected]>
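
The two changes described in the commit message can be illustrated with a small standalone sketch. This is not Fluent Bit code: `struct kv`, `lookup()`, `struct out_ctx`, and the `ctx_*` helpers are hypothetical stand-ins, and the mutex is an assumption about what "protection" could look like, since the commit message does not say which mechanism the patch uses. The first half shows a lookup whose warning on a missing key is optional; the second half shows a shared token guarded by a lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a record's metadata: a flat key/value list. */
struct kv {
    const char *key;
    const char *val;
};

/* Counts emitted warnings, for illustration only. */
static int warnings_logged = 0;

/*
 * Look up `key` in `rec`. When `warn_on_miss` is non-zero, a missing key is
 * logged, which mirrors the noisy behaviour described for the "check"
 * variant. Passing zero keeps a miss silent.
 */
static const char *lookup(const struct kv *rec, size_t n,
                          const char *key, int warn_on_miss)
{
    size_t i;

    assert(key != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(rec[i].key, key) == 0) {
            return rec[i].val;
        }
    }
    if (warn_on_miss) {
        fprintf(stderr, "[ warn] translation failed, root key=%s\n", key);
        warnings_logged++;
    }
    return NULL;
}

/* Hypothetical plugin context shared by all flush threads. */
struct out_ctx {
    pthread_mutex_t lock;
    char token[128];
};

static void ctx_init(struct out_ctx *ctx)
{
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->token[0] = '\0';
}

/* Store a token under the lock so concurrent writers cannot interleave. */
static void ctx_set_token(struct out_ctx *ctx, const char *token)
{
    pthread_mutex_lock(&ctx->lock);
    snprintf(ctx->token, sizeof(ctx->token), "%s", token);
    pthread_mutex_unlock(&ctx->lock);
}

/* Copy the token out under the lock; the caller owns `dst`. */
static void ctx_get_token(struct out_ctx *ctx, char *dst, size_t size)
{
    pthread_mutex_lock(&ctx->lock);
    snprintf(dst, size, "%s", ctx->token);
    pthread_mutex_unlock(&ctx->lock);
}
```

With a record that has no `hec_token` key, `lookup(rec, n, "hec_token", 0)` returns NULL without logging anything, while the same call with `1` prints one warning per record, which is how a busy pipeline ends up with the flood reported above. Readers copy the token out through `ctx_get_token()` instead of holding a pointer into the shared context, so a concurrent `ctx_set_token()` cannot change the value under them.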
Member

edsiper commented May 23, 2024

Fix will be there shortly: #8864

@nehjain17

We are facing a security issue and need to upgrade to 3.0.4 as soon as possible, but we are getting the same error.

@nehjain17

With 3.0.5 we are getting the error below.
[2024/05/24 05:47:13] [ warn] [output:splunk:splunk.16] http_status=401:
{"text":"Token is required","code":2}
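
For context, Splunk HEC answers with this 401 when a request arrives without a token in its `Authorization: Splunk <token>` header. The thread does not establish why 3.0.5 sends no token here, so the following is only a sketch of the fallback one would expect, under the assumption that a per-record token takes precedence and the configured `splunk_token` is used otherwise. `build_auth_header()` and its parameters are hypothetical names, not part of Fluent Bit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the HEC Authorization header line into `dst`.
 * `record_token` is a hypothetical per-record token (may be NULL or empty);
 * `config_token` is the token from the output configuration.
 * Returns 0 on success, -1 when no usable token exists or `dst` is too small.
 */
static int build_auth_header(char *dst, size_t size,
                             const char *record_token,
                             const char *config_token)
{
    const char *token = NULL;
    int n;

    assert(dst != NULL && size > 0);

    if (record_token != NULL && record_token[0] != '\0') {
        token = record_token;          /* per-record token wins */
    }
    else if (config_token != NULL && config_token[0] != '\0') {
        token = config_token;          /* otherwise fall back to the config */
    }

    if (token == NULL) {
        dst[0] = '\0';                 /* nothing to send */
        return -1;
    }

    n = snprintf(dst, size, "Authorization: Splunk %s", token);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

In this sketch an empty configured token, for example from an environment variable that expands to nothing, also yields `-1`, so the caller can report the missing token locally rather than send a request that is bound to fail.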

Member

edsiper commented May 24, 2024

With 3.0.5 getting below error. [2024/05/24 05:47:13] [ warn] [output:splunk:splunk.16] http_status=401: {"text":"Token is required","code":2}

@nehjain17 What is your config?

@nehjain17

Output Config:

[OUTPUT]
    Name                splunk
    Match               index1*
    event_index         index1
    host                splunk_host_name
    Port                443
    splunk_token        ${TOKEN}
    tls                 On
    tls.verify          Off
    Retry_Limit         5
    net.connect_timeout 30
    net.keepalive       on


nehjain17 commented May 24, 2024 via email

markuman pushed a commit to markuman/fluent-bit that referenced this issue May 29, 2024
The following patch makes two changes that fix the problems found with
Splunk HEC token handling:

1. PR fluent#8793 used the record accessor API flb_ra_translate_check() to validate
   whether the hec_token field exists. That function logs a warning whenever the
   field is not found, and most users do not rely on a hec_token set by the
   Splunk input plugin, so their logs became noisy.

   This patch replaces that call with flb_ra_translate(), which fixes the problem.

2. If hec_token was set in the record metadata, it was stored in the main
   context of the plugin. However, the flush callbacks that format and deliver
   the data run in separate/parallel threads, which could lead to a race
   condition if more than one thread tries to manipulate the value.

   This patch adds protection to the context value so it becomes thread safe.

Signed-off-by: Eduardo Silva <[email protected]>
Signed-off-by: Markus Bergholz <[email protected]>