in_splunk: Add switch for storing in metadata or records and handle multiple tokens on in splunk #8900

Conversation

cosmo0920
Contributor

@cosmo0920 cosmo0920 commented May 31, 2024

This enhancement adds a switch that controls whether HEC tokens are stored in metadata or in records, and adds support for multiple HEC tokens in the configuration.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change

Case: injecting HEC tokens into records

[INPUT]
    Name splunk
    Tag splunk.test.ingest
    HTTP2 off
    store_token_in_metadata off
    Splunk_Token 7bba8847-4aee-4e62-ba7b-08c6139e42b9,4e63c0c9-c3b5-4a0a-bf4d-bfd5bc0d0070
[OUTPUT]
    Name stdout
    Match *

Case: injecting HEC tokens into metadata (default behavior)

[INPUT]
    Name splunk
    Tag splunk.test.ingest
    HTTP2 off
    store_token_in_metadata on
    Splunk_Token 7bba8847-4aee-4e62-ba7b-08c6139e42b9,4e63c0c9-c3b5-4a0a-bf4d-bfd5bc0d0070
[OUTPUT]
    Name stdout
    Match *
  • Debug log output from testing the change

Case: injecting HEC tokens into records

Fluent Bit v3.0.7
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/06/05 15:16:58] [ info] Configuration:
[2024/06/05 15:16:58] [ info]  flush time     | 1.000000 seconds
[2024/06/05 15:16:58] [ info]  grace          | 5 seconds
[2024/06/05 15:16:58] [ info]  daemon         | 0
[2024/06/05 15:16:58] [ info] ___________
[2024/06/05 15:16:58] [ info]  inputs:
[2024/06/05 15:16:58] [ info]      splunk
[2024/06/05 15:16:58] [ info] ___________
[2024/06/05 15:16:58] [ info]  filters:
[2024/06/05 15:16:58] [ info] ___________
[2024/06/05 15:16:58] [ info]  outputs:
[2024/06/05 15:16:58] [ info]      stdout.0
[2024/06/05 15:16:58] [ info] ___________
[2024/06/05 15:16:58] [ info]  collectors:
[2024/06/05 15:16:58] [ info] [fluent bit] version=3.0.7, commit=8e004d495b, pid=230460
[2024/06/05 15:16:58] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/06/05 15:16:58] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/06/05 15:16:58] [ info] [cmetrics] version=0.9.0
[2024/06/05 15:16:58] [ info] [ctraces ] version=0.5.1
[2024/06/05 15:16:58] [ info] [input:splunk:splunk.0] initializing
[2024/06/05 15:16:58] [ info] [input:splunk:splunk.0] storage_strategy='memory' (memory only)
[2024/06/05 15:16:58] [debug] [splunk:splunk.0] created event channels: read=21 write=22
[2024/06/05 15:16:58] [ info] [output:stdout:stdout.0] worker #0 started
[2024/06/05 15:16:58] [debug] [downstream] listening on 0.0.0.0:8088
[2024/06/05 15:16:58] [debug] [stdout:stdout.0] created event channels: read=24 write=25
[2024/06/05 15:16:58] [ info] [sp] stream processor started
[2024/06/05 15:17:00] [debug] [input:splunk:splunk.0] Mark as unknown type for ingested payloads
[2024/06/05 15:17:00] [debug] [task] created task=0x5f1d1d0 id=0 OK
[2024/06/05 15:17:00] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] splunk.test.ingest: [[1717568220.883936807, {}], {"event"=>"Pony 1 has left the barn", "@splunk_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}]
[1] splunk.test.ingest: [[1717568220.883936807, {}], {"event"=>"Pony 2 has left the barn", "@splunk_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}]
[2] splunk.test.ingest: [[1717568220.883936807, {}], {"event"=>"Pony 3 has left the barn", "nested"=>{"key1"=>"value1"}, "@splunk_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}]
[2024/06/05 15:17:00] [debug] [out flush] cb_destroy coro_id=0
[2024/06/05 15:17:00] [debug] [task] destroy task=0x5f1d1d0 (task_id=0)
^C[2024/06/05 15:17:02] [engine] caught signal (SIGINT)
[2024/06/05 15:17:02] [ warn] [engine] service will shutdown in max 5 seconds
[2024/06/05 15:17:02] [ info] [input] pausing splunk.0
[2024/06/05 15:17:02] [ info] [engine] service has stopped (0 pending tasks)
[2024/06/05 15:17:02] [ info] [input] pausing splunk.0
[2024/06/05 15:17:02] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/06/05 15:17:02] [ info] [output:stdout:stdout.0] thread worker #0 stopped

Case: injecting HEC tokens into metadata (default behavior)

Fluent Bit v3.0.7
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/06/05 15:15:49] [ info] Configuration:
[2024/06/05 15:15:49] [ info]  flush time     | 1.000000 seconds
[2024/06/05 15:15:49] [ info]  grace          | 5 seconds
[2024/06/05 15:15:49] [ info]  daemon         | 0
[2024/06/05 15:15:49] [ info] ___________
[2024/06/05 15:15:49] [ info]  inputs:
[2024/06/05 15:15:49] [ info]      splunk
[2024/06/05 15:15:49] [ info] ___________
[2024/06/05 15:15:49] [ info]  filters:
[2024/06/05 15:15:49] [ info] ___________
[2024/06/05 15:15:49] [ info]  outputs:
[2024/06/05 15:15:49] [ info]      stdout.0
[2024/06/05 15:15:49] [ info] ___________
[2024/06/05 15:15:49] [ info]  collectors:
[2024/06/05 15:15:49] [ info] [fluent bit] version=3.0.7, commit=8e004d495b, pid=230194
[2024/06/05 15:15:49] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/06/05 15:15:49] [ info] [output:stdout:stdout.0] worker #0 started
[2024/06/05 15:15:49] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/06/05 15:15:49] [ info] [cmetrics] version=0.9.0
[2024/06/05 15:15:49] [ info] [ctraces ] version=0.5.1
[2024/06/05 15:15:49] [ info] [input:splunk:splunk.0] initializing
[2024/06/05 15:15:49] [ info] [input:splunk:splunk.0] storage_strategy='memory' (memory only)
[2024/06/05 15:15:49] [debug] [splunk:splunk.0] created event channels: read=21 write=22
[2024/06/05 15:15:49] [debug] [downstream] listening on 0.0.0.0:8088
[2024/06/05 15:15:49] [debug] [stdout:stdout.0] created event channels: read=24 write=25
[2024/06/05 15:15:49] [ info] [sp] stream processor started
[2024/06/05 15:15:53] [ warn] [input:splunk:splunk.0] wrong credentials in request headers
[2024/06/05 15:16:04] [debug] [input:splunk:splunk.0] Mark as unknown type for ingested payloads
[2024/06/05 15:16:04] [debug] [task] created task=0x5f7cea0 id=0 OK
[2024/06/05 15:16:04] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] splunk.test.ingest: [[1717568164.904838207, {"hec_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}], {"event"=>"Pony 1 has left the barn"}]
[1] splunk.test.ingest: [[1717568164.904838207, {"hec_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}], {"event"=>"Pony 2 has left the barn"}]
[2] splunk.test.ingest: [[1717568164.904838207, {"hec_token"=>"Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9"}], {"event"=>"Pony 3 has left the barn", "nested"=>{"key1"=>"value1"}}]
[2024/06/05 15:16:04] [debug] [out flush] cb_destroy coro_id=0
[2024/06/05 15:16:04] [debug] [task] destroy task=0x5f7cea0 (task_id=0)
^C[2024/06/05 15:16:08] [engine] caught signal (SIGINT)
[2024/06/05 15:16:08] [ warn] [engine] service will shutdown in max 5 seconds
[2024/06/05 15:16:08] [ info] [input] pausing splunk.0
[2024/06/05 15:16:08] [ info] [engine] service has stopped (0 pending tasks)
[2024/06/05 15:16:08] [ info] [input] pausing splunk.0
[2024/06/05 15:16:08] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/06/05 15:16:08] [ info] [output:stdout:stdout.0] thread worker #0 stopped
  • Attached Valgrind output that shows no leaks or memory corruption was found

Case: injecting HEC tokens into records

==230460== 
==230460== HEAP SUMMARY:
==230460==     in use at exit: 0 bytes in 0 blocks
==230460==   total heap usage: 3,057 allocs, 3,057 frees, 933,540 bytes allocated
==230460== 
==230460== All heap blocks were freed -- no leaks are possible
==230460== 
==230460== For lists of detected and suppressed errors, rerun with: -s
==230460== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Case: injecting HEC tokens into metadata (default behavior)

==230194== 
==230194== HEAP SUMMARY:
==230194==     in use at exit: 0 bytes in 0 blocks
==230194==   total heap usage: 3,154 allocs, 3,154 frees, 1,360,792 bytes allocated
==230194== 
==230194== All heap blocks were freed -- no leaks are possible
==230194== 
==230194== For lists of detected and suppressed errors, rerun with: -s
==230194== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
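
The example configurations above pass two comma-separated values in Splunk_Token, and the logged token has the form "Splunk <token>". As a rough illustration of what accepting any one of several tokens could look like, here is a minimal standalone sketch. The function name header_matches_any is hypothetical and this is not the plugin's actual code; it only assumes the header value is compared as "Splunk " followed by one of the configured tokens.

```c
#include <string.h>

/*
 * Hypothetical check (not the plugin's code): return 1 if `header` equals
 * "Splunk <token>" for any token in the comma-separated `tokens` list,
 * 0 otherwise. Neither input string is modified.
 */
static int header_matches_any(const char *header, const char *tokens)
{
    static const char prefix[] = "Splunk ";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *candidate;
    size_t cand_len;
    size_t len;

    if (header == NULL || tokens == NULL) {
        return 0;
    }
    if (strncmp(header, prefix, prefix_len) != 0) {
        return 0;
    }
    candidate = header + prefix_len;
    cand_len = strlen(candidate);

    while (*tokens != '\0') {
        /* length of the current token, up to the next ',' or the end */
        len = strcspn(tokens, ",");
        if (len > 0 && len == cand_len &&
            memcmp(tokens, candidate, len) == 0) {
            return 1;
        }
        tokens += len;
        if (*tokens == ',') {
            tokens++;
        }
    }
    return 0;
}
```

With the two tokens from the example configuration, a header carrying either token would be accepted, and a header with any other value would be rejected.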

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#1387

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@cosmo0920
Contributor Author

cosmo0920 commented May 31, 2024

Note that this PR is built on top of #8883, because the raw endpoint unit tests fail without that PR's change.

@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-switch-for-storing-in-metadata-or-records-and-handle-multiple-tokens-on-in_splunk branch from c08585c to 4c2493e Compare May 31, 2024 03:13
@edsiper
Member

edsiper commented Jun 2, 2024

FYI: #8883 has been merged.

cosmo0920 added 3 commits June 3, 2024 14:44
In the metadata case, out_lib does not support formatting metadata,
so no tests were written for that case.

Signed-off-by: Hiroshi Hatake <[email protected]>
/* Link to parent list */
mk_list_add(&splunk_token->_head, &ctx->auth_tokens);

token = strtok(NULL, ",");
Member

We avoid using the strtok(3) function, since it modifies the original buffer.
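
To make the concern concrete, here is a small standalone sketch (the function names are hypothetical, not Fluent Bit code). The first function tokenizes with strtok() and overwrites each delimiter in the buffer with '\0'. The second walks the same string with strcspn() and never writes to it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count comma-separated tokens with strtok(). The buffer is modified in
 * place: every ',' that ends a token is overwritten with '\0'.
 */
static int count_tokens_destructive(char *buf)
{
    int count = 0;
    char *tok;

    for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        count++;
    }
    return count;
}

/*
 * Count the same tokens without writing to the input: strcspn() only
 * measures the distance to the next ','. Empty tokens are skipped, which
 * matches what strtok() does.
 */
static int count_tokens_readonly(const char *s)
{
    int count = 0;
    size_t len;

    while (*s != '\0') {
        len = strcspn(s, ",");
        if (len > 0) {
            count++;
        }
        s += len;
        if (*s == ',') {
            s++;
        }
    }
    return count;
}
```

After count_tokens_destructive() runs on a buffer holding "aaa,bbb", reading the buffer as a string yields only "aaa". After count_tokens_readonly(), the buffer is unchanged.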

@@ -236,6 +236,18 @@ static struct flb_config_map config_map[] = {
"Set valid Splunk HEC tokens for the requests"
},

{
FLB_CONFIG_MAP_BOOL, "store_token_to_metadata", "true",
Member

I would suggest a small change: rename it to `store_token_in_metadata`.

if (tmp) {
tmp_tokens = flb_strdup(tmp);

token = strtok(tmp_tokens, ",");
Member

Avoid strtok(); you can use flb_utils_split() instead.
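
For readers outside the codebase, the idea behind a split helper is to return separate copies of each token and leave the source string untouched. The standalone sketch below illustrates that approach. It is not flb_utils_split() and does not reproduce its signature; split_copy and its parameters are hypothetical names chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a Fluent Bit API): split `line` on `sep` into
 * newly allocated strings, leaving `line` untouched. Empty tokens are
 * skipped. Stores at most `max` tokens in `out` and returns how many were
 * stored, or -1 if an allocation fails.
 */
static int split_copy(const char *line, char sep, char **out, int max)
{
    int n = 0;
    const char *start = line;
    const char *end;
    size_t len;

    while (*start != '\0' && n < max) {
        end = strchr(start, sep);
        len = end ? (size_t) (end - start) : strlen(start);

        if (len > 0) {
            out[n] = malloc(len + 1);
            if (out[n] == NULL) {
                /* release what was already copied before reporting failure */
                while (n > 0) {
                    free(out[--n]);
                }
                return -1;
            }
            memcpy(out[n], start, len);
            out[n][len] = '\0';
            n++;
        }
        start += len;
        if (*start == sep) {
            start++;
        }
    }
    return n;
}
```

The caller owns the returned strings and must free() each one. Because the input is const, no temporary duplicate of the property value is needed just for tokenizing.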


tmp = flb_input_get_property("splunk_token", ctx->ins);
if (tmp) {
tmp_tokens = flb_strdup(tmp);
Member

Validate the return value of flb_strdup().
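
A duplication call can fail when memory runs out, and using the result without a check would then dereference NULL. The pattern being asked for looks roughly like the standalone sketch below. copy_property is a hypothetical name, and plain malloc() stands in for the Fluent Bit allocator so the example compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a duplicate-then-parse step. Returns 0 on
 * success and -1 if the value is missing or the copy cannot be allocated,
 * so the caller never touches a NULL pointer.
 */
static int copy_property(const char *value, char **out)
{
    size_t len;
    char *copy;

    *out = NULL;
    if (value == NULL) {
        return -1;
    }

    len = strlen(value);
    copy = malloc(len + 1);
    if (copy == NULL) {
        /* allocation failed: report it instead of continuing */
        return -1;
    }
    /* len + 1 also copies the terminating '\0' */
    memcpy(copy, value, len + 1);

    *out = copy;
    return 0;
}
```

The caller checks the return code before using the copy and frees it when done.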

}
else {
return SPLUNK_AUTH_UNAUTHORIZED;
mk_list_foreach_safe(head, tmp, &ctx->auth_tokens) {
Member

If the list entries won't be deleted, you can use mk_list_foreach(). The _safe variant provides an extra pointer to handle cases where entries/nodes are removed during iteration.
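
The difference between the two iteration styles can be shown with a minimal standalone list. Everything below is a simplified, hypothetical reimplementation written for this example (struct node, list_foreach, list_foreach_safe); it is not the mk_list code itself, only a sketch of the same idea.

```c
#include <stddef.h>

/* Minimal circular doubly linked list. All names are hypothetical. */
struct node {
    struct node *prev;
    struct node *next;
};

static void list_init(struct node *head)
{
    head->prev = head;
    head->next = head;
}

/* Append entry at the tail of the list. */
static void list_add(struct node *entry, struct node *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del(struct node *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->prev = NULL;
    entry->next = NULL;
}

/* Plain iteration: reads cur->next after the body runs, so the body must
 * not unlink cur. */
#define list_foreach(cur, head) \
    for (cur = (head)->next; cur != (head); cur = cur->next)

/* Safe iteration: the next pointer is saved in tmp before the body runs,
 * so the body may unlink cur. */
#define list_foreach_safe(cur, tmp, head) \
    for (cur = (head)->next, tmp = cur->next; cur != (head); \
         cur = tmp, tmp = cur->next)

/* Read-only walk: the plain macro is enough. */
static int list_count(struct node *head)
{
    int n = 0;
    struct node *cur;

    list_foreach(cur, head) {
        n++;
    }
    return n;
}

/* Walk that removes every entry: needs the safe macro. */
static void list_clear(struct node *head)
{
    struct node *cur;
    struct node *tmp;

    list_foreach_safe(cur, tmp, head) {
        list_del(cur);
    }
}
```

A lookup that only compares each stored token against the request, as in the reviewed loop, is a read-only walk like list_count(), so the extra pointer is unnecessary there.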

@edsiper edsiper added this to the Fluent Bit v3.0.7 milestone Jun 5, 2024
cosmo0920 added 2 commits June 5, 2024 14:34
Signed-off-by: Hiroshi Hatake <[email protected]>
@edsiper edsiper merged commit 023cd6f into master Jun 5, 2024
45 checks passed
@edsiper edsiper deleted the cosmo0920-add-switch-for-storing-in-metadata-or-records-and-handle-multiple-tokens-on-in_splunk branch June 5, 2024 17:50