in_otel: support to add resource of log #8294
Hi @nokute78, here is a guide so that you can test this and any future OTLP PRs:
Download an OpenTelemetry Collector from https://github.com/open-telemetry/opentelemetry-collector-releases. You will want to download the latest
The way I tested it was by running one instance of
I start these in reverse order. First, I start the collector that will eventually receive OTLP from Fluent Bit.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 127.0.0.1:4318
exporters:
  file:
    path: out.json
service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [file]
Start the collector with the command
Then I start the Fluent Bit that will receive and send OTLP back out.
Create a file called
Finally, start a collector reading from a file and sending to Fluent Bit:
receivers:
  filelog:
    include: [a.txt]
    start_at: beginning
processors:
  resourcedetection/system:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
  transform/add_scope:
    log_statements:
      - context: scope
        statements:
          - set(name, "test_scope")
          - set(version, "1")
exporters:
  otlphttp:
    endpoint: http://127.0.0.1:6969
  file:
    path: in.json
service:
  telemetry:
    metrics:
      address: ":8889" # This is so the two collectors don't try and bind the same self metrics port
  pipelines:
    logs:
      receivers: [filelog]
      processors: ["resourcedetection/system"]
      exporters: [file, otlphttp]
(The
This pipeline will produce two log files,
If the goal of this PR is only to resolve the input portion of the problem, then it's a good start. I think it would be a good idea to include Instrumentation Scope as well. In the config for the collector reading from a file, I also added some scope attributes. These need to be preserved on input, just like resource attributes. For the setup I posted above, this PR will not solve the entire pipeline, since scope and resource are also erased on output, but as long as it also preserves scope attributes on input, it's a good start.
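One way to check whether resource and scope data survive the round trip is to compare what the sending collector wrote with what the receiving collector wrote. The sketch below is only an illustration: it assumes each line of the exporter files is one OTLP JSON object (as in the in.json sample posted in this thread) and that attribute values are strings. The helper names `attrs_to_dict`, `resource_scope_pairs`, and `lost_in_transit` are made up for this example.

```python
import json

def attrs_to_dict(attributes):
    """Convert an OTLP JSON attribute list into a plain dict (string values only)."""
    return {a["key"]: a["value"].get("stringValue") for a in attributes or []}

def resource_scope_pairs(lines):
    """Collect one (resource attributes, scope name, scope version) tuple per scopeLogs entry."""
    pairs = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between JSON objects
        payload = json.loads(line)
        for resource_logs in payload.get("resourceLogs", []):
            resource = attrs_to_dict(resource_logs.get("resource", {}).get("attributes"))
            for scope_logs in resource_logs.get("scopeLogs", []):
                scope = scope_logs.get("scope", {})
                pairs.add((
                    tuple(sorted(resource.items())),
                    scope.get("name"),
                    scope.get("version"),
                ))
    return pairs

def lost_in_transit(sent_lines, received_lines):
    """Return the resource/scope combinations present on input but missing on output."""
    return resource_scope_pairs(sent_lines) - resource_scope_pairs(received_lines)
```

With the two collector configs above, this could be called as `lost_in_transit(open("in.json"), open("out.json"))`; an empty set would mean every resource/scope combination made it through.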
@braydonk I think it would be good to get all this into an automated setup: #8294 (comment) We can add it to fluent-bit-ci then invoke it as required.
Force-pushed from 15bfb35 to 2f9e2c4.
Signed-off-by: Takahiro Yamashita <[email protected]>
Force-pushed from 2f9e2c4 to 0f0e64f.
@braydonk Thank you for the information and comments. I can test using
Output of fluent-bit:
in.json:
{"resourceLogs":[{"resource":{"attributes":[{"key":"host.name","value":{"stringValue":"taka-VirtualBox"}},{"key":"os.type","value":{"stringValue":"linux"}}]},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1703289599189943390","body":{"stringValue":"{\"resourceLogs\":[{\"resource\":{},\"scopeLogs\":[{\"scope\":{},\"logRecords\":[{\"timeUnixNano\":\"1703287606977379083\",\"body\":{\"kvlistValue\":{\"values\":[{\"key\":\"message\",\"value\":{\"stringValue\":\"dummy\"}}]}},\"traceId\":\"\",\"spanId\":\"\"}]}]}]}"},"attributes":[{"key":"log.file.name","value":{"stringValue":"a.txt"}}],"traceId":"","spanId":""}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.6.1"}]}
a.conf:
Valgrind output:
That's a good start, but that structure in Fluent Bit would be extremely difficult to work with if there was going to be a future implementation of #8206. The hierarchical structure should likely be retained the same way in Fluent Bit as it is when it comes in. Having
This structure would also fall apart if there were logs from two different resources, or two different scopes, in the same payload. The only way for this to work is for the Fluent Bit internal representation to match the structure of the OTLP payload, so that all information can be retained upon conversion.
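To illustrate the point about multiple resources or scopes in one payload, here is a small sketch that walks an OTLP JSON logs payload and pairs every record with the resource and scope it sits under. The payload shape follows the OTLP JSON seen elsewhere in this thread; `iter_records` and the example host and scope names are hypothetical and are not Fluent Bit code.

```python
def iter_records(payload):
    """Yield (resource, scope, log_record) for every record in an OTLP JSON logs payload."""
    for resource_logs in payload.get("resourceLogs", []):
        resource = resource_logs.get("resource", {})
        for scope_logs in resource_logs.get("scopeLogs", []):
            scope = scope_logs.get("scope", {})
            for record in scope_logs.get("logRecords", []):
                yield resource, scope, record

def host_attr(name):
    """Build a resource with a single host.name attribute (example data only)."""
    return {"attributes": [{"key": "host.name", "value": {"stringValue": name}}]}

# One payload carrying logs from two different resources and two different scopes.
payload = {
    "resourceLogs": [
        {
            "resource": host_attr("host-a"),
            "scopeLogs": [{
                "scope": {"name": "scope_one"},
                "logRecords": [{"body": {"stringValue": "first"}}],
            }],
        },
        {
            "resource": host_attr("host-b"),
            "scopeLogs": [{
                "scope": {"name": "scope_two"},
                "logRecords": [{"body": {"stringValue": "second"}}],
            }],
        },
    ]
}

triples = list(iter_records(payload))
```

Each record comes out with its own resource and scope, which a single payload-wide metadata map could not express.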
@braydonk Thank you for the comment. I tried to modify the metadata structure to mirror the hierarchical OTLP payload.
I tested using
The following is the simple text input, a.log:
output of fluent-bit:
https://opentelemetry.io/docs/specs/otel/logs/data-model/
Timestamp, ObservedTimestamp, TraceId, SpanId, TraceFlags, SeverityText, SeverityNumber, Resource, InstrumentationScope and Attributes.
I updated this PR. Example output:
Valgrind output
This function is to store fields other than body as metadata. The fields are defined at https://opentelemetry.io/docs/specs/otel/logs/data-model/#definitions-used-in-this-document
Signed-off-by: Takahiro Yamashita <[email protected]>
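As a rough illustration of "store fields other than body as metadata", the sketch below splits one OTLP JSON log record into a metadata map and a body. It keeps the OTLP JSON field names as they arrive (for example `timeUnixNano`, `traceId`); the key names the actual patch uses inside Fluent Bit are not shown in this thread, so `split_record` is a hypothetical helper and not the patch's implementation. The sample record is the inner record from the in.json posted earlier.

```python
def split_record(record):
    """Return (metadata, body): every field except "body" goes into metadata."""
    metadata = {key: value for key, value in record.items() if key != "body"}
    return metadata, record.get("body")

record = {
    "timeUnixNano": "1703287606977379083",
    "body": {
        "kvlistValue": {
            "values": [{"key": "message", "value": {"stringValue": "dummy"}}]
        }
    },
    "traceId": "",
    "spanId": "",
}

metadata, body = split_record(record)
```

Here `metadata` holds the timestamp, trace ID, and span ID, while `body` holds only the key/value list with the `message` entry.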
Force-pushed from ccdf39b to 48eae01.
@nokute78 I think that structure looks pretty good to me. I am wondering if you could also produce an example that contains Scope Attributes so we could see what that looks like?
@edsiper I see 3 possibilities:
I think any of these 3 could work fine, though there is the vague chance that just doing
@braydonk Thank you for checking.
I don't know how to produce such examples. I'll check the opentelemetry-collector repo.
@edsiper How about supporting a property to specify the key of metadata/record? Too long property name :(
In this comment above in this PR, you can borrow the
@braydonk Thank you for the comment.
output:
receivers:
  filelog:
    include: [a.log]
    start_at: beginning
processors:
  resourcedetection/system:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
  transform/add_scope:
    log_statements:
      - context: scope
        statements:
          - set(name, "test_scope")
          - set(version, "1")
exporters:
  otlphttp:
    endpoint: http://127.0.0.1:6969
  file:
    path: in.json
service:
  telemetry:
    metrics:
      address: ":8889" # This is so the two collectors don't try and bind the same self metrics port
  pipelines:
    logs:
      receivers: [filelog]
      processors: ["transform/add_scope"]
      exporters: [file, otlphttp]
Thanks @nokute78! I think this looks pretty good. It's probably worth noting that, unfortunately, with this method of packing these labels directly into the metadata of every single log, OTLP logs are going to be very expensive for Fluent Bit to process. In an OTLP payload there may be one Resource and one Scope with 100 logs, but once flattened to the Fluent Bit schema, that same Resource and Scope data will be on all 100 logs separately. However, I don't really see another easy way to do this, so I think this is good, and that will be a tradeoff we might have to accept.
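The duplication cost described above can be made concrete with a small sketch. It flattens a payload with one Resource, one Scope, and 100 logs into per-record (metadata, body) pairs and compares the serialized size of the shared data stored once against stored 100 times. JSON length is used only as a stand-in for size; Fluent Bit's real encoding and metadata layout are not modeled, and `flatten_with_copies` and `metadata_bytes` are hypothetical helpers.

```python
import json

def flatten_with_copies(payload):
    """Flatten OTLP JSON logs so each record carries its own copy of resource and scope."""
    flat = []
    for resource_logs in payload.get("resourceLogs", []):
        resource = resource_logs.get("resource", {})
        for scope_logs in resource_logs.get("scopeLogs", []):
            scope = scope_logs.get("scope", {})
            for record in scope_logs.get("logRecords", []):
                metadata = {"resource": resource, "scope": scope}
                flat.append((metadata, record.get("body")))
    return flat

def metadata_bytes(metadata):
    """Size of one metadata map when serialized as JSON (a rough proxy only)."""
    return len(json.dumps(metadata))

payload = {
    "resourceLogs": [{
        "resource": {
            "attributes": [{"key": "host.name", "value": {"stringValue": "example-host"}}]
        },
        "scopeLogs": [{
            "scope": {"name": "test_scope", "version": "1"},
            "logRecords": [{"body": {"stringValue": f"log {i}"}} for i in range(100)],
        }],
    }]
}

records = flatten_with_copies(payload)
stored_once = metadata_bytes(records[0][0])
stored_per_record = sum(metadata_bytes(metadata) for metadata, _ in records)
```

In this setup `stored_per_record` is exactly 100 times `stored_once`, which is the overhead the comment accepts as a tradeoff.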
That being said, I have no additional comments from my side.
@edsiper ping
It would be great if we could get this landed. This would close a major gap in making Fluent Bit a viable OTel collector.
Thanks everybody. I think the remaining part, which is out of the scope of this PR, is to address this:
At a chunk level we support metadata; actually, that's where the Tag is stored. We might think about how to group records that share metadata... not for this PR, just thinking out loud...
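The idea of grouping records that share metadata could look roughly like the sketch below: records with identical metadata are collected under one entry so the shared part is stored once. This only illustrates the idea floated above; it says nothing about how Fluent Bit chunks are actually laid out, and `group_by_metadata` is a hypothetical helper.

```python
import json
from collections import defaultdict

def group_by_metadata(records):
    """Group (metadata, body) pairs so each distinct metadata map appears only once."""
    groups = defaultdict(list)
    for metadata, body in records:
        # A sorted JSON dump gives a canonical, hashable key for a nested dict.
        key = json.dumps(metadata, sort_keys=True)
        groups[key].append(body)
    return [(json.loads(key), bodies) for key, bodies in groups.items()]

shared = {"scope": {"name": "test_scope", "version": "1"}}
other = {"scope": {"name": "other_scope"}}
records = [
    (shared, "log 1"),
    (other, "log 2"),
    (shared, "log 3"),
]

grouped = group_by_metadata(records)
```

Here the two records under `test_scope` end up in a single group, and the `other_scope` record in its own.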
For those interested in what's coming in OTel support, here is the next big PR:
Relates to #8205
This patch modifies the metadata of in_opentelemetry based on the OpenTelemetry logs spec.
From:
There is only a map of the "attributes" of "logRecords".
To:
The metadata map is redefined. It may be a breaking change.
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Configuration
Debug/Valgrind output
telemetrygen logs --otlp-http --otlp-insecure --telemetry-attributes "key=\"value\""
Note: I installed telemetrygen using the following command.
Reported leak should be fixed by #8293
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.