Support for extracting attributes/labels from log body #14938

Closed
lsampras opened this issue Oct 13, 2022 · 11 comments
Labels
enhancement (New feature or request), exporter/loki (Loki Exporter), processor/attributes (Attributes processor), processor/transform (Transform processor)

Comments

@lsampras

Is your feature request related to a problem? Please describe.

I want to set Loki labels for my logs based on a key-value pair in my JSON-formatted logs.
The collector already provides an option to set Loki labels from attributes, but I'm not able to extract a value from the log body into an attribute.

For example, assuming the logs look something like:

{
    "message":"Adding oranges to basket",
    "service":"my-service-1",
    "log_type":"orange_log"
    ...
},
{
    "message":"Peeling apples for Milkshake",
    "service":"my-service-1",
    "log_type":"apple_log"
    ...
}

I want to extract log_type and set it as a Loki label.

Promtail supports this behaviour via its pipeline_stages configuration:

  pipeline_stages:
  - json:
      expressions:
        log_type:
  - labels:
      log_type:

Describe the solution you'd like

I see that we already have the attributes processor, which supports setting Loki labels and adding context-based or static attributes. We also have the transform processor, which could be extended to read/parse log bodies.

I think it would be appropriate to extend one of these processors with this behaviour.

I'm not sure, though, whether this requirement is Loki-specific (since most logging solutions index the body and don't need an explicit label) and should instead be implemented in the Loki exporter.
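For reference, the label-from-attribute mechanism I mean is the loki.attribute.labels hint read by the Loki exporter, which can be set via the attributes processor. A minimal sketch (a fuller config appears later in this thread):

processors:
  attributes:
    actions:
      - action: insert
        key: loki.attribute.labels
        value: log_type

What is missing is a way to populate attributes["log_type"] from the JSON body in the first place.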

Describe alternatives you've considered

For now, the alternatives seem to be either not adding these labels (which would lead to inefficient queries in Loki) or using another tool (maybe Kafka) to handle this processing.

Additional context

No response

@lsampras added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Oct 13, 2022
@evan-bradley added the processor/attributes (Attributes processor), processor/transform (Transform processor), exporter/loki (Loki Exporter), and pkg/stanza labels and removed the needs triage (New item requiring triage) label on Oct 14, 2022
@github-actions
Contributor

Pinging code owners: @gramidt @gouthamve @jpkrohling @kovrus @mar4uk. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

Pinging code owners: @boostchicken. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

Pinging code owners: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@lsampras
Author

lsampras commented Oct 19, 2022

I've added a minimal version here (#15282) which modifies the transformprocessor for this behaviour (and satisfies my use case).
It is still far from complete and will need more direction/consideration before we choose a solution and cover all cases from a feature standpoint.

@djaglowski
Member

@lsampras, please take a look at #9410.

We have some fairly robust capabilities to parse and manipulate logs, but only in receivers that use pkg/stanza. The issue above is an attempt to spec out the rough interface that transformprocessor would need to implement in order to support equivalent functionality. The main thing to note about this implementation is that we should separate "parsing" from further manipulation.

In the case of plucking a value from a JSON log, I think we should have a function that parses the JSON and assigns the resulting object to a specified field. Separately, we should have a function that moves values from one field to another. This kind of design allows us to compose functions.
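A rough sketch of what that composition could look like in transformprocessor configuration (purely illustrative; the ParseJSON, set, and delete_key functions and the log_statements config keys are assumptions here, not a settled design):

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # parse the JSON body and assign the resulting object to a field
          - set(attributes["parsed"], ParseJSON(body))
          # move the value we want from that field to its final attribute
          - set(attributes["log_type"], attributes["parsed"]["log_type"])
          # clean up the intermediate field
          - delete_key(attributes, "parsed")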

@jpkrohling
Member

cc @mar4uk and @kovrus: I think this belongs to a processor instead of to the Loki exporter, but I wonder what your opinions are.

@kovrus
Member

kovrus commented Oct 20, 2022

cc @mar4uk and @kovrus: I think this belongs to a processor instead of to the Loki exporter, but I wonder what your opinions are.

Agree, it should not be part of the Loki exporter's scope, but of the processor components.

@mar4uk
Contributor

mar4uk commented Nov 10, 2022

After merging the promtail receiver, pipeline_stages will be available the same way they are available in promtail itself.
I just checked it and it works. Here is the config:

receivers:
  promtail:
    config:
      scrape_configs:
        - job_name: loki_push
          loki_push_api:
            server:
              http_listen_port: 3101
              grpc_listen_port: 3600
            labels:
              pushserver: push1
            use_incoming_timestamp: true
          pipeline_stages:
            - json:
                expressions:
                  http_code: http_code
            - labels:
                http_code:

      target_config:
        sync_period: 10s

processors:
  attributes:
    actions:
      - action: insert
        key: loki.attribute.labels
        value: http_code

exporters:
  loki:
    endpoint: http://localhost:3100/loki/api/v1/push

service:
  pipelines:
    logs:
      receivers: [ promtail ]
      processors: [ attributes ]
      exporters: [ loki ]

So when using the promtail receiver -> loki exporter, labels will work out of the box.
If another receiver is used, then I agree that extraction belongs to a processor. But which one? transformprocessor? logstransformprocessor?

@djaglowski
Member

If another receiver is used then I agree that extraction belongs to a processor. But which one? transformprocessor? logstransformprocessor?

It is intended that logstransformprocessor will be deprecated as soon as its functionality is available in transformprocessor, so I would recommend that any new functionality be added to transformprocessor.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions bot added the Stale label on Jan 11, 2023
codeboten pushed a commit that referenced this issue Jan 18, 2023
We've had discussion recently around how to handle functions and converters (previously factory functions) that share the same functionality (see #16571)

This PR suggests a different approach that should reduce the need to chain converters/functions.

Previously we chained converters together because OTTL had no place to store data except the telemetry payload itself. While attributes could be used, that results in the need for cleanup and potentially unwanted transformations depending on condition resolution. This PR introduces the concept of tmp in the logs context (and future contexts if we like the solution) that statements can use as a "staging" location for complex transformations.

Previously, to handle situations like the one in #14938, we would have to chain functions together. Pretending we had a KeepKeys converter, the statement would look like

merge_maps(attributes, KeepKeys(ParseJSON(body), ["log_type"]), "upsert").

These types of statements are tricky to write and can be difficult to comprehend, especially for new users. Each Converter we add on increases the burden. Adding a single extra function, like Flatten, really makes a difference: merge_maps(attributes, Flatten(KeepKeys(ParseJSON(body), ["log_type"]), "."), "upsert")
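For comparison, with such a staging field the same transformation could be split into simpler statements (a sketch, assuming the tmp field described above and the function names used elsewhere in this message):

- merge_maps(tmp, ParseJSON(body), "upsert")
- keep_keys(tmp, ["log_type"])
- merge_maps(attributes, tmp, "upsert")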

Co-authored-by: Evan Bradley <[email protected]>
@TylerHelmuth
Member

TylerHelmuth commented Jan 19, 2023

@lsampras this capability should be available in the next release of the collector via the ParseJSON, merge_maps, and keep_keys/delete_key functions and the cache path accessor. Since we have the bare-minimum functionality, I am going to close this issue. If you think that's not the case, please ping me and we can reopen. If you have related enhancements you'd like to see in OTTL/transformprocessor for logs, please open a new issue.
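For anyone landing here later, a minimal sketch of the original use case with those pieces (illustrative only; exact transformprocessor config keys and function signatures may vary between collector versions):

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # parse the JSON body into the cache staging map
          - merge_maps(cache, ParseJSON(body), "upsert")
          # copy the value we want into a log attribute
          - set(attributes["log_type"], cache["log_type"])
  attributes:
    actions:
      # hint telling the loki exporter to promote the attribute to a label
      - action: insert
        key: loki.attribute.labels
        value: log_type

The transform processor would run before the attributes processor in the logs pipeline, mirroring the attributes + loki exporter setup shown earlier in this thread.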
