From e8d4b6eb172f3dbd7f379014eb08f621f01b1939 Mon Sep 17 00:00:00 2001
From: Hai Yan
Date: Fri, 9 Feb 2024 12:13:44 -0600
Subject: [PATCH 1/2] Update dissect and user_agent readme

Signed-off-by: Hai Yan
---
 .../dissect-processor/README.md               | 128 +-----------------
 .../user-agent-processor/README.md            |  45 +-----
 2 files changed, 3 insertions(+), 170 deletions(-)

diff --git a/data-prepper-plugins/dissect-processor/README.md b/data-prepper-plugins/dissect-processor/README.md
index 84bd7c286e..c75d7f3176 100644
--- a/data-prepper-plugins/dissect-processor/README.md
+++ b/data-prepper-plugins/dissect-processor/README.md
@@ -1,129 +1,5 @@
 # Dissect Processor
 
-The Dissect processor is useful when dealing with log files or messages that have a known pattern or structure. It extracts specific pieces of information from the text and map them to individual fields based on the defined Dissect patterns.
+The dissect processor extracts values from an event and maps them to individual fields based on user-defined dissect patterns.
 
-
-## Basic Usage
-
-To get started with dissect processor using Data Prepper, create the following `pipeline.yaml`.
-```yaml
-dissect-pipeline:
-  source:
-    file:
-      path: "/full/path/to/dissect_logs_json.log"
-      record_type: "event"
-      format: "json"
-  processor:
-    - dissect:
-        map:
-          log: "%{Date} %{Time} %{Log_Type}: %{Message}"
-  sink:
-    - stdout:
-```
-
-Create the following file named `dissect_logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` with the path of this file.
-
-```
-{"log": "07-25-2023 10:00:00 ERROR: Some error"}
-```
-
-The Dissect processor will retrieve the necessary fields from the `log` message, such as `Date`, `Time`, `Log_Type`, and `Message`, with the help of the pattern `%{Date} %{Time} %{Type}: %{Message}`, configured in the pipeline.
-
-When you run Data Prepper with this `pipeline.yaml` passed in, you should see the following standard output.
-
-```
-{
-  "log" : "07-25-2023 10:00:00 ERROR: Some error",
-  "Date" : "07-25-2023"
-  "Time" : "10:00:00"
-  "Log_Type" : "ERROR"
-  "Message" : "Some error"
-}
-```
-
-The fields `Date`, `Time`, `Log_Type`, and `Message` have been extracted from `log` value.
-
-## Configuration
-* `map` (Required): `map` is required to specify the dissect patterns. It takes a `Map` with fields as keys and respective dissect patterns as values.
-
-
-* `target_types` (Optional): A `Map` that specifies what the target type of specific field should be. Valid options are `integer`, `double`, `string`, and `boolean`. By default, all the values are `string`. Target types will be changed after the dissection process.
-
-
-* `dissect_when` (Optional): A Data Prepper Expression string following the [Data Prepper Expression syntax](../../docs/expression_syntax.md). When configured, the processor will evaluate the expression before proceeding with the dissection process and perform the dissection if the expression evaluates to `true`.
-
-## Field Notations
-
-Symbols like `?, +, ->, /, &` can be used to perform logical extraction of data.
-
-* **Normal Field** : The field without a suffix or prefix. The field will be directly added to the output Event.
-
-  Ex: `%{field_name}`
-
-
-* **Skip Field** : ? can be used as a prefix to key to skip that field in the output JSON.
-  * Skip Field : `%{}`
-  * Named skip field : `%{?field_name}`
-
-
-
-
-* **Append Field** : To append multiple values and put the final value in the field, we can use + before the field name in the dissect pattern
-  * **Usage**:
-
-        Pattern : "%{+field_name}, %{+field_name}"
-        Text : "foo, bar"
-
-        Output : {"field_name" : "foobar"}
-
-    We can also define the order the concatenation with the help of suffix `/` .
-
-  * **Usage**:
-
-        Pattern : "%{+field_name/2}, %{+field_name/1}"
-        Text : "foo, bar"
-
-        Output : {"field_name" : "barfoo"}
-
-    If the order is not mentioned, the append operation will take place in the order of fields specified in the dissect pattern.
-
-* **Indirect Field** : While defining a pattern, prefix the field with a `&` to assign the value found with this field to the value of another field found as the key.
-  * **Usage**:
-
-        Pattern : "%{?field_name}, %{&field_name}"
-        Text: "foo, bar"
-
-        Output : {“foo” : “bar”}
-
-    Here we can see that `foo` which was captured from the skip field `%{?field_name}` is made the key to value captured form the field `%{&field_name}`
-  * **Usage**:
-
-        Pattern : %{field_name}, %{&field_name}
-        Text: "foo, bar"
-
-        Output : {“field_name”:“foo”, “foo”:“bar”}
-
-    We can also indirectly assign the value to an appended field, along with `normal` field and `skip` field.
-
-### Padding
-
-* `->` operator can be used as a suffix to a field to indicate that white spaces after this field can be ignored.
-  * **Usage**:
-
-        Pattern : %{field1→} %{field2}
-        Text : “firstname lastname”
-
-        Output : {“field1” : “firstname”, “field2” : “lastname”}
-
-* This operator should be used as the right most suffix.
-  * **Usage**:
-
-        Pattern : %{fieldname/1->} %{fieldname/2}
-
-    If we use `->` before `/`, the `->` operator will also be considered part of the field name.
-
-
-## Developer Guide
-This plugin is compatible with Java 14. See
-- [CONTRIBUTING](https://github.com/opensearch-project/data-prepper/blob/main/CONTRIBUTING.md)
-- [monitoring](https://github.com/opensearch-project/data-prepper/blob/main/docs/monitoring.md)
+See the [`dissect` processor documentation](https://opensearch.org/docs/latest/data-prepper/pipelines/configuration/processors/dissect/).
diff --git a/data-prepper-plugins/user-agent-processor/README.md b/data-prepper-plugins/user-agent-processor/README.md
index 950161f46b..5bf25dcd15 100644
--- a/data-prepper-plugins/user-agent-processor/README.md
+++ b/data-prepper-plugins/user-agent-processor/README.md
@@ -1,47 +1,4 @@
 # User Agent Processor
 This processor parses User-Agent (UA) string in an event and add the parsing result to the event.
 
-## Basic Example
-An example configuration for the process is as follows:
-```yaml
-...
-  processor:
-    - user_agent:
-        source: "ua"
-        target: "user_agent"
-...
-```
-
-Assume the event contains the following user agent string:
-```json
-{
-  "ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
-}
-```
-
-The processor will parse the "ua" field and add the result to the specified target in the following format compatible with Elastic Common Schema (ECS):
-```
-{
-  "user_agent": {
-    "original": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1",
-    "os": {
-      "version": "13.5.1",
-      "full": "iOS 13.5.1",
-      "name": "iOS"
-    },
-    "name": "Mobile Safari",
-    "version": "13.1.1",
-    "device": {
-      "name": "iPhone"
-    }
-  },
-  "ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
-}
-```
-
-## Configuration
-* `source` (Required) — The key to the user agent string in the Event that will be parsed.
-* `target` (Optional) — The key to put the parsing result in the Event. Defaults to `user_agent`.
-* `exclude_original` (Optional) — Whether to exclude original user agent string from the parsing result. Defaults to false.
-* `cache_size` (Optional) - Cache size to use in the parser. Should be a positive integer. Defaults to 1000.
-* `tags_on_parse_failure` (Optional) - Tags to add to an event if the processor fails to parse the user agent string.
+See the [`user_agent` processor documentation](https://opensearch.org/docs/latest/data-prepper/pipelines/configuration/processors/user-agent/).
\ No newline at end of file

From 50e82f06e37f2f81d55e35be7402ede1bfe98693 Mon Sep 17 00:00:00 2001
From: Hai Yan
Date: Wed, 21 Feb 2024 10:49:05 -0600
Subject: [PATCH 2/2] Fix format issue

Signed-off-by: Hai Yan
---
 data-prepper-plugins/user-agent-processor/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data-prepper-plugins/user-agent-processor/README.md b/data-prepper-plugins/user-agent-processor/README.md
index 5bf25dcd15..a250472e0d 100644
--- a/data-prepper-plugins/user-agent-processor/README.md
+++ b/data-prepper-plugins/user-agent-processor/README.md
@@ -1,4 +1,4 @@
 # User Agent Processor
 This processor parses User-Agent (UA) string in an event and add the parsing result to the event.
 
-See the [`user_agent` processor documentation](https://opensearch.org/docs/latest/data-prepper/pipelines/configuration/processors/user-agent/).
\ No newline at end of file
+See the [`user_agent` processor documentation](https://opensearch.org/docs/latest/data-prepper/pipelines/configuration/processors/user-agent/).