[DOCS] Create a new page for grok content in scripting docs (#73118)
* [DOCS] Moving grok to its own scripting page

* Adding examples

* Updating cross link for grok page

* Adds same runtime field in a search request for #73262

* Clarify titles and shift navigation

* Incorporating review feedback

* Updating cross-link to Painless
Adam Locke authored May 27, 2021
1 parent 823b3cd commit 0aa0171
Showing 6 changed files with 330 additions and 97 deletions.
39 changes: 0 additions & 39 deletions docs/reference/ingest/processors/grok.asciidoc
@@ -8,52 +8,13 @@ Extracts structured fields out of a single text field within a document. You cho
extract matched fields from, as well as the grok pattern you expect will match. A grok pattern is like a regular
expression that supports aliased expressions that can be reused.

This tool is well suited to syslog logs, Apache and other web server logs, MySQL logs, and in general any log format
that is written for humans rather than for computer consumption.
This processor comes packaged with many
https://github.com/elastic/elasticsearch/blob/{branch}/libs/grok/src/main/resources/patterns[reusable patterns].

If you need help building patterns to match your logs, you will find the
{kibana-ref}/xpack-grokdebugger.html[Grok Debugger] tool quite useful!
The https://grokconstructor.appspot.com[Grok Constructor] is also a useful tool.

[[grok-basics]]
==== Grok Basics

Grok sits on top of regular expressions, so any regular expression is also valid in grok.
The regular expression library is Oniguruma, and you can see the full supported regexp syntax
https://github.com/kkos/oniguruma/blob/master/doc/RE[on the Oniguruma site].

Grok uses this regular expression language to let you name existing patterns and combine them into more
complex patterns that match your fields.

The syntax for reusing a grok pattern comes in three forms: `%{SYNTAX:SEMANTIC}`, `%{SYNTAX}`, and `%{SYNTAX:SEMANTIC:TYPE}`.

The `SYNTAX` is the name of the pattern that will match your text. For example, `3.44` will be matched by the `NUMBER`
pattern and `55.3.244.1` will be matched by the `IP` pattern. The syntax is how you match. `NUMBER` and `IP` are both
patterns that are provided within the default patterns set.

The `SEMANTIC` is the identifier you give to the piece of text being matched. For example, `3.44` could be the
duration of an event, so you could call it simply `duration`. Further, a string `55.3.244.1` might identify
the `client` making a request.

The `TYPE` is the type to which you want to cast your named field. `int`, `long`, `double`, `float`, and `boolean` are the supported types for coercion.
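
As a rough illustration of what `TYPE` coercion means, the following Python sketch converts a captured string into a typed value. The `CONVERTERS` table and the `coerce` helper are hypothetical names used only for this example, and the boolean rule (only the text `true`, ignoring case, is true) is an assumption rather than a statement of how the processor parses booleans.

```python
# Hypothetical mapping from grok TYPE names to Python converters.
# This sketches the idea of coercion; it is not the processor's code.
CONVERTERS = {
    "int": int,
    "long": int,      # Python integers are arbitrary precision
    "float": float,
    "double": float,  # Python floats are double precision
    # Assumption: only the text "true" (any case) counts as true.
    "boolean": lambda s: s.strip().lower() == "true",
}


def coerce(value, type_name=None):
    """Return value converted to type_name, or unchanged when no TYPE is given."""
    if type_name is None:
        return value  # without a TYPE, the captured text stays a string
    return CONVERTERS[type_name](value)


duration = coerce("3.44", "float")  # the text "3.44" becomes a float
```

Without a `TYPE`, the captured value stays a string, which is why the sketch returns the input unchanged in that case.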

For example, you might want to match the following text:

[source,txt]
--------------------------------------------------
3.44 55.3.244.1
--------------------------------------------------

You may know that the message in the example is a number followed by an IP address. You can match this text by using the following
Grok expression.

[source,txt]
--------------------------------------------------
%{NUMBER:duration} %{IP:client}
--------------------------------------------------
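
To make the relationship between grok and regular expressions concrete, here is a minimal Python sketch that expands `%{SYNTAX:SEMANTIC}` references into named capture groups and applies the result to the sample text. It relies on several assumptions: the `NUMBER` and `IP` definitions are simplified stand-ins rather than the bundled patterns, Python's `re` module replaces Oniguruma, and `expand` and `grok_match` are hypothetical helper names.

```python
import re

# Simplified stand-ins for the bundled NUMBER and IP patterns
# (assumptions, not the definitions that ship with the processor).
PATTERNS = {
    "NUMBER": r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)",
    "IP": r"(?:\d{1,3}\.){3}\d{1,3}",  # IPv4 only, no range checks
}

# Matches %{SYNTAX} and %{SYNTAX:SEMANTIC}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(grok_expression):
    """Rewrite each grok reference as a plain regular expression group."""
    def replace(match):
        syntax, semantic = match.group(1), match.group(2)
        body = PATTERNS[syntax]
        if semantic is None:
            return f"(?:{body})"          # %{SYNTAX}: match without naming
        return f"(?P<{semantic}>{body})"  # %{SYNTAX:SEMANTIC}: named group
    return GROK_REF.sub(replace, grok_expression)


def grok_match(grok_expression, text):
    """Return the named captures as a dict, or None when nothing matches."""
    found = re.search(expand(grok_expression), text)
    return found.groupdict() if found else None


result = grok_match("%{NUMBER:duration} %{IP:client}", "3.44 55.3.244.1")
print(result)  # {'duration': '3.44', 'client': '55.3.244.1'}
```

Both captured values are strings here because the expression does not request a `TYPE`.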

[[using-grok]]
==== Using the Grok Processor in a Pipeline
