[DOCS] Create a new page for grok content in scripting docs #73118

Merged
merged 8 commits into from
May 27, 2021
39 changes: 0 additions & 39 deletions docs/reference/ingest/processors/grok.asciidoc
@@ -8,52 +8,13 @@ Extracts structured fields out of a single text field within a document. You cho
extract matched fields from, as well as the grok pattern you expect will match. A grok pattern is like a regular
expression that supports aliased expressions that can be reused.

This tool is well suited to syslog logs, Apache and other web server logs, MySQL logs, and in general any log format
that is written for humans rather than for machine consumption.
This processor comes packaged with many
https://github.com/elastic/elasticsearch/blob/{branch}/libs/grok/src/main/resources/patterns[reusable patterns].

If you need help building patterns to match your logs, you will find the
{kibana-ref}/xpack-grokdebugger.html[Grok Debugger] tool quite useful!
The https://grokconstructor.appspot.com[Grok Constructor] is also a useful tool.

[[grok-basics]]
==== Grok Basics

Grok sits on top of regular expressions, so any regular expression is valid in grok as well.
The regular expression library is Oniguruma, and you can see the full supported regexp syntax
https://github.com/kkos/oniguruma/blob/master/doc/RE[on the Oniguruma site].

Grok works by leveraging this regular expression language to allow naming existing patterns and combining them into more
complex patterns that match your fields.
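
As a rough illustration of how named patterns can be combined into larger ones, the following Python sketch expands nested `%{NAME}` references into a single regular expression. The `OCTET` and `SIMPLE_IPV4` definitions and the `expand` helper are hypothetical stand-ins made up for this example; they are not the definitions shipped with the processor, and Python's `re` module is used in place of Oniguruma.

```python
import re

# Hypothetical, simplified pattern definitions for illustration only;
# these are NOT the definitions shipped with the grok processor.
PATTERNS = {
    "OCTET": r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)",
    "SIMPLE_IPV4": r"%{OCTET}\.%{OCTET}\.%{OCTET}\.%{OCTET}",
}

# Matches a bare reference such as %{OCTET}.
REFERENCE = re.compile(r"%\{(\w+)\}")


def expand(pattern: str) -> str:
    """Recursively replace each %{NAME} reference with its definition."""
    def substitute(match: re.Match) -> str:
        # Wrap the expansion in a non-capturing group so that alternations
        # inside one pattern cannot leak into the surrounding expression.
        return "(?:" + expand(PATTERNS[match.group(1)]) + ")"

    return REFERENCE.sub(substitute, pattern)


ipv4 = re.compile(expand("%{SIMPLE_IPV4}"))
```

This sketch does no cycle detection, so a pattern that refers to itself would recurse until Python raises an error.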

The syntax for reusing a grok pattern comes in three forms: `%{SYNTAX}`, `%{SYNTAX:SEMANTIC}`, and `%{SYNTAX:SEMANTIC:TYPE}`.

The `SYNTAX` is the name of the pattern that will match your text. For example, `3.44` will be matched by the `NUMBER`
pattern and `55.3.244.1` will be matched by the `IP` pattern. The syntax is how you match. `NUMBER` and `IP` are both
patterns that are provided within the default patterns set.

The `SEMANTIC` is the identifier you give to the piece of text being matched. For example, `3.44` could be the
duration of an event, so you could call it simply `duration`. Further, a string `55.3.244.1` might identify
the `client` making a request.

The `TYPE` is the type to which you wish to cast your named field. `int`, `long`, `double`, `float` and `boolean` are the supported types for coercion.
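
To make the three reference forms concrete, here is a minimal Python sketch that splits a reference into its `SYNTAX`, `SEMANTIC`, and `TYPE` parts and applies a coercion. The `parse_reference` and `coerce` helpers are hypothetical names introduced for this example. Mapping `long` to Python's `int` and `double` to `float`, and treating only the string `true` as a true boolean, are assumptions of the sketch and may not match the processor's actual coercion rules.

```python
import re

# Matches %{SYNTAX}, %{SYNTAX:SEMANTIC} and %{SYNTAX:SEMANTIC:TYPE}.
REFERENCE = re.compile(
    r"%\{(?P<syntax>\w+)(?::(?P<semantic>[^:}]+))?(?::(?P<type>\w+))?\}"
)

# Assumed mapping from grok type names to Python conversions.
CASTS = {
    "int": int,
    "long": int,       # Python has a single integer type
    "float": float,
    "double": float,   # Python floats are already double precision
    "boolean": lambda text: text.lower() == "true",
}


def parse_reference(reference: str):
    """Return (syntax, semantic, type); missing parts are None."""
    match = REFERENCE.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a grok reference: {reference!r}")
    return match.group("syntax"), match.group("semantic"), match.group("type")


def coerce(value: str, type_name):
    """Cast the matched text, or leave it as a string when no TYPE is given."""
    return value if type_name is None else CASTS[type_name](value)
```

For example, `parse_reference("%{NUMBER:duration:float}")` yields the three parts separately, and `coerce("3.44", "float")` turns the matched text into a number, while a reference without a `TYPE` leaves the value as a string.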

For example, you might want to match the following text:

[source,txt]
--------------------------------------------------
3.44 55.3.244.1
--------------------------------------------------

You may know that the message in the example is a number followed by an IP address. You can match this text by using the following
Grok expression.

[source,txt]
--------------------------------------------------
%{NUMBER:duration} %{IP:client}
--------------------------------------------------
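
The effect of this expression can be approximated in a few lines of Python by translating each `%{SYNTAX:SEMANTIC}` reference into a named capture group. This is only a sketch: the `NUMBER` and `IP` definitions below are deliberately simplified stand-ins rather than the patterns bundled with the processor, and `grok_to_regex` is a hypothetical helper written for this example.

```python
import re

# Simplified stand-ins for illustration; the bundled NUMBER and IP
# patterns are more thorough than these.
PATTERNS = {
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
}


def grok_to_regex(expression: str) -> re.Pattern:
    """Translate %{SYNTAX} and %{SYNTAX:SEMANTIC} into a compiled regex."""
    def substitute(match: re.Match) -> str:
        syntax, semantic = match.group(1), match.group(2)
        body = PATTERNS[syntax]
        # A SEMANTIC becomes a named group; a bare SYNTAX only matches.
        return f"(?P<{semantic}>{body})" if semantic else f"(?:{body})"

    return re.compile(re.sub(r"%\{(\w+)(?::(\w+))?\}", substitute, expression))


regex = grok_to_regex("%{NUMBER:duration} %{IP:client}")
fields = regex.fullmatch("3.44 55.3.244.1").groupdict()
print(fields)  # {'duration': '3.44', 'client': '55.3.244.1'}
```

The resulting dictionary mirrors the fields the expression is meant to extract: `duration` holds the number and `client` holds the IP address.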

[[using-grok]]
==== Using the Grok Processor in a Pipeline
