From 7af25f48074ab5028ea23885fb104011d8d1c8c1 Mon Sep 17 00:00:00 2001 From: dutzu Date: Thu, 4 Feb 2016 11:36:37 +0200 Subject: [PATCH 1/4] rebuild pages From 02f45f662f76ac638e693b5983bf8b7487055d91 Mon Sep 17 00:00:00 2001 From: dutzu Date: Thu, 4 Feb 2016 12:34:46 +0200 Subject: [PATCH 2/4] test code syntax highlighting --- _posts/2016-01-11-log-aggregation.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/_posts/2016-01-11-log-aggregation.md b/_posts/2016-01-11-log-aggregation.md index fd759ecb7c26a..0477b36cb03ac 100644 --- a/_posts/2016-01-11-log-aggregation.md +++ b/_posts/2016-01-11-log-aggregation.md @@ -73,16 +73,17 @@ This is a pain because if you want to properly visualize a set of log messages g Let's take a look at what fluentd sends to Elasticsearch. Here is a sample log file with 2 log messages: -~~~java +~~~ 2015-11-12 06:34:01,471 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request === 2015-11-12 06:34:01,473 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1 ~~~ +{: .language-java} A message sent to Elasticsearch from fluentd would contain these values: *-this isn't the exact message, this is the result of the stdout output plugin-* -~~~ +~~~ java 2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","time_as_string":"2015-11-12 06:34:01 -0800"} 2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","time_as_string":"2015-11-12 06:34:01 -0800"} @@ -98,7 +99,7 @@ In order to build it yourself you only need the `record_transformer` filter that Next you need to parse the timestamp of your logs into separate date, time and millisecond components (which is basically what the better-timestamp plugin asks you to do, to some extent), and then to create a filter that would match all the messages you will send to Elasticsearch and to create the `@timestamp` value by appending the 3 components. This makes use of the fact that fluentd also allows you to run ruby code within your record_transformer filters to accommodate for more special log manipulation tasks. -~~~ +~~~xml type record_transformer enable_ruby true @@ -111,7 +112,7 @@ Next you need to parse the timestamp of your logs into separate date, time and m The result is that the above sample will come out like this: -~~~ +~~~java 2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"471","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","@timestamp":"2015-11-12T06:34:01.471Z"} 2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"473","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","@timestamp":"2015-11-12T06:34:01.473Z"} ~~~ @@ -136,7 +137,7 @@ For instance, by using the record_transformer I would send the hostname and also Using this example configuration I tried to create a pie chart showing the number of messages per project for a dashboard. Here is what I got. -~~~ +~~~ xml type record_transformer enable_ruby true @@ -150,7 +151,7 @@ Using this example configuration I tried to create a pie chart showing the numbe Sample output from stdout: -~~~ +~~~ java 2015-12-12 06:01:35 -0800 clear: {"date_string":"2015-10-15","time_string":"06:37:32","msec":"415","message":"[amelJettyClient(0xdc64419)-706] jetty:test/test INFO totallyAnonymousContent: http://whyAreYouReadingThis?:)/history/3374425?limit=1","@timestamp":"2015-10-15T06:37:32.415Z","sourceProject":"Test-Analyzed-Field"} ~~~ @@ -169,7 +170,7 @@ And the solution is: When Elasticsearch creates a new index, it will rely on the And what you basically need to do is to do a curl put with that json content to ES and then all the indices created that are prefixed with `logstash-*` will use that template. Be aware that with the fluent-plugin-elasticsearch you can specify your own index prefix so make sure to adjust the template to match your prefix: -~~~ +~~~ java curl -XPUT localhost:9200/_template/template_doru -d '{ "template" : "logstash-*", "settings" : {.... @@ -200,4 +201,4 @@ The `not_analyzed` suffixed field is the one you can safely use in visualization # Have fun So, now you know what we went through here at [HaufeDev](http://haufe-lexware.github.io/) and what problems we faced and how we can overcome them. -If you want to give it a try you can take a look at [our docker templates on github](https://github.com/Haufe-Lexware/docker-templates), there you will find a [logaggregation template](https://github.com/Haufe-Lexware/docker-templates/tree/master/logaggregation) for an EFK setup + a shipper that can transfer messages securely to the EFK solution and you can have it up and running in a matter of minutes. +If you want to give it a try you can take a look at [our docker templates on github](https://github.com/Haufe-Lexware/docker-templates), there you will find a [logaggregation template](https://github.com/Haufe-Lexware/docker-templates/tree/master/logaggregation) for an EFK setup + a shipper that can transfer messages securely to the EFK solution and you can have it up and running in a matter of minutes. From 91628ac3bb5e74bd662e5eb0b1ade25ed7bcc10e Mon Sep 17 00:00:00 2001 From: dutzu Date: Thu, 4 Feb 2016 12:38:41 +0200 Subject: [PATCH 3/4] added language IAL for syntax highlighting --- _posts/2016-01-11-log-aggregation.md | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/_posts/2016-01-11-log-aggregation.md b/_posts/2016-01-11-log-aggregation.md index 0477b36cb03ac..c1dc6c85c157a 100644 --- a/_posts/2016-01-11-log-aggregation.md +++ b/_posts/2016-01-11-log-aggregation.md @@ -83,11 +83,12 @@ A message sent to Elasticsearch from fluentd would contain these values: *-this isn't the exact message, this is the result of the stdout output plugin-* -~~~ java +~~~ 2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","time_as_string":"2015-11-12 06:34:01 -0800"} 2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","time_as_string":"2015-11-12 06:34:01 -0800"} ~~~ +{: .language-java} I added the `time_as_string` field in there just so you can see the literal string that is sent as the time value. @@ -99,7 +100,7 @@ In order to build it yourself you only need the `record_transformer` filter that Next you need to parse the timestamp of your logs into separate date, time and millisecond components (which is basically what the better-timestamp plugin asks you to do, to some extent), and then to create a filter that would match all the messages you will send to Elasticsearch and to create the `@timestamp` value by appending the 3 components. This makes use of the fact that fluentd also allows you to run ruby code within your record_transformer filters to accommodate for more special log manipulation tasks. -~~~xml +~~~ type record_transformer enable_ruby true @@ -108,15 +109,15 @@ Next you need to parse the timestamp of your logs into separate date, time and m ~~~ - +{: .language-xml} The result is that the above sample will come out like this: -~~~java +~~~ 2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"471","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","@timestamp":"2015-11-12T06:34:01.471Z"} 2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"473","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","@timestamp":"2015-11-12T06:34:01.473Z"} ~~~ - +{: .language-java} *__Note__: you can use the same record_transformer filter to remove the 3 separate time components after creating the `@timestamp` field via the `remove_keys` option.* ### Do not analyse @@ -137,7 +138,7 @@ For instance, by using the record_transformer I would send the hostname and also Using this example configuration I tried to create a pie chart showing the number of messages per project for a dashboard. Here is what I got. -~~~ xml +~~~ type record_transformer enable_ruby true @@ -147,14 +148,14 @@ Using this example configuration I tried to create a pie chart showing the numbe ~~~ - +{: .language-xml} Sample output from stdout: -~~~ java +~~~ 2015-12-12 06:01:35 -0800 clear: {"date_string":"2015-10-15","time_string":"06:37:32","msec":"415","message":"[amelJettyClient(0xdc64419)-706] jetty:test/test INFO totallyAnonymousContent: http://whyAreYouReadingThis?:)/history/3374425?limit=1","@timestamp":"2015-10-15T06:37:32.415Z","sourceProject":"Test-Analyzed-Field"} ~~~ - +{: .language-java} And here is the result of trying to use it in a visualization: {:.center} @@ -170,17 +171,17 @@ And the solution is: When Elasticsearch creates a new index, it will rely on the And what you basically need to do is to do a curl put with that json content to ES and then all the indices created that are prefixed with `logstash-*` will use that template. Be aware that with the fluent-plugin-elasticsearch you can specify your own index prefix so make sure to adjust the template to match your prefix: -~~~ java +~~~ curl -XPUT localhost:9200/_template/template_doru -d '{ "template" : "logstash-*", "settings" : {.... }' ~~~ - +{: .language-bash} The main thing to note in the whole template is this section: -~~~ json +~~~ "string_fields" : { "match" : "*", "match_mapping_type" : "string", @@ -193,7 +194,7 @@ The main thing to note in the whole template is this section: } } ~~~ - +{: .language-json} This tells Elasticsearch that for any field of type string that it receives it should create a mapping of type string that is analyzed + another field that adds a `.raw` suffix that will not be analyzed. The `not_analyzed` suffixed field is the one you can safely use in visualizations, but do keep in mind that this creates the scenario mentioned before where you can have up to 40% inflation in storage requirements because you will have both analyzed and not_analyzed fields in store. From cd5ee611df3b1e5e1557bf6d7b61614ef18ddfee Mon Sep 17 00:00:00 2001 From: dutzu Date: Thu, 4 Feb 2016 12:46:28 +0200 Subject: [PATCH 4/4] Code block syntax highlighting --- _posts/2016-01-18-fluentd-log-parsing.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/_posts/2016-01-18-fluentd-log-parsing.md b/_posts/2016-01-18-fluentd-log-parsing.md index 8dc44fc88a03a..f49f88192ddbc 100644 --- a/_posts/2016-01-18-fluentd-log-parsing.md +++ b/_posts/2016-01-18-fluentd-log-parsing.md @@ -26,7 +26,7 @@ The simplest approach is to just parse all messages using the common denominator In the case of a typical log file a configuration can be something like this (but not necessarily): -~~~ xml +~~~ type tail path /var/log/test.log @@ -39,7 +39,7 @@ In the case of a typical log file a configuration can be something like this (bu format1 /(?