From 2d6a13406655e4ba6bf86ce86e5c537d27fba6cb Mon Sep 17 00:00:00 2001
From: Karen Metts
Date: Tue, 3 Jul 2018 16:40:29 -0400
Subject: [PATCH] Phase 2 rework

---
 docs/static/azure-module.asciidoc | 630 ++++++++++++++++++++++--------
 1 file changed, 468 insertions(+), 162 deletions(-)

[role="xpack"]
[[azure-module]]
=== Azure Module [Experimental]

experimental[]

The https://azure.microsoft.com/en-us/overview/what-is-azure/[Microsoft Azure]
module in Logstash helps you easily integrate your Azure activity logs and SQL
diagnostic logs with the Elastic Stack.

NOTE: The Logstash Azure module is an
https://www.elastic.co/products/x-pack[{xpack}] feature under the Basic License
and is therefore free to use. Please contact
mailto:monitor-azure@elastic.co[monitor-azure@elastic.co] for questions or more
information.

The Azure module uses the
{logstash-ref}/plugins-inputs-azure_event_hubs.html[Logstash Azure Event Hubs
plugin] to consume data from Azure Event Hubs. The module taps directly into the
Azure dashboard, parses and indexes events into Elasticsearch, and installs a
suite of {kib} dashboards to help you start exploring your data immediately.

[[azure-dashboards]]
==== Dashboards

These {kib} dashboards are available and ready for you to use:

* *Overview*. Top-level view into your Azure operations, including info about
users, resource groups, service health, access, activities, and alerts.

* *Alerts*. Alert info, including activity, alert status (activated, resolved,
succeeded), and an alerts heatmap.

* *SQL DB Overview*. Top-level view into your SQL databases, including counts
for databases, servers, resource groups, and subscriptions.

* *SQL DB Database View*. Detailed info about each SQL database, including wait
time, errors, DTU and storage utilization, size, and read and write
input/output.

You can use the dashboards as they are, or tailor them to meet your needs.

==== Azure_event_hubs plugin

The Azure module uses the `azure_event_hubs` plugin. Basic understanding of the
plugin is helpful when you set up the Azure module. See the
{logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs plugin
documentation] for more information about configurations and options.
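For orientation, here is a minimal sketch of the plugin used as a standalone
Logstash input. The connection string is a placeholder, and the pipeline is for
illustration only; the module builds a comparable input for you from its
`var.input.azure_event_hubs.*` settings, so you do not write this yourself.

[source,ruby]
----
input {
  azure_event_hubs {
    # Placeholder connection string; replace with a real Event Hub connection.
    event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
    # Use a consumer group dedicated to Logstash (see the consumer_group option below).
    consumer_group => "logstash"
  }
}
----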
[[azure-prereqs]]
==== Prerequisites

These instructions assume that Logstash, Elasticsearch, and Kibana are
installed and running. The products are
https://www.elastic.co/downloads[available to download] and easy to install.

The Elastic Stack 6.4 (or later) is required for this module.

NOTE: These instructions assume that Logstash, Elasticsearch, and Kibana run
locally, but you can also run Elasticsearch, Kibana, and Logstash on separate
hosts to consume data from Azure.

[[azure-setup]]
==== Installation and setup

To get started with the Azure module:

. Install the
{logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs plugin].
. Set up the Azure module.

[[azure-plugin-setup]]
===== Install the plugin

From the Logstash directory, run this command to install the plugin:

["source","shell"]
-----
bin/logstash-plugin install logstash-input-azure_event_hubs
-----

[[azure-module-setup]]
===== Set up the module

From the Logstash directory, modify this command for your environment, and run
it:

["source","shell",subs="attributes"]
-----
bin/logstash --modules azure --setup \
-M "azure.var.kibana.host={hostname}"
-----

The `--modules azure` option starts a Logstash pipeline for ingestion from Azure
Event Hubs. The `--setup` option creates an `azure-*` index pattern in
Elasticsearch and imports Kibana dashboards and visualizations.

NOTE: The `--setup` option is intended only for first-time setup. If you
include `--setup` on subsequent runs, your existing Kibana dashboards will be
overwritten.

[[configuring-azure]]
==== Configure the module

You can specify <<azure_config_options,settings>> for the Logstash Azure module
in the `logstash.yml` configuration file or with overrides through the command
line.

The azure_event_hubs plugin and the Azure module support two configuration
models: basic and advanced. *Basic configuration* is the default, and accepts
input from multiple Event Hubs that share a common configuration.

*Advanced configuration* is available for deployments where different Event
Hubs require different configurations. Advanced configuration is not necessary
or recommended for most use cases.

See the {logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs
plugin documentation] for more information about the basic and advanced
configuration models.
===== Basic configuration samples

With basic configuration, all Event Hubs share the same configuration. Here is
a sample `logstash.yml` configuration:

["source","yaml"]
-----
modules:
  - name: azure
    var.elasticsearch.hosts: "localhost:9200"
    var.kibana.host: "localhost:5601"
    var.input.azure_event_hubs.threads: 8
    var.input.azure_event_hubs.decorate_events: true
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=example...."
    var.input.azure_event_hubs.event_hub_connections:
      - "Endpoint=sb://example1...EntityPath=insights-logs-errors"
      - "Endpoint=sb://example2...EntityPath=insights-metrics-pt1m"
-----

**Command line for basic configuration**

You can use the command line to set up the basic configuration for a single
Event Hub:

["source","shell"]
-----
bin/logstash --modules azure \
-M "azure.var.elasticsearch.host=es.mycloud.com" \
-M "azure.var.input.azure_event_hubs.threads=8" \
-M "azure.var.input.azure_event_hubs.consumer_group=logstash" \
-M "azure.var.input.azure_event_hubs.decorate_events=true" \
-M "azure.var.input.azure_event_hubs.event_hub_connections=Endpoint=sb://example1...EntityPath=insights-logs-errors" \
-M "azure.var.input.azure_event_hubs.storage_connection=DefaultEndpointsProtocol=https;AccountName=example...."
-----
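To relate these module settings to the plugin documentation, here is a rough
sketch of the plugin configuration that corresponds to the basic samples above.
It is shown for illustration only, with the same placeholder connection
strings; it is not a verbatim dump of what the module generates internally.

[source,ruby]
----
input {
  azure_event_hubs {
    config_mode => "basic"
    threads => 8
    decorate_events => true
    consumer_group => "logstash"
    storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
    event_hub_connections => ["Endpoint=sb://example1...EntityPath=insights-logs-errors",
                              "Endpoint=sb://example2...EntityPath=insights-metrics-pt1m"]
  }
}
----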
===== Advanced configuration sample

Advanced configuration supports Event Hub-specific options. It is not necessary
or recommended for most use cases. Use it only if it is required for your
deployment scenario.

You must define the `header` array with `name` in the first position. You can
define the other options in any order. The per Event Hub configuration takes
precedence; any value not defined per Event Hub uses the global value.

In this example, `consumer_group` is applied to each of the configured Event
Hubs. Note that `decorate_events` is defined in both the global and the per
Event Hub configuration. The per Event Hub configuration takes precedence, and
the global configuration is effectively ignored.

["source","yaml"]
-----
modules:
  - name: azure
    var.elasticsearch.hosts: "localhost:9200"
    var.kibana.host: "localhost:5601"
    var.input.azure_event_hubs.threads: 8
    var.input.azure_event_hubs.decorate_events: true
    var.input.azure_event_hubs.consumer_group: logstash
    var.input.azure_event_hubs.event_hubs:
      - ["name", "event_hub_connection", "storage_connection", "initial_position", "decorate_events"]
      - ["insights-operational-logs", "Endpoint=sb://example1...", "DefaultEndpointsProtocol=https;AccountName=example1....", "beginning", "true"]
      - ["insights-metrics-pt1m", "Endpoint=sb://example2...", "DefaultEndpointsProtocol=https;AccountName=example2....", "end", "true"]
      - ["insights-logs-errors", "Endpoint=sb://example3...", "DefaultEndpointsProtocol=https;AccountName=example3....", "end", "false"]
      - ["insights-operational-logs", "Endpoint=sb://example4...", "DefaultEndpointsProtocol=https;AccountName=example4....", "beginning", "true"]
-----

[[azure_config_options]]
===== Configuration options

NOTE: All Event Hubs options are common to both basic and advanced
configurations, with the following exceptions. The basic configuration uses
`event_hub_connections`. The advanced configuration uses `event_hubs` and
`event_hub_connection`.

[id="plugins-{type}s-{plugin}-config_mode"]
===== `config_mode`
* Value type is <<string,string>>
* Valid entries are `basic` or `advanced`
* Default value is `basic`

Sets the configuration model. This sample uses the default basic mode:

[source,ruby]
----
azure_event_hubs {
  config_mode => "basic"
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1", "Endpoint=sb://example2...;EntityPath=event_hub_name2"]
}
----
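For comparison, here is a minimal sketch of the advanced mode. The Event Hub
name and connection string are placeholders; see the `event_hubs` option below
for the full structure.

[source,ruby]
----
azure_event_hubs {
  config_mode => "advanced"
  event_hubs => [
    { "event_hub_name1" => {
      event_hub_connection => "Endpoint=sb://example1..."
    }}
  ]
}
----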
[id="plugins-{type}s-{plugin}-event_hubs"]
===== `event_hubs`
* Value type is <<array,array>>
* No default value
* Ignored for basic configuration
* Required for advanced configuration

Defines the Event Hubs to be read: an array of hashes, where each entry is a
hash of an Event Hub name and its configuration options.

[source,ruby]
----
azure_event_hubs {
  config_mode => "advanced"
  event_hubs => [
    { "event_hub_name1" => {
      event_hub_connection => "Endpoint=sb://example1..."
    }},
    { "event_hub_name2" => {
      event_hub_connection => "Endpoint=sb://example2..."
      storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
      storage_container => "my_container"
    }}
  ]
  consumer_group => "logstash" # shared across all Event Hubs
}
----

[id="plugins-{type}s-{plugin}-event_hub_connections"]
===== `event_hub_connections`
* Value type is <<array,array>>
* No default value
* Required for basic configuration

List of connection strings that identify the Event Hubs to be read. Connection
strings include the EntityPath for the Event Hub.

The `event_hub_connections` option is defined per Event Hub. All other
configuration options are shared among Event Hubs.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1", "Endpoint=sb://example2...;EntityPath=event_hub_name2"]
}
----

[id="plugins-{type}s-{plugin}-event_hub_connection"]
===== `event_hub_connection`
* Value type is <<string,string>>
* No default value
* Valid only for advanced configuration

Connection string that identifies the Event Hub to be read. Advanced
configuration options can be set per Event Hub. This option is nested under the
Event Hub name (see the sample), and it accepts only one connection string.

[source,ruby]
----
azure_event_hubs {
  config_mode => "advanced"
  event_hubs => [
    { "event_hub_name1" => {
      event_hub_connection => "Endpoint=sb://example1...;EntityPath=event_hub_name1"
    }}
  ]
}
----

[id="plugins-{type}s-{plugin}-checkpoint_interval"]
===== `checkpoint_interval`
* Value type is <<number,number>>
* Default value is `5` seconds
* Set to `0` to disable

Interval in seconds to write checkpoints during batch processing. Checkpoints
tell {ls} where to resume processing after a restart. Checkpoints are
automatically written at the end of each batch, regardless of this setting.

Writing checkpoints too frequently can slow down processing unnecessarily.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  checkpoint_interval => 5
}
----

[id="plugins-{type}s-{plugin}-consumer_group"]
===== `consumer_group`
* Value type is <<string,string>>
* Default value is `$Default`

Consumer group used to read the Event Hub(s). Create a consumer group
specifically for Logstash, and then ensure that all instances of Logstash use
it so that they can work together properly.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  consumer_group => "logstash"
}
----

[id="plugins-{type}s-{plugin}-decorate_events"]
===== `decorate_events`
* Value type is <<boolean,boolean>>
* Default value is `false`

Adds metadata about the Event Hub, including Event Hub name, consumer_group,
processor_host, partition, offset, sequence, timestamp, and event_size.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  decorate_events => true
}
----

[id="plugins-{type}s-{plugin}-initial_position"]
===== `initial_position`
* Value type is <<string,string>>
* Valid arguments are `beginning`, `end`, `look_back`
* Default value is `beginning`

When first reading from an Event Hub, start from this position:

* `beginning` reads all pre-existing events in the Event Hub.
* `end` does not read any pre-existing events in the Event Hub.
* `look_back` reads `end` minus a number of seconds worth of pre-existing
events. You control the number of seconds using the
`initial_position_look_back` option.

NOTE: If `storage_connection` is set, the `initial_position` value is used only
the first time Logstash reads from the Event Hub.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  initial_position => "beginning"
}
----

[id="plugins-{type}s-{plugin}-initial_position_look_back"]
===== `initial_position_look_back`
* Value type is <<number,number>>
* Default value is `86400`
* Used only if `initial_position` is set to `look_back`

Number of seconds to look back to find the initial position for pre-existing
events. This option is used only if `initial_position` is set to `look_back`.
If `storage_connection` is set, this configuration applies only the first time
{ls} reads from the Event Hub.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  initial_position => "look_back"
  initial_position_look_back => 86400
}
----
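The resume options interact with `storage_connection` (described below): when
blob storage is configured, `initial_position` and `initial_position_look_back`
apply only to the first run, and persisted offsets take over afterward. A
sketch, with placeholder connection strings:

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  # Used only on the very first run; afterward, offsets persisted in blob
  # storage determine where processing resumes.
  initial_position => "look_back"
  initial_position_look_back => 86400
  storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
}
----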
[id="plugins-{type}s-{plugin}-max_batch_size"]
===== `max_batch_size`
* Value type is <<number,number>>
* Default value is `125`

Maximum number of events retrieved and processed together. A checkpoint is
created after each batch. Increasing this value may help with performance, but
requires more memory.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  max_batch_size => 125
}
----

[id="plugins-{type}s-{plugin}-storage_connection"]
===== `storage_connection`
* Value type is <<string,string>>
* No default value

Connection string for blob account storage. Blob account storage persists the
offsets between restarts and ensures that multiple instances of Logstash
process different partitions. When this value is set, restarts resume where
processing left off. When this value is not set, the `initial_position` value
is used on every restart.

We strongly recommend that you define this value for production environments.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
}
----

[id="plugins-{type}s-{plugin}-storage_container"]
===== `storage_container`
* Value type is <<string,string>>
* Defaults to the Event Hub name if not defined

Name of the storage container used to persist offsets and allow multiple
instances of {ls} to work together.

[source,ruby]
----
azure_event_hubs {
  event_hub_connections => ["Endpoint=sb://example1...;EntityPath=event_hub_name1"]
  storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
  storage_container => "my_container"
}
----

To avoid overwriting offsets, you can use different storage containers. This is
particularly important if you are monitoring two Event Hubs with the same name.
You can use the advanced configuration model to configure different storage
containers:

[source,ruby]
----
azure_event_hubs {
  config_mode => "advanced"
  consumer_group => "logstash"
  storage_connection => "DefaultEndpointsProtocol=https;AccountName=example...."
  event_hubs => [
    {"insights-operational-logs" => {
      event_hub_connection => "Endpoint=sb://example1..."
      storage_container => "insights-operational-logs-1"
    }},
    {"insights-operational-logs" => {
      event_hub_connection => "Endpoint=sb://example2..."
      storage_container => "insights-operational-logs-2"
    }}
  ]
}
----

[id="plugins-{type}s-{plugin}-threads"]
===== `threads`
* Value type is <<number,number>>
* Minimum value is `2`
* Default value is `4`

Total number of threads used to process events. The value you set here applies
to all Event Hubs. Even with advanced configuration, this value is a global
setting and can't be set per Event Hub.

[source,ruby]
----
azure_event_hubs {
  threads => 4
}
----

include::shared-module-options.asciidoc[]

[[run-azure]]
==== Start the module

To start the Azure module, run `bin/logstash --modules azure` from the Logstash
directory. (Omit `--setup` after the first run so that your existing Kibana
dashboards are not overwritten.)

To explore your data with Kibana:

. Open Kibana in your browser.
. Select the dashboard you want to see.
==== Azure module schema

This module reads data from the Azure Event Hub and adds some additional
structure to the data for Activity Logs and SQL Diagnostics. The original data
is always preserved, and any data added or parsed is namespaced under `azure`.
For example, `azure.subscription` may have been parsed from a longer, more
complex URN.

[cols="<,<,<",options="header",]
|=======================================================================
|Name |Description |Notes

|azure.subscription |Azure subscription from which this data originates. |Some
Activity Log events may not be associated with a subscription.

|azure.group |Primary type of data. |Current values are either `activity_log`
or `sql_diagnostics`.

|azure.category* |Secondary type of data specific to the group from which the
data originated. |

|azure.provider |Azure provider. |

|azure.resource_group |Azure resource group. |

|azure.resource_type |Azure resource type. |

|azure.resource_name |Azure resource name. |

|azure.database |Azure database name, for display purposes. |SQL Diagnostics
only

|azure.db_unique_id |Azure database name that is guaranteed to be unique. |SQL
Diagnostics only

|azure.server |Azure server for the database. |SQL Diagnostics only

|azure.server_and_database |Azure server and database combined. |SQL
Diagnostics only

|azure.metadata |Any @metadata added by the plugins, for example when
`var.input.azure_event_hubs.decorate_events: true` is set. |
|=======================================================================

Notes:

* Activity Logs can have the following categories: "Administrative",
"ServiceHealth", "Alert", "Autoscale", "Security".
* SQL Diagnostics can have the following categories: "Metric", "Blocks",
"Errors", "Timeouts", "QueryStoreRuntimeStatistics",
"QueryStoreWaitStatistics", "DatabaseWaitStatistics", "SQLInsights".

Microsoft documents the Activity Log schema
https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-activity-log-schema[here],
and the SQL Diagnostics data
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-metrics-diag-logging[here].
Elastic does not own these data models and, as such, cannot make any assurances
about the accuracy or stability of the information.

===== Special note - Properties field

Many of the logs contain a `properties` top-level field. This is often where
the most interesting data lives. There is no fixed schema for the properties
field across log types from different sources, which can cause mapping errors
when the data is shipped to Elasticsearch.
-*Azure Module Options* +For example, one log may have +properties.type where one log sets this a String type and another sets this an +Integer type. To avoid mapping errors, the original properties field is moved to +__properties.. +For example +properties.type may end up as sql_diagnostics_Errors_properties.type or +activity_log_Security_properties.type depending on the group/category from where +the event originated. -All `var.input.azureeventhub.*` options are documented in the https://github.com/Azure/azure-diagnostics-tools/tree/master/Logstash/logstash-input-azureeventhub[Event Hub plugin]. -*`var.inputs`*:: -+ --- -* Should prev be `var.input`* or is plural correct? Verify the default. I guessed. -* Value type is <> -* Default value is "azureeventhub" --- -+ --- -Set the input(s) to expose for the Logstash Azure module. Valid settings are -"TBD". --- - -*`var.input.azureeventhub.eventhub`*:: -+ --- -* Value type is <> -* Default value is "localhost:39092" --- -+ --- -Event hub name. --- - -*`var.input.azureeventhub.key`*:: -+ --- -* Value type is -* Default value is --- -+ --- -TBD: Add description --- -*`var.input.azureeventhub.username`*:: -+ --- -* Value type is -* Default value is --- -+ -Name of the shared access policy. - -*`var.input.azureeventhub.namespace`*:: -+ --- -* Value type is -* Default value is --- -+ -TBD: Add description - -*`var.input.azureeventhub.partitions`*:: -+ --- -* Value type is -* Default value is --- -+ -Partition count of the target hub. - -TBD: Look at list of shared module options. Doc implies that all are available -for every module. Is that true? -include::shared-module-options.asciidoc[] +==== Testing -[[azure-production]] -==== Deploying the module in production +Testing modules is easiest with Docker and Docker compose to stand up instances of Elasticsearch and Kibana. Below is a Docker compose file that can be used for quick testing. + +[source,shell] +---- +version: '3' + +# docker-compose up --force-recreate + +services: -TBD: Can we break demo and deployment out in this way? + elasticsearch: + image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4 + ports: + - "9200:9200" + - "9300:9300" + environment: + ES_JAVA_OPTS: "-Xmx512m -Xms512m" + discovery.type: "single-node" + networks: + - ek -Use SSL security. + kibana: + image: docker.elastic.co/kibana/kibana:6.2.4 + ports: + - "5601:5601" + networks: + - ek + depends_on: + - elasticsearch + +networks: + ek: + driver: bridge +---- + +[[azure-production]] +==== Deploying the module in production +Use security best practices to secure your configuration. +See {stack-ov}/xpack-security.html for details and recommendations. -:username!: -:hostname!: -:event_hub_name!: -:event_hub_key!: -:event_hub_username!: -:event_hub_namespace!: -:partitions!: \ No newline at end of file