From 85d0329e988a5221cd84cae64b0e86815b9e3169 Mon Sep 17 00:00:00 2001 From: Karen Metts Date: Wed, 2 May 2018 10:16:55 -0400 Subject: [PATCH] Add azure module doc --- docs/index.asciidoc | 3 + docs/static/azure-module.asciidoc | 461 ++++++++++++++++++++++++++++++ 2 files changed, 464 insertions(+) create mode 100644 docs/static/azure-module.asciidoc diff --git a/docs/index.asciidoc b/docs/index.asciidoc index e8281cfd2c4..25352ba5866 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -156,6 +156,9 @@ include::static/arcsight-module.asciidoc[] :edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/netflow-module.asciidoc include::static/netflow-module.asciidoc[] +:edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/azure-module.asciidoc +include::static/azure-module.asciidoc[] + // Working with Filebeat Modules :edit_url: https://github.com/elastic/logstash/edit/{branch}/docs/static/filebeat-modules.asciidoc diff --git a/docs/static/azure-module.asciidoc b/docs/static/azure-module.asciidoc new file mode 100644 index 00000000000..778cef23417 --- /dev/null +++ b/docs/static/azure-module.asciidoc @@ -0,0 +1,461 @@ +[role="xpack"] +[[azure-module]] +=== Azure Module [Experimental] +experimental[] + +The https://azure.microsoft.com/en-us/overview/what-is-azure/[Microsoft Azure] +module in Logstash helps you easily integrate your Azure activity logs and SQL +diagnostic logs with the Elastic Stack. + +You can monitor your Azure cloud environments and SQL DB deployments with +deep operational insights across multiple Azure subscriptions. You can explore +the health of your infrastructure in real-time, accelerating root cause analysis +and decreasing overall time to resolution. The Azure module helps you: + +* Analyze infrastructure changes and authorization activity +* Identify suspicious behaviors and potential malicious actors +* Perform root-cause analysis by investigating user activity +* Monitor and optimize your SQL DB deployments. + +NOTE: The Logstash Azure module is an +https://www.elastic.co/products/x-pack[{xpack}] feature under the Basic License +and is therefore free to use. Please contact +mailto:monitor-azure@elastic.co[monitor-azure@elastic.co] for questions or more +information. + +The Azure module uses the +{logstash-ref}/plugins-inputs-azure_event_hubs.html[Logstash Azure Event Hubs +plugin] to consume data from Azure Event Hubs. The module taps directly into the +Azure dashboard, parses and indexes events into Elasticsearch, and installs a +suite of {kib} dashboards to help you start exploring your data immediately. + +[[azure-dashboards]] +==== Dashboards + +These {kib} dashboards are available and ready for you to use. You can use them as they are, or tailor them to meet your needs. + +===== Infrastructure activity monitoring + +* *Overview*. Top-level view into your Azure operations, including info about users, resource groups, service health, access, activities, and alerts. + +* *Alerts*. Alert info, including activity, alert status, and alerts heatmap + +* *User Activity*. Info about system users, their activity, and requests. + +===== SQL Database monitoring + +* *SQL DB Overview*. Top-level view into your SQL Databases, including counts for databases, servers, resource groups, and subscriptions. + +* *SQL DB Database View*. Detailed info about each SQL Database, including wait time, errors, DTU and storage utilization, size, and read and write input/output. + +* *SQL DB Queries*. Info about SQL Database queries and performance. + +==== Azure_event_hubs plugin + +The Azure module uses the `azure_event_hubs` plugin. Basic understanding of the +plugin and options is helpful when you set up the Azure module. See +{logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs plugin +documentation] for more information about configurations and options. + +[[azure-module-prereqs]] +==== Prerequisites + +Azure Monitor enabled with Azure Event Hubs and the Elastic Stack are required +for this module. + +[[azure-elk-prereqs]] +===== Elastic prereqs + +Logstash, Elasticsearch, and Kibana should be installed and running. The +products are https://www.elastic.co/downloads[available to download] and easy to +install. + +The Elastic Stack version 6.4 (or later) is required for this module. + +NOTE: Logstash, Elasticsearch, and Kibana must run locally. You can also run +Elasticsearch, Kibana and Logstash on separate hosts to consume data from Azure. + +[[azure-prereqs]] +===== Azure prereqs + +Azure Monitor should be configured to stream logs to one or more Event Hubs. +See <> at the end of this topic for links to Microsoft Azure documentation. + + +[[azure-module-setup]] +==== Set up the module + +Modify this command for your environment, and run it from the Logstash +directory. + +["source","shell",subs="attributes"] +----- +bin/logstash --modules azure --setup \ + -M "azure.var.elasticsearch.username={username}" \ + -M "azure.var.elasticsearch.password={pwd}" \ + -M "azure.var.kibana.username={username}" \ + -M "azure.var.kibana.password={pwd}" \ + -M "azure.var.elasticsearch.hosts={hostname}" \ + -M "azure.var.kibana.host={hostname}" +----- + +The `--modules azure` option starts a Logstash pipeline for ingestion from Azure +Event Hubs. The `--setup` option creates an `azure-*` index pattern in +Elasticsearch and imports Kibana dashboards and visualizations. + +NOTE: The `--setup` option is intended only for first-time setup. If you include +`--setup` on subsequent runs, your existing Kibana dashboards will be +overwritten. + + +[[configuring-azure]] +==== Configure the module + +You can specify <> for the Logstash Azure module in the +`logstash.yml` configuration file or with overrides through the command line. + +* *Command line configuration.* You can pass configuration options and run the module from the command line. +The command line configuration uses the configuration options you provide and supports only one Event Hub. + +* *Basic configuration.* You can use the `logstash.yml` file to configure inputs from multiple Event Hubs that share the same configuration. + +* *Advanced configuration.* The advanced configuration is available for deployments where different Event Hubs +require different configurations. The `logstash.yml` file holds your settings. Advanced configuration is not necessary or +recommended for most use cases. + +See {logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs plugin +documentation] for more information about basic and advanced configuration +models. + +[[command-line-sample]] +===== Command line configuration sample + +You can use the command line to set up the basic configuration for a single +Event Hub. This command starts the Azure module with command line arguments, +bypassing any settings in `logstash.yml`. + +["source","shell",subs="attributes"] +----- +bin/logstash --modules azure -M "azure.var.elasticsearch.host=es.mycloud.com" -M "azure.var.input.azure_event_hubs.threads=8" -M "azure.var.input.azure_event_hubs.consumer_group=logstash" -M "azure.var.input.azure_event_hubs.decorate_events=true" -M "azure.var.input.azure_event_hubs.event_hub_connections=Endpoint=sb://example1...EntityPath=insights-logs-errors" -M "azure.var.input.azure_event_hubs.storage_connection=DefaultEndpointsProtocol=https;AccountName=example...." +----- + +The command line configuration requires the `azure.var.input.azure_event_hubs.` prefix before a configuration option. +Notice the notation for the `threads` option. + +[[basic-config-sample]] +===== Basic configuration sample + +The configuration in the `logstash.yml` file is shared between Event Hubs. + +["source","shell",subs="attributes"] +----- +modules: + - name: azure + var.elasticsearch.hosts: "localhost:9200" + var.kibana.host: "localhost:5601" + var.input.azure_event_hubs.threads: 8 + var.input.azure_event_hubs.decorate_events: true + var.input.azure_event_hubs.consumer_group: "logstash" + var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=example...." + var.input.azure_event_hubs.event_hub_connections: + - "Endpoint=sb://example1...EntityPath=insights-logs-errors" + - "Endpoint=sb://example2...EntityPath=insights-metrics-pt1m" +----- + +The basic configuration requires the `var.input.azure_event_hubs.` prefix +before a configuration option. +Notice the notation for the `threads` option. + +[[adv-config-sample]] +===== Advanced configuration sample + +Advanced configuration in the `logstash.yml` file supports Event Hub specific options. +Advanced configuration is not necessary or recommended for most use cases. Use +it only if it is required for your deployment scenario. + +You must define the `header` array with `name` in the first position. You can +define other options in any order. The per Event Hub configuration takes +precedence. Any values not defined per Event Hub use the global config value. + +In this example 'consumer_group' will be applied to each of the configured Event +Hubs. Note that 'decorate_events' is defined in both the 'global' and per Event Hub +configuration. The per Event Hub configuration takes precedence, and the +global configuration is effectively ignored. + +["source","shell",subs="attributes"] +----- +modules: + - name: azure + var.elasticsearch.hosts: "localhost:9200" + var.kibana.host: "localhost:5601" + var.input.azure_event_hubs.threads: 8 + var.input.azure_event_hubs.decorate_events: true + var.input.azure_event_hubs.consumer_group: logstash + var.input.azure_event_hubs.event_hubs: + - ["name", "event_hub_connection", "storage_connection", "initial_position", "decorate_events"] + - ["insights-operational-logs", "Endpoint=sb://example1...", "DefaultEndpointsProtocol=https;AccountName=example1....", "HEAD", "true"] + - ["insights-metrics-pt1m", "Endpoint=sb://example2...", "DefaultEndpointsProtocol=https;AccountName=example2....", "TAIL", "true"] + - ["insights-logs-errors", "Endpoint=sb://example3...", "DefaultEndpointsProtocol=https;AccountName=example3....", "TAIL", "false"] + - ["insights-operational-logs", "Endpoint=sb://example4...", "DefaultEndpointsProtocol=https;AccountName=example4....", "HEAD", "true"] +----- + +The advanced configuration doesn't require a prefix before a per Event Hub +configuration option. Notice the notation for the `decorate_events` option. + +[[azure_best_practices]] +===== Best practices + +Here are some guidelines to help you avoid data conflicts that can cause lost +events. + +* **Create a {ls} consumer group.** +Create a new consumer group specifically for {ls}. Do not use the $default or +any other consumer group that might already be in use. Reusing consumer groups +among non-related consumers can cause unexpected behavior and possibly lost +events. All {ls} instances should use the same consumer group so that they can +work together for processing events. +* **Avoid overwriting offset with multiple Event Hubs.** +The offsets (position) of the Event Hubs are stored in the configured Azure Blob +store. The Azure Blob store uses paths like a file system to store the offsets. +If the paths between multiple Event Hubs overlap, then the offsets may be stored +incorrectly. +To avoid duplicate file paths, use the advanced configuration model and make +sure that at least one of these options is different per Event Hub: +** storage_connection +** storage_container (defaults to Event Hub name if not defined) +** consumer_group + + +[[azure_config_options]] +===== Configuration options + +NOTE: All Event Hubs options are common to both basic and advanced +configurations, with the following exceptions. The basic configuration uses +`event_hub_connections` to support multiple connections. The advanced +configuration uses `event_hubs` and `event_hub_connection` (singular). + +[[azure_event_hubs]] +===== `event_hubs` +* Value type is <> +* No default value +* Ignored for basic and command line configuration +* Required for advanced configuration + +Defines the per Event Hubs configuration for the <>. + +The advanced configuration uses `event_hub_connection` instead of `event_hub_connections`. +The `event_hub_connection` option is defined per Event Hub. + +[[azure_event_hub_connections]] +===== `event_hub_connections` +* Value type is <> +* No default value +* Required for basic and command line configuration +* Ignored for advanced configuration + +List of connection strings that identifies the Event Hubs to be read. Connection +strings include the EntityPath for the Event Hub. + + +[[azure_checkpoint_interval]] +===== `checkpoint_interval` +* Value type is <> +* Default value is `5` seconds +* Set to `0` to disable. + +Interval in seconds to write checkpoints during batch processing. Checkpoints +tell {ls} where to resume processing after a restart. Checkpoints are +automatically written at the end of each batch, regardless of this setting. + +Writing checkpoints too frequently can slow down processing unnecessarily. + + +[[azure_consumer_group]] +===== `consumer_group` +* Value type is <> +* Default value is `$Default` + +Consumer group used to read the Event Hub(s). Create a consumer group +specifically for Logstash. Then ensure that all instances of Logstash use that +consumer group so that they can work together properly. + + +[[azure_decorate_events]] +===== `decorate_events` + +* Value type is <> +* Default value is `false` + +Adds metadata about the Event Hub, including `Event Hub name`, `consumer_group`, +`processor_host`, `partition`, `offset`, `sequence`, `timestamp`, and `event_size`. + + +[[azure_initial_position]] +===== `initial_position` +* Value type is <> +* Valid arguments are `beginning`, `end`, `look_back` +* Default value is `beginning` + +When first reading from an Event Hub, start from this position: + +* `beginning` reads all pre-existing events in the Event Hub +* `end` does not read any pre-existing events in the Event Hub +* `look_back` reads `end` minus a number of seconds worth of pre-existing events. +You control the number of seconds using the `initial_position_look_back` option. + +Note: If `storage_connection` is set, the `initial_position` value is used only +the first time Logstash reads from the Event Hub. + + +[[azure_initial_position_look_back]] +===== `initial_position_look_back` +* Value type is <> +* Default value is `86400` +* Used only if `initial_position` is set to `look-back` + +Number of seconds to look back to find the initial position for pre-existing +events. This option is used only if `initial_position` is set to `look_back`. If +`storage_connection` is set, this configuration applies only the first time {ls} +reads from the Event Hub. + + +[[azure_max_batch_size]] +===== `max_batch_size` + +* Value type is <> +* Default value is `125` + +Maximum number of events retrieved and processed together. A checkpoint is +created after each batch. Increasing this value may help with performance, but +requires more memory. + + +[[azure_storage_connection]] +===== `storage_connection` +* Value type is <> +* No default value + +Connection string for blob account storage. Blob account storage persists the +offsets between restarts, and ensures that multiple instances of Logstash +process different partitions. +When this value is set, restarts resume where processing left off. +When this value is not set, the `initial_position` value is used on every restart. + +We strongly recommend that you define this value for production environments. + + +[[azure_storage_container]] +===== `storage_container` +* Value type is <> +* Defaults to the Event Hub name if not defined + +Name of the storage container used to persist offsets and allow multiple instances of {ls} +to work together. + +To avoid overwriting offsets, you can use different storage containers. This is +particularly important if you are monitoring two Event Hubs with the same name. +You can use the advanced configuration model to configure different storage +containers. + + +[[azure_threads]] +===== `threads` +* Value type is <> +* Minimum value is `2` +* Default value is `4` + +Total number of threads used to process events. The value you set here applies +to all Event Hubs. Even with advanced configuration, this value is a global +setting, and can't be set per event hub. + + +include::shared-module-options.asciidoc[] + +[[run-azure]] +==== Start the module + +. Be sure that the `logstash.yml` file is <>. +. Run this command from the Logstash directory: + +["source","shell",subs="attributes"] +----- +bin/logstash +----- + +[[exploring-data-azure]] +==== Explore your data +When the Logstash Azure module starts receiving events, you can begin using the +packaged Kibana dashboards to explore and visualize your data. + +To explore your data with Kibana: + +. Open a browser to http://localhost:5601[http://localhost:5601] (username: + "elastic"; password: "{pwd}") +. Click *Dashboard*. +. Select the dashboard you want to see. + + +==== Azure module schema + +This module reads data from the Azure Event Hub and adds some additional structure to the data for Activity Logs and SQL Diagnostics. The original data is always preserved and any data added or parsed will be namespaced under 'azure'. For example, 'azure.subscription' may have been parsed from a longer more complex URN. + +[cols="<,<,<",options="header",] +|======================================================================= +|Name |Description|Notes + +|azure.subscription |Azure subscription from which this data originates. |Some Activity Log events may not be associated with a subscription. +|azure.group |Primary type of data. |Current values are either 'activity_log' or 'sql_diagnostics' +|azure.category* |Secondary type of data specific to group from which the data originated | +|azure.provider |Azure provider | +|azure.resource_group |Azure resource group | +|azure.resource_type |Azure resource type | +|azure.resource_name |Azure resource name | +|azure.database |Azure database name, for display purposes |SQL Diagnostics only +|azure.db_unique_id |Azure database name that is guaranteed to be unique |SQL Diagnostics only +|azure.server |Azure server for the database |SQL Diagnostics only +|azure.server_and_database |Azure server and database combined |SQL Diagnostics only +|azure.metadata |Any @metadata added by the plugins, for example var.input.azure_event_hubs.decorate_events: true +|======================================================================= + +Notes: + +* Activity Logs can have the following categories: Administrative, ServiceHealth, Alert, Autoscale, Security +* SQL Diagnostics can have the following categories: Metric, Blocks, Errors, Timeouts, QueryStoreRuntimeStatistics, QueryStoreWaitStatistics, DatabaseWaitStatistics, SQLInsights + +Microsoft documents Activity log schema +https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-activity-log-schema[here]. +The SQL Diagnostics data is documented +https://docs.microsoft.com/en-us/azure/sql-database/sql-database-metrics-diag-logging[here]. +Elastic does not own these data models, and as such, cannot make any +assurances of information accuracy or passivity. + +===== Special note - Properties field + +Many of the logs contain a `properties` top level field. This is often where the +most interesting data lives. There is not a fixed schema between log types for +properties fields coming from different sources. + +For example, one log may have +`properties.type` where one log sets this a String type and another sets this an +Integer type. To avoid mapping errors, the original properties field is moved to +`__properties.`. +For example +`properties.type` may end up as `sql_diagnostics_Errors_properties.type` or +`activity_log_Security_properties.type` depending on the group/category where +the event originated. + +[[azure-production]] +==== Deploying the module in production + +Use security best practices to secure your configuration. +See {stack-ov}/xpack-security.html for details and recommendations. + +[[azure-resources]] +==== Microsoft Azure resources + +Microsoft is the best source for the most up-to-date Azure information. + +* https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-azure-monitor[Overview of Azure Monitor] +* https://docs.microsoft.com/en-us/azure/sql-database/sql-database-metrics-diag-logging[Azure SQL Database metrics and diagnostics logging] +* https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-stream-activity-logs-event-hubs[Stream the Azure Activity Log to Event Hubs] +