Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add azure module doc #9510

Closed
wants to merge 1 commit into from
Closed

Add azure module doc #9510

wants to merge 1 commit into from

Conversation

karenzone
Copy link
Contributor

@karenzone karenzone commented May 2, 2018

Rework of unreleased Azure module content after redesign

@karenzone
Copy link
Contributor Author

@acchen97 @jakelandis Not ready for review, but I want to get your input on content and direction.
Jake, when we last spoke, you had some ideas for reworking your implementation that might change the docs.
I'm concerned about the squishiness between helping a user do a quick demo and deploying in a production environment. I'm sure I oversimplified by adding a Deployment section at the end, but wanted to get the conversation started.
Shall I set up some time for the three of us to discuss?

@dedemorton dedemorton mentioned this pull request May 23, 2018
49 tasks
=== Azure Module [Experimental]
experimental[]

The https://azure.microsoft.com/en-us/overview/what-is-azure/[Microsoft Azure] module in Logstash helps you easily integrate your Azure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put some more detail on what the module does? Maybe something like the below...

"The Logstash Azure module enables you to easily monitor your Azure cloud environments and SQL DB deployments with deep operational insights across multiple Azure subscriptions. Explore the health of your infrastructure in real-time to accelerate root cause analysis and decrease overall time to resolution.

  • Analyze infrastructure changes and authorization activity
  • Identify suspicious behaviors and potential malicious actors
  • Perform root-cause analysis by investigating user activity
  • Monitor and optimize your SQL DB deployments"



[[azure-prereqs]]
==== Prerequisites
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to add or link to instructions for setting up Azure Monitor on the Azure side. Azure Monitor centralizes the logs to an Event Hub which we ingest from.

https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-azure-monitor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suite of {kib} dashboards to help you start exploring your data immediately.

[[azure-dashboards]]
==== Dashboards
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff here!


These {kib} dashboards are available and ready for you to use.

* *Overview*. Top-level view into your Azure operations, including info about users, resource groups, service health, access, activities, and alerts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we group these into two logical sections?

  • "Infrastructure Activity Monitoring" - Overview, User Activity, and Alerts
  • "SQL DB Monitoring" - SQL DB Overview, SQL DB Database View, SQL DB Queries

@acchen97
Copy link
Contributor

acchen97 commented Jul 7, 2018

I like the direction, left a few comments. Happy to discuss live if you'd like.

To get started with the Azure module:

. Install the {logstash-ref}/plugins-inputs-azure_event_hubs.html[azure_event_hubs
plugin].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to install plugin. it is a default plugin as of 6.4


["source","shell"]
-----
bin/logstash-plugin install logstash_input_azure_event_hubs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to install plugin

[[azure-module-setup]]
===== Set up the module

TBD: From the LS directory?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

You can specify <<azure_config_options, options>> for the Logstash Azure module in the
`logstash.yml` configuration file or with overrides through the command line.

The azure_event_hubs plugin and the Azure module support two configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are really 3 configuration modes, command line, basic, and advanced.

command line = 1 event hub
basic = multiple event hubs same config
advanced = multiple event hubs different config

I think it makes sense to document it this way instead of the 2 ways with the caveat that basic has two ways (command line vs. yml) and the command line only supports a single event hub.

`event_hub_connection`.

[id="plugins-{type}s-{plugin}-config_mode"]
===== `config_mode`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the config mode is chosen automatically based on the yaml, no need to document here.


[source,ruby]
----
azure_event_hubs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should avoid using the plugin examples. the module maps the yaml configuration to the plugin configuration. I think the descriptions (for the most part) apply, but the examples, if any provided, should be in the yaml style.
For example:

    var.input.azure_event_hubs.event_hubs:
      - ["name",                      "event_hub_connection",      "storage_connection",                                      "initial_position", "decorate_events"]
      - ["insights-operational-logs", "Endpoint=sb://example1...", "DefaultEndpointsProtocol=https;AccountName=example1....", "HEAD",             "true"]
      - ["insights-metrics-pt1m",     "Endpoint=sb://example2...", "DefaultEndpointsProtocol=https;AccountName=example2....", "TAIL",             "true"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakelandis Gotcha.
WRT: "the examples, if any provided..."

Before I do a lot of unnecessary reformatting, let's decide if we need the examples at all. If we need just a few examples, I'll focus on reworking those.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can provide a an example of how to map the input the options to yaml, and then hyperlink each of the options to the input documentation. Maybe just a bullet list of the linked options ?

[[run-azure]]
==== Start the module

Run this command from the Logstash install directory:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run this command from the Logstash install directory (with the logstash.yml properly configured) ?? ...or something like that.

Copy link
Contributor Author

@karenzone karenzone Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakelandis These are fantastic comments, and the updates are going well. Link ^^ isn't working for me. Can you point me in the right direction, please?

NM. I think I know what you're getting at.


Many of the logs contain a "properties" top level field. This is often where the
most interesting data lives. There is not a fixed schema between log types for
properties fields coming from different sources. This can cause mapping errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can cause mapping errors ...

Can we nuke that sentence ? We cover it later with "To avoid mapping errors, ..."




==== Testing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's leave this section out entirely ... i think it makes for the readme, but not the ascii doc.

@karenzone
Copy link
Contributor Author

@jakelandis @acchen97 Ready for another look.
@dedemorton FYI - I always appreciate your insights and comments if you have time to look.

Copy link
Contributor

@dedemorton dedemorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a quick scan. Nice work. Added a few comments about some of the asciidoc coding.

`event-hub-connections`. The the advanced configuration uses `event_hubs` and
`event_hub_connection`.

[id="plugins-{type}s-{plugin}-event_hubs"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't use the type and plugin attributes here. This convention is used in the plugin docs only. The attributes get set for each topic in the plugin docs and don't exist in the other parts of the Logstash Reference. The book probably builds without errors right now because the build ignores unresolved attributes 😱. However, if someone references this ID, the link will break. Use [[event_hubs]] instead, and make sure you change the other IDs in this topic. I'm guessing you must have copied this over from some plugin docs. :-) Easy mistake....


[cols="<,<,<",options="header",]
|=======================================================================
|Name |Description|Notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tables can render strangely in our docs right now if there are long words (dotted names are often that problem) that result in text overrunning the navigation widget. Generally we try to avoid using complex tables (with more than two columns). At this point, I'd just check the output to see if there are overruns and use a different format strategy (maybe definition lists) if there are issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on local build. Will verify again.


Microsoft is the best source for the most up-to-date Azure information.

* [Overview of Azure Monitor]https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-azure-monitor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is markdown, not asciidoc formatting. Ditto for the other items in this list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! Is this correct?
https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-azure-monitor[Overview of Azure Monitor]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks right to me! Using the wrong link format is a common mistake for the dev team because they have to switch between markdown and asciidoc all the time.


Connection string that identifies the Event Hub to be read. Advanced
configuration options can be set per Event Hub. This option modifies
`event_hub_name`, and should be nested under it. (See sample.) This option
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just skimming for issues, but it seems like "See sample" is an incomplete sentence and perhaps needs a link

* Value type is <<boolean,boolean>>
* Default value is `false`

Adds metadata about the Event Hub, including Event Hub name, consumer_group,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be formatted in backticks? If not, maybe change this to say consumer group, processor host, and so on (without underscores).

properties fields coming from different sources.

For example, one log may have
properties.type where one log sets this a String type and another sets this an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we format field names in backticks.

- ["insights-operational-logs", "Endpoint=sb://example4...", "DefaultEndpointsProtocol=https;AccountName=example4....", "HEAD", "true"]
-----

===== Map input options to the logstash.yml file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we leave out this section entirely, the examples above should be sufficient.

===== `event_hubs`
* Value type is <<array,array>>
* No default value
* Ignored for basic configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Ignored for basic and command line configuration.

* Ignored for basic configuration
* Required for advanced configuration

Defines the Event Hubs to be read. An array of hashes where each entry is a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly different for the yaml...

How about:

Defines the per Event Hubs configuration for the advanced configuration (<- link to above). 

===== `event_hub_connections`
* Value type is <<array,array>>
* No default value
* Required for basic configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Required for basic and command line configuration
and * Ignored for advanced configuration.

List of connection strings that identifies the Event Hubs to be read. Connection
strings include the EntityPath for the Event Hub.

The `event_hub_connections` option is defined
Copy link
Contributor

@jakelandis jakelandis Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this sentence.

edit: both sentences, actually.



[id="plugins-{type}s-{plugin}-event_hub_connection"]
===== `event_hub_connection`
Copy link
Contributor

@jakelandis jakelandis Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit tricky... and we may want to deviate from the plugin doc a bit here.

I would suggest to remove this section entirely, and take the note on line 208 and put it (or make a new similar one) under 'event_hubs' . I think that will sufficiently document this options.

Since we are de-emphasizing the advanced config, and for YAML this only appears in the table name row (not a top level config option like in the plugin), it feels better to leave this out.

* [Azure SQL Database metrics and diagnostics logging]https://docs.microsoft.com/en-us/azure/sql-database/sql-database-metrics-diag-logging
* [Stream the Azure Activity Log to Event Hubs]https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-stream-activity-logs-event-hubs


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-----

[[azure_config_options]]
===== Configuration options
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a quick blurb about prefixing the configuration options ? With an example or two?

For example:
threads configuration
Command line:

-M "azure.var.input.azure_event_hubs.threads=8"

Basic:

var.input.azure_event_hubs.threads: 8

Advanced:

var.input.azure_event_hubs.event_hubs:
  - ["name",                      "event_hub_connection",      "threads"]
  - ["insights-operational-logs", "Endpoint=sb://example1...", 8]

@jakelandis
Copy link
Contributor

@karenzone - this is looking really good. I left some comments, please take them as suggestions not must dos.

@karenzone
Copy link
Contributor Author

FYI - I'm still working to resolve the comment about prefixing config option. Wanted to get this back out to review quickly so @acchen97 can see where we are.

@jakelandis WRT #9510 (comment). Please see if I resolved this issue the way you wanted it.

* **Create a {ls} consumer group.**
Create a new consumer group specifically for {ls}}. Do not use the $default or
any other consumer group that might already be in use. Reusing consumer groups
among non-related consumers can cause expected behavior and possibly lost
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: can cause unexpected behavior ?

Copy link
Contributor

@acchen97 acchen97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Left some comments, though some of them are nits. Thanks for all your hard work on this @jakelandis @karenzone

[[azure-dashboards]]
==== Dashboards

These {kib} dashboards are available and ready for you to use. You can use them they are, or tailor them to meet your needs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use them as they are


* *Overview*. Top-level view into your Azure operations, including info about users, resource groups, service health, access, activities, and alerts.

* *Alerts*. Alert info, including activity, alert status (activated, resolved, succeeded), and alerts heatmap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this is a nit but we can prob remove the "(activated, resolved, succeeded)" bit


* *User Activity*. Info about system users, their activity, and requests.

===== SQL database monitoring
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For branding purposes, can we capitalize "Database"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the instances below as well


* *SQL DB Database View*. Detailed info about each SQL database, including wait time, errors, DTU and storage utilization, size, and read and write input/output.

* *SQL DB Queries*. Info about SQL database queries, including DTU Utilization, errors, and query duration and wait time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's exclude the last bit here. Just "info about SQL Database queries and performance" is fine.

The Elastic Stack version 6.4 (or later) is required for this module.

NOTE: Logstash, Elasticsearch, and Kibana must run locally. You can also run
Elasticsearch, Kibana and Logstash on separate hosts to consume data from Azure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware other modules like ArcSight have a similar statement here. To clarify the context, should be start this with "For these instructions below..."?

Copy link
Contributor Author

@karenzone karenzone Jul 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the Prereqs designation and the flow of the topic implies order. I avoid extra words whenever possible because "the more you write, the less people read."
If you feel strongly that there is risk of confusion, I will add.

[[azure-module-prereqs]]
==== Prerequisites

The Elastic Stack and Microsoft Azure Event Hubs are required for this module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps more accurately for the Azure side: "must have Azure Monitor enabled with Azure Event Hubs"

Event Hubs. The `--setup` option creates an `azure-*` index pattern in
Elasticsearch and imports Kibana dashboards and visualizations.

NOTE: The `--setup` option is intended only for first-time setup. If you include `setup` on subsequent runs, your existing Kibana dashboards will be overwritten.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second "setup" here is missing the "--"

Here are some guidelines to help you avoid data conflicts that can cause lost
events.

* **Create a {ls} consumer group.**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakelandis are these best practices on the Azure website that we could link to or did you curate them yourself? My concern is that this content may go stale based on new Event Hub releases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I curated them. I don't think the existing suggestions here will ever go out of fashion since it is at the core of how the event hubs work. However, there may be new (or existing) best practices not represented here.

The only best practice I found from MS is https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-faq#best-practices and most of their documentation is standard howto, configuration, pricing, etc. Not much consumer guidance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for the insight.


NOTE: All Event Hubs options are common to both basic and advanced
configurations, with the following exceptions. The basic configuration uses
`event-hub-connections`. The the advanced configuration uses `event_hubs` and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakelandis this seems a bit confusing from a usability standpoint. The minor syntactical difference here may throw users off. Can you elaborate on why this is the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to wait for more user feedback before we think too much into it though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems a bit confusing from a usability standpoint

agreed. but needed.

The minor syntactical difference here may throw users off. Can you elaborate on why this is the case?

So... the configuration is either (pseudo configuration for illustration, not exact syntax):

event-hub-connections=[connection1, connection2, ...]
threads=8

or

event-hubs
|__ event-hub-connection=connection1, threads=8
|__ event-hub-connection=connection2, threads=4

the former supports many connections (hence the plural) the latter supports only 1 connection (hence the singular)

This is the same behavior as the input plugin for basic vs. advanced configuration. I think a user should see the discouragement of the advanced configuration (above) and not concern themselves with this ... or would copy/paste the example as a starting point (which has the correct syntax).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, then in that case there may be a typo for the advanced config option as it says event_hubs with an underscore rather than a dash. /cc @karenzone

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, the underscores (and documentation) are correct. my above comment incorrectly used dashes instead of underscores.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually... there is a typo above event-hub-connections should be event_hub_connections @karenzone sorry if misled you. All options should use underscores not dashes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected typo here and in azure_event_hubs plugin doc. Called attention to plural/singular treatment for emphasis. Let me know if this helps clarify.


==== Azure module schema

This module reads data from the Azure Event Hub and adds some additional structure to the data for Activity Logs and SQL Diagnostics. The original data is always preserved and any data added or parsed will be namespaced under 'azure'. For example, 'azure.subscription' may have been parsed from a longer more complex URN.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this section.

@jakelandis
Copy link
Contributor

@karenzone - nice work! LGTM

Copy link
Contributor

@dedemorton dedemorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor comment about note formatting. Otherwise, LGTM. Nice work!

* `look_back` reads `end` minus a number of seconds worth of pre-existing events.
You control the number of seconds using the `initial_position_look_back` option.

Note: If `storage_connection` is set, the `initial_position` value is used only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to write NOTE not Note if you want this to render in a note box.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking. In this case, I wanted to call out the info, but not with as much emphasis as the note box.

@elasticsearch-bot
Copy link

karen.metts merged this into the following branches!

Branch Commits
master 89ffcd1
6.x 288ef36
6.4 a919ef2

elasticsearch-bot pushed a commit that referenced this pull request Jul 27, 2018
elasticsearch-bot pushed a commit that referenced this pull request Jul 27, 2018
@karenzone karenzone deleted the azure branch July 27, 2018 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants