Azure Module

Warning

This functionality is experimental and may be changed or removed completely in a future release. Elastic will take a best effort approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.

The Microsoft Azure module in Logstash helps you easily integrate your Azure activity logs and SQL diagnostic logs with the Elastic Stack.

Azure Work Flow

You can monitor your Azure cloud environments and SQL DB deployments with deep operational insights across multiple Azure subscriptions. You can explore the health of your infrastructure in real-time, accelerating root cause analysis and decreasing overall time to resolution. The Azure module helps you:

  • Analyze infrastructure changes and authorization activity
  • Identify suspicious behaviors and potential malicious actors
  • Perform root-cause analysis by investigating user activity
  • Monitor and optimize your SQL DB deployments.
Note

The Logstash Azure module is an X-Pack feature under the Basic License and is therefore free to use. Please contact monitor-azure@elastic.co for questions or more information.

The Azure module uses the Logstash Azure Event Hubs input plugin to consume data from Azure Event Hubs. The module taps directly into the Azure dashboard, parses and indexes events into Elasticsearch, and installs a suite of Kibana dashboards to help you start exploring your data immediately.

Dashboards

These Kibana dashboards are available and ready for you to use. You can use them as they are, or tailor them to meet your needs.

Infrastructure activity monitoring

  • Overview. Top-level view into your Azure operations, including info about users, resource groups, service health, access, activities, and alerts.
  • Alerts. Alert info, including activity, alert status, and alerts heatmap.
  • User Activity. Info about system users, their activity, and requests.

SQL Database monitoring

  • SQL DB Overview. Top-level view into your SQL Databases, including counts for databases, servers, resource groups, and subscriptions.
  • SQL DB Database View. Detailed info about each SQL Database, including wait time, errors, DTU and storage utilization, size, and read and write input/output.
  • SQL DB Queries. Info about SQL Database queries and performance.

Prerequisites

Azure Monitor enabled with Azure Event Hubs and the Elastic Stack are required for this module.

Elastic prerequisites

The instructions below assume that you have Logstash, Elasticsearch, and Kibana running locally. You can also run Logstash, Elasticsearch, and Kibana on separate hosts.

The Elastic Stack version 6.4 (or later) is required for this module.

The Azure module uses the azure_event_hubs input plugin to consume logs and metrics from your Azure environment. It is installed by default with Logstash 6.4 (or later). Basic understanding of the plugin and options is helpful when you set up the Azure module. See the azure_event_hubs input plugin documentation for more information.

Elastic products are available to download and easy to install.

Azure prerequisites

Azure Monitor should be configured to stream logs to one or more Event Hubs. Logstash will need to access these Event Hubs instances to consume your Azure logs and metrics. See Microsoft Azure resources at the end of this topic for links to Microsoft Azure documentation.

Configure the module

Specify options for the Logstash Azure module in the logstash.yml configuration file.

  • Basic configuration. You can use the logstash.yml file to configure inputs from multiple Event Hubs that share the same configuration. Basic configuration is recommended for most use cases.
  • Advanced configuration. The advanced configuration is available for deployments where different Event Hubs require different configurations. The logstash.yml file holds your settings. Advanced configuration is not necessary or recommended for most use cases.

See the azure_event_hubs input plugin documentation for more information about basic and advanced configuration models.

Basic configuration sample

The configuration in the logstash.yml file is shared between Event Hubs. Basic configuration is recommended for most use cases.

modules:
  - name: azure
    var.elasticsearch.hosts: localhost:9200
    var.kibana.host: localhost:5601
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=instance1..."
    var.input.azure_event_hubs.threads: 9
    var.input.azure_event_hubs.event_hub_connections:
      - "Endpoint=sb://...EntityPath=insights-operational-logs"
      - "Endpoint=sb://...EntityPath=insights-metrics-pt1m"
      - "Endpoint=sb://...EntityPath=insights-logs-blocks"
      - "Endpoint=sb://...EntityPath=insights-logs-databasewaitstatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-errors"
      - "Endpoint=sb://...EntityPath=insights-logs-querystoreruntimestatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-querystorewaitstatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-timeouts"

The consumer_group (optional) is highly recommended. See Best practices.

The storage_connection (optional) sets the Azure Blob Storage connection for tracking processing state for Event Hubs when scaling out a deployment with multiple Logstash instances. See Scale Event Hub consumption for additional details.

See Best practices for guidelines on choosing an appropriate number of threads.

The insights-operational-logs connection sets up the consumption of Activity Logs. By default, Azure Monitor uses the insights-operational-logs Event Hub name. Make sure this matches the name of the Event Hub specified for Activity Logs.

The insights-metrics-pt1m connection and the ones below it set up the consumption of SQL DB diagnostic logs and metrics. By default, Azure Monitor uses all these different Event Hub names.

The basic configuration requires the var.input.azure_event_hubs. prefix before a configuration option. Notice the notation for the threads option.

Advanced configuration sample

Advanced configuration in the logstash.yml file supports Event Hub specific options. Advanced configuration is available for more granular tuning of threading and Blob Storage usage across multiple Event Hubs. Advanced configuration is not necessary or recommended for most use cases. Use it only if it is required for your deployment scenario.

You must define the header array with name in the first position. You can define other options in any order. The per Event Hub configuration takes precedence. Any values not defined per Event Hub use the global config value.

In this example threads, consumer_group, and storage_connection will be applied to each of the configured Event Hubs. Note that decorate_events is defined in both the global and per Event Hub configuration. The per Event Hub configuration takes precedence, and the global configuration is effectively ignored when the per Event Hub setting is present.

modules:
  - name: azure
    var.elasticsearch.hosts: localhost:9200
    var.kibana.host: localhost:5601
    var.input.azure_event_hubs.decorate_events: true
    var.input.azure_event_hubs.threads: 9
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=instance1..."
    var.input.azure_event_hubs.event_hubs:
      - ["name",                                      "initial_position", "storage_container", "decorate_events", "event_hub_connection"]
      - ["insights-operational-logs",                 "TAIL",             "activity-logs1",    "true",            "Endpoint=sb://...EntityPath=insights-operational-logs"]
      - ["insights-operational-logs",                 "TAIL",             "activity-logs2",    "true",            "Endpoint=sb://...EntityPath=insights-operational-logs"]
      - ["insights-metrics-pt1m",                     "TAIL",             "dbmetrics",         "true",            "Endpoint=sb://...EntityPath=insights-metrics-pt1m"]
      - ["insights-logs-blocks",                      "TAIL",             "dbblocks",          "true",            "Endpoint=sb://...EntityPath=insights-logs-blocks"]
      - ["insights-logs-databasewaitstatistics",      "TAIL",             "dbwaitstats",       "false",           "Endpoint=sb://...EntityPath=insights-logs-databasewaitstatistics"]
      - ["insights-logs-errors",                      "HEAD",             "dberrors",          "true",            "Endpoint=sb://...EntityPath=insights-logs-errors"]
      - ["insights-logs-querystoreruntimestatistics", "TAIL",             "dbstoreruntime",    "true",            "Endpoint=sb://...EntityPath=insights-logs-querystoreruntimestatistics"]
      - ["insights-logs-querystorewaitstatistics",    "TAIL",             "dbstorewaitstats",  "true",            "Endpoint=sb://...EntityPath=insights-logs-querystorewaitstatistics"]
      - ["insights-logs-timeouts",                    "TAIL",             "dbtimeouts",        "true",            "Endpoint=sb://...EntityPath=insights-logs-timeouts"]

You can specify global Event Hub options. They will be overridden by any configurations specified in the event_hubs option.

See Best practices for guidelines on choosing an appropriate number of threads.

The header array must be defined with name in the first position. Other options can be defined in any order. The per Event Hub configuration takes precedence. Any values not defined per Event Hub use the global config value.

The second insights-operational-logs row enables consuming from a second Activity Logs Event Hub that uses a different Blob Storage container. This is necessary to prevent the offsets from the first insights-operational-logs from overwriting the offsets for the second insights-operational-logs.

The advanced configuration doesn’t require a prefix before a per Event Hub configuration option. Notice the notation for the initial_position option.
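The precedence rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the plugin's implementation: the helper name expand_event_hubs and the trimmed-down option set are assumptions made for the example.

```python
from typing import Any


def expand_event_hubs(
    global_opts: dict[str, Any], event_hubs: list[list[str]]
) -> list[dict[str, Any]]:
    """Resolve each Event Hub row against the global options (hypothetical helper)."""
    header, *rows = event_hubs
    if not header or header[0] != "name":
        raise ValueError('the header array must have "name" in the first position')
    resolved = []
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"row {row!r} does not match header {header!r}")
        per_hub = dict(zip(header, row))
        # Per Event Hub values win; anything missing falls back to the global value.
        resolved.append({**global_opts, **per_hub})
    return resolved


# Illustrative values only, loosely based on the sample above.
global_opts = {"consumer_group": "logstash", "decorate_events": "true"}
event_hubs = [
    ["name", "storage_container", "decorate_events"],
    ["insights-logs-databasewaitstatistics", "dbwaitstats", "false"],
    ["insights-logs-errors", "dberrors", "true"],
]
resolved = expand_event_hubs(global_opts, event_hubs)
print(resolved[0]["decorate_events"])  # per Event Hub value: false
print(resolved[0]["consumer_group"])   # global fallback: logstash
```

A row whose length does not match the header raises an error in this sketch, which is a convenient way to spot a malformed row before starting Logstash.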

Scale Event Hub consumption

An Azure Blob Storage account is an essential part of Azure-to-Logstash configuration. It is required for users who want to scale out multiple Logstash instances to consume from Event Hubs.

A Blob Storage account is a central location that enables multiple instances of Logstash to work together to process events. It records the offset (location) of processed events. On restart, Logstash resumes processing exactly where it left off.

Configuration notes:

  • A Blob Storage account is highly recommended for use with this module, and is likely required for production servers.
  • The storage_connection option passes the blob storage connection string.
  • Configure all Logstash instances to use the same storage_connection to get the benefits of shared processing.

Sample Blob Storage connection string:

DefaultEndpointsProtocol=https;AccountName=logstash;AccountKey=ETOPnkd/hDAWidkEpPZDiXffQPku/SZdXhPSLnfqdRTalssdEuPkZwIcouzXjCLb/xPZjzhmHfwRCGo0SBSw==;EndpointSuffix=core.windows.net

Find the connection string to Blob Storage here: Azure Portal -> Blob Storage account -> Access keys.

Best practices

Here are some guidelines to help you achieve a successful deployment, and avoid data conflicts that can cause lost events.

  • Create a Logstash consumer group. Create a new consumer group specifically for Logstash. Do not use the $Default or any other consumer group that might already be in use. Reusing consumer groups among non-related consumers can cause unexpected behavior and possibly lost events. All Logstash instances should use the same consumer group so that they can work together for processing events.
  • Avoid overwriting offset with multiple Event Hubs. The offsets (position) of the Event Hubs are stored in the configured Azure Blob store. The Azure Blob store uses paths like a file system to store the offsets. If the paths between multiple Event Hubs overlap, then the offsets may be stored incorrectly. To avoid duplicate file paths, use the advanced configuration model and make sure that at least one of these options is different per Event Hub:

    • storage_connection
    • storage_container (defaults to Event Hub name if not defined)
    • consumer_group
  • Set number of threads correctly. The number of threads should equal the number of Event Hubs plus one (or more). Each Event Hub needs at least one thread. An additional thread is needed to help coordinate the other threads. The number of threads should not exceed the number of Event Hubs multiplied by the number of partitions per Event Hub plus one. Threads are currently available only as a global setting.

    • Sample: Event Hubs = 4. Partitions on each Event Hub = 3. Minimum threads is 5 (4 Event Hubs plus one). Maximum threads is 13 (4 Event Hubs times 3 partitions plus one).
    • If you’re collecting activity logs from only one specified event hub instance, then only 2 threads (1 Event Hub plus one) are required.
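The thread guideline above is simple arithmetic, and a small Python sketch makes the bounds easy to check for your own deployment. The function name thread_bounds is made up for this example, and it assumes every Event Hub has the same number of partitions.

```python
def thread_bounds(event_hubs: int, partitions_per_hub: int) -> tuple[int, int]:
    """Return (minimum, maximum) thread counts per the guideline above.

    Assumes every Event Hub has the same number of partitions.
    """
    if event_hubs < 1 or partitions_per_hub < 1:
        raise ValueError("event_hubs and partitions_per_hub must be at least 1")
    minimum = event_hubs + 1                       # one per Event Hub, plus a coordinator
    maximum = event_hubs * partitions_per_hub + 1  # one per partition, plus a coordinator
    return minimum, maximum


print(thread_bounds(4, 3))  # (5, 13), matching the sample above
```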

Set up and run the module

Be sure that the logstash.yml file is configured correctly.

First time setup

Run this command from the Logstash directory:

bin/logstash --setup

The --modules azure option starts a Logstash pipeline for ingestion from Azure Event Hubs. The --setup option creates an azure-* index pattern in Elasticsearch and imports Kibana dashboards and visualizations.

Subsequent starts

Run this command from the Logstash directory:

bin/logstash
Note

The --setup option is intended only for first-time setup. If you include --setup on subsequent runs, your existing Kibana dashboards will be overwritten.

Explore your data

When the Logstash Azure module starts receiving events, you can begin using the packaged Kibana dashboards to explore and visualize your data.

To explore your data with Kibana:

  1. Open a browser to http://localhost:5601 (username: "elastic"; password: "YOUR_PASSWORD")
  2. Click Dashboard.
  3. Click [Azure Monitor] Overview.

Configuration options

Note

All Event Hubs options are common to both basic and advanced configurations, with the following exceptions. The basic configuration uses event_hub_connections to support multiple connections. The advanced configuration uses event_hubs and event_hub_connection (singular).

event_hubs

  • Value type is array
  • No default value
  • Ignored for basic and command line configuration
  • Required for advanced configuration

Defines the per Event Hubs configuration for the advanced configuration.

The advanced configuration uses event_hub_connection instead of event_hub_connections. The event_hub_connection option is defined per Event Hub.

event_hub_connections

  • Value type is array
  • No default value
  • Required for basic and command line configuration
  • Ignored for advanced configuration

List of connection strings that identify the Event Hubs to be read. Connection strings include the EntityPath for the Event Hub.
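Because each connection string must carry an EntityPath, it can be useful to check your strings before starting Logstash. The following Python sketch shows one way to do that. The helper name entity_path and the sample string (namespace, key name, and key) are placeholders invented for this example.

```python
def entity_path(connection_string: str) -> str:
    """Extract the EntityPath (Event Hub name) from a connection string."""
    parts = {}
    for segment in connection_string.split(";"):
        # Split on the first "=" only, because key values may end in "=" padding.
        key, sep, value = segment.partition("=")
        if sep:
            parts[key.strip()] = value
    if "EntityPath" not in parts:
        raise ValueError("connection string does not include an EntityPath")
    return parts["EntityPath"]


# Hypothetical connection string; every value here is a placeholder.
sample = (
    "Endpoint=sb://example.servicebus.windows.net/;"
    "SharedAccessKeyName=logstash;"
    "SharedAccessKey=PLACEHOLDER==;"
    "EntityPath=insights-operational-logs"
)
print(entity_path(sample))  # insights-operational-logs
```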

checkpoint_interval

  • Value type is number
  • Default value is 5 seconds
  • Set to 0 to disable.

Interval in seconds to write checkpoints during batch processing. Checkpoints tell Logstash where to resume processing after a restart. Checkpoints are automatically written at the end of each batch, regardless of this setting.

Writing checkpoints too frequently can slow down processing unnecessarily.

consumer_group

  • Value type is string
  • Default value is $Default

Consumer group used to read the Event Hub(s). Create a consumer group specifically for Logstash. Then ensure that all instances of Logstash use that consumer group so that they can work together properly.

decorate_events

  • Value type is boolean
  • Default value is false

Adds metadata about the Event Hub, including Event Hub name, consumer_group, processor_host, partition, offset, sequence, timestamp, and event_size.

initial_position

  • Value type is string
  • Valid arguments are beginning, end, look_back
  • Default value is beginning

When first reading from an Event Hub, start from this position:

  • beginning reads all pre-existing events in the Event Hub
  • end does not read any pre-existing events in the Event Hub
  • look_back reads end minus a number of seconds worth of pre-existing events. You control the number of seconds using the initial_position_look_back option.

If storage_connection is set, the initial_position value is used only the first time Logstash reads from the Event Hub.

initial_position_look_back

  • Value type is number
  • Default value is 86400
  • Used only if initial_position is set to look_back

Number of seconds to look back to find the initial position for pre-existing events. This option is used only if initial_position is set to look_back. If storage_connection is set, this configuration applies only the first time Logstash reads from the Event Hub.
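To get a feel for what a given initial_position_look_back value means in wall-clock terms, the Python sketch below subtracts the configured number of seconds from a reference time. It is only date arithmetic. How the plugin resolves the position internally is not described here, and the function name look_back_start is an assumption for this example.

```python
from datetime import datetime, timedelta, timezone


def look_back_start(now: datetime, look_back_seconds: int = 86400) -> datetime:
    """Return the earliest event time covered by a look_back window ending at `now`."""
    if look_back_seconds < 0:
        raise ValueError("look_back_seconds must not be negative")
    return now - timedelta(seconds=look_back_seconds)


# With the default of 86400 seconds, the window covers the previous 24 hours.
now = datetime(2018, 9, 1, 12, 0, tzinfo=timezone.utc)
print(look_back_start(now).isoformat())  # 2018-08-31T12:00:00+00:00
```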

max_batch_size

  • Value type is number
  • Default value is 125

Maximum number of events retrieved and processed together. A checkpoint is created after each batch. Increasing this value may help with performance, but requires more memory.

storage_connection

  • Value type is string
  • No default value

Connection string for blob account storage. Blob account storage persists the offsets between restarts, and ensures that multiple instances of Logstash process different partitions. When this value is set, restarts resume where processing left off. When this value is not set, the initial_position value is used on every restart.

We strongly recommend that you define this value for production environments.

storage_container

  • Value type is string
  • Defaults to the Event Hub name if not defined

Name of the storage container used to persist offsets and allow multiple instances of Logstash to work together.

To avoid overwriting offsets, you can use different storage containers. This is particularly important if you are monitoring two Event Hubs with the same name. You can use the advanced configuration model to configure different storage containers.
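One way to catch overlapping offset locations before deployment is to compare the options that Best practices says must differ per Event Hub. The Python sketch below does that under a stated assumption: it treats the combination of storage_connection, storage_container, and consumer_group as the offset location, which is a simplification of the actual Blob Storage path layout. The helper name and sample values are hypothetical.

```python
from collections import Counter


def find_offset_collisions(hubs: list[dict]) -> list[tuple]:
    """Return (storage_connection, storage_container, consumer_group) keys used more than once."""
    seen = Counter()
    for hub in hubs:
        # storage_container defaults to the Event Hub name if not defined.
        container = hub.get("storage_container") or hub["name"]
        seen[(hub["storage_connection"], container, hub["consumer_group"])] += 1
    return [key for key, count in seen.items() if count > 1]


# Two Event Hubs with the same name and no explicit container collide.
shared = {"storage_connection": "conn-placeholder", "consumer_group": "logstash"}
colliding = [
    {"name": "insights-operational-logs", **shared},
    {"name": "insights-operational-logs", **shared},
]
separated = [
    {"name": "insights-operational-logs", "storage_container": "activity-logs1", **shared},
    {"name": "insights-operational-logs", "storage_container": "activity-logs2", **shared},
]
print(len(find_offset_collisions(colliding)))  # 1
print(len(find_offset_collisions(separated)))  # 0
```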

threads

  • Value type is number
  • Minimum value is 2
  • Default value is 16

Total number of threads used to process events. The value you set here applies to all Event Hubs. Even with advanced configuration, this value is a global setting, and can’t be set per event hub.

The number of threads should be the number of Event Hubs plus one or more. See Best practices for more information.

Common options

The following configuration options are supported by all modules:

var.elasticsearch.hosts
  • Value type is uri
  • Default value is "localhost:9200"

Sets the host(s) of the Elasticsearch cluster. For each host, you must specify the hostname and port. For example, "myhost:9200". If given an array, Logstash will load balance requests across the hosts specified in the hosts parameter. It is important to exclude dedicated master nodes from the hosts list to prevent Logstash from sending bulk requests to the master nodes. So this parameter should only reference either data or client nodes in Elasticsearch.

Any special characters present in the URLs here MUST be URL escaped! This means # should be put in as %23 for instance.
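If you are unsure how a value should be escaped, Python's standard library can do the percent-encoding for you. This is a minimal sketch. The helper name escape_url_component and the sample value are made up, and it escapes a single URL component rather than a whole URL.

```python
from urllib.parse import quote


def escape_url_component(value: str) -> str:
    """Percent-encode every reserved character in a single URL component."""
    # safe="" means that even "/" is escaped, which suits an individual component.
    return quote(value, safe="")


print(escape_url_component("pa#ss"))  # pa%23ss
```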

var.elasticsearch.username
  • Value type is string
  • Default value is "elastic"

The username to authenticate to a secure Elasticsearch cluster.

var.elasticsearch.password
  • Value type is string
  • Default value is "changeme"

The password to authenticate to a secure Elasticsearch cluster.

var.elasticsearch.ssl.enabled
  • Value type is boolean
  • There is no default value for this setting.

Enable SSL/TLS secured communication to the Elasticsearch cluster. Leaving this unspecified will use whatever scheme is specified in the URLs listed in hosts. If no explicit protocol is specified, plain HTTP will be used. If SSL is explicitly disabled here, the plugin will refuse to start if an HTTPS URL is given in hosts.

var.elasticsearch.ssl.verification_mode
  • Value type is string
  • Default value is "strict"

The hostname verification setting when communicating with Elasticsearch. Set to disable to turn off hostname verification. Disabling this has serious security concerns.

var.elasticsearch.ssl.certificate_authority
  • Value type is string
  • There is no default value for this setting

The path to an X.509 certificate to use to validate SSL certificates when communicating with Elasticsearch.

var.elasticsearch.ssl.certificate
  • Value type is string
  • There is no default value for this setting

The path to an X.509 certificate to use for client authentication when communicating with Elasticsearch.

var.elasticsearch.ssl.key
  • Value type is string
  • There is no default value for this setting

The path to the certificate key for client authentication when communicating with Elasticsearch.

var.kibana.host
  • Value type is string
  • Default value is "localhost:5601"

Sets the hostname and port of the Kibana instance to use for importing dashboards and visualizations. For example: "myhost:5601".

var.kibana.scheme
  • Value type is string
  • Default value is "http"

Sets the protocol to use for reaching the Kibana instance. The options are: "http" or "https". The default is "http".

var.kibana.username
  • Value type is string
  • Default value is "elastic"

The username to authenticate to a secured Kibana instance.

var.kibana.password
  • Value type is string
  • Default value is "changeme"

The password to authenticate to a secure Kibana instance.

var.kibana.ssl.enabled
  • Value type is boolean
  • Default value is false

Enable SSL/TLS secured communication to the Kibana instance.

var.kibana.ssl.verification_mode
  • Value type is string
  • Default value is "strict"

The hostname verification setting when communicating with Kibana. Set to disable to turn off hostname verification. Disabling this has serious security concerns.

var.kibana.ssl.certificate_authority
  • Value type is string
  • There is no default value for this setting

The path to an X.509 certificate to use to validate SSL certificates when communicating with Kibana.

var.kibana.ssl.certificate
  • Value type is string
  • There is no default value for this setting

The path to an X.509 certificate to use for client authentication when communicating with Kibana.

var.kibana.ssl.key
  • Value type is string
  • There is no default value for this setting

The path to the certificate key for client authentication when communicating with Kibana.

Azure module schema

This module reads data from the Azure Event Hub and adds some additional structure to the data for Activity Logs and SQL Diagnostics. The original data is always preserved and any data added or parsed will be namespaced under azure. For example, azure.subscription may have been parsed from a longer more complex URN.

Name                       Description                                                              Notes
azure.subscription         Azure subscription from which this data originates.                      Some Activity Log events may not be associated with a subscription.
azure.group                Primary type of data.                                                    Current values are either activity_log or sql_diagnostics
azure.category*            Secondary type of data specific to group from which the data originated
azure.provider             Azure provider
azure.resource_group       Azure resource group
azure.resource_type        Azure resource type
azure.resource_name        Azure resource name
azure.database             Azure database name, for display purposes                                SQL Diagnostics only
azure.db_unique_id         Azure database name that is guaranteed to be unique                      SQL Diagnostics only
azure.server               Azure server for the database                                            SQL Diagnostics only
azure.server_and_database  Azure server and database combined                                       SQL Diagnostics only

Notes:

  • Activity Logs can have the following categories: Administrative, ServiceHealth, Alert, Autoscale, Security
  • SQL Diagnostics can have the following categories: Metric, Blocks, Errors, Timeouts, QueryStoreRuntimeStatistics, QueryStoreWaitStatistics, DatabaseWaitStatistics, SQLInsights

Microsoft documents the Activity log schema and the SQL Diagnostics data. Elastic does not own these data models, and as such, cannot make any assurances of information accuracy or passivity.

Special note - Properties field

Many of the logs contain a properties top level field. This is often where the most interesting data lives. There is not a fixed schema between log types for properties fields coming from different sources.

For example, one log may set properties.type as a String type while another sets it as an Integer type. To avoid mapping errors, the original properties field is moved to <azure.group>_<azure.category>_properties.<original_key>. For example, properties.type may end up as sql_diagnostics_Errors_properties.type or activity_log_Security_properties.type depending on the group/category where the event originated.
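The renaming rule can be illustrated with a short Python sketch. This is not the module's actual pipeline code. The function name namespace_properties and the nested dictionary layout of the event are assumptions chosen to keep the example small.

```python
def namespace_properties(event: dict) -> dict:
    """Move `properties` to `<azure.group>_<azure.category>_properties` (illustration only)."""
    result = {key: value for key, value in event.items() if key != "properties"}
    if "properties" in event:
        azure = event["azure"]
        target = f"{azure['group']}_{azure['category']}_properties"
        result[target] = event["properties"]
    return result


# Two events whose properties.type values have conflicting types.
sql_event = {"azure": {"group": "sql_diagnostics", "category": "Errors"}, "properties": {"type": 5}}
activity_event = {"azure": {"group": "activity_log", "category": "Security"}, "properties": {"type": "alert"}}

print(sorted(namespace_properties(sql_event)))       # ['azure', 'sql_diagnostics_Errors_properties']
print(sorted(namespace_properties(activity_event)))  # ['activity_log_Security_properties', 'azure']
```

Because each group and category pair gets its own field name, the Integer and String values of type no longer compete for the same mapping.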

Deploying the module in production

Use security best practices to secure your configuration. See Securing the Elastic Stack for details and recommendations.

Microsoft Azure resources

Microsoft is the best source for the most up-to-date Azure information.