This functionality is experimental and may be changed or removed completely in a future release. Elastic will take a best effort approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.
The Microsoft Azure module in Logstash helps you easily integrate your Azure activity logs and SQL diagnostic logs with the Elastic Stack.
You can monitor your Azure cloud environments and SQL DB deployments with deep operational insights across multiple Azure subscriptions. You can explore the health of your infrastructure in real time, accelerating root cause analysis and decreasing overall time to resolution.
The Logstash Azure module is an X-Pack feature under the Basic License and is therefore free to use. Please contact monitor-azure@elastic.co for questions or more information.
The Azure module uses the Logstash Azure Event Hubs input plugin to consume data from Azure Event Hubs. The module taps directly into the Azure dashboard, parses and indexes events into Elasticsearch, and installs a suite of Kibana dashboards to help you start exploring your data immediately.
These Kibana dashboards are available and ready for you to use. You can use them as they are, or tailor them to meet your needs.
Azure Monitor enabled with Azure Event Hubs and the Elastic Stack are required for this module.
The instructions below assume that you have Logstash, Elasticsearch, and Kibana running locally. You can also run Logstash, Elasticsearch, and Kibana on separate hosts.
The Elastic Stack version 6.4 (or later) is required for this module.
The Azure module uses the azure_event_hubs input plugin to consume logs and metrics from your Azure environment. It is installed by default with Logstash 6.4 (or later). Basic understanding of the plugin and its options is helpful when you set up the Azure module. See the azure_event_hubs input plugin documentation for more information.
Elastic products are available to download and easy to install.
Azure Monitor should be configured to stream logs to one or more Event Hubs. Logstash will need to access these Event Hubs instances to consume your Azure logs and metrics. See Microsoft Azure resources at the end of this topic for links to Microsoft Azure documentation.
Specify options for the Logstash Azure module in the logstash.yml configuration file.

Basic configuration. Use basic configuration options in the logstash.yml file to configure inputs from multiple Event Hubs that share the same configuration. Basic configuration is recommended for most use cases.

Advanced configuration. The logstash.yml file also holds your settings for advanced configuration. Advanced configuration is not necessary or recommended for most use cases. See the azure_event_hubs input plugin documentation for more information about basic and advanced configuration models.
The configuration in the logstash.yml file is shared between Event Hubs. Basic configuration is recommended for most use cases.
modules:
  - name: azure
    var.elasticsearch.hosts: localhost:9200
    var.kibana.host: localhost:5601
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=instance1..."
    var.input.azure_event_hubs.threads: 9
    var.input.azure_event_hubs.event_hub_connections:
      - "Endpoint=sb://...EntityPath=insights-operational-logs"
      - "Endpoint=sb://...EntityPath=insights-metrics-pt1m"
      - "Endpoint=sb://...EntityPath=insights-logs-blocks"
      - "Endpoint=sb://...EntityPath=insights-logs-databasewaitstatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-errors"
      - "Endpoint=sb://...EntityPath=insights-logs-querystoreruntimestatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-querystorewaitstatistics"
      - "Endpoint=sb://...EntityPath=insights-logs-timeouts"
1. The consumer_group (optional) is highly recommended. See Best practices.
2. The storage_connection (optional) passes the Azure Blob Storage connection string used to track processing state. See Azure Blob Storage account.
3. See Best practices for guidelines on choosing an appropriate number of threads.
4. This connection sets up the consumption of Activity Logs. By default, Azure Monitor uses the insights-operational-logs Event Hub.
5. This connection and the ones below set up the consumption of SQL DB diagnostic logs and metrics. By default, Azure Monitor uses all these different Event Hub names.
The basic configuration requires the var.input.azure_event_hubs. prefix before a configuration option. Notice the notation for the threads option.
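For instance, a minimal basic configuration showing the prefix and the threads notation (the values here are illustrative, not recommendations):

```yaml
modules:
  - name: azure
    var.elasticsearch.hosts: localhost:9200
    var.kibana.host: localhost:5601
    # every Event Hubs option carries the var.input.azure_event_hubs. prefix
    var.input.azure_event_hubs.threads: 9
    var.input.azure_event_hubs.event_hub_connections:
      - "Endpoint=sb://...EntityPath=insights-operational-logs"
```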
Advanced configuration in the logstash.yml file supports Event Hub specific options. Advanced configuration is available for more granular tuning of threading and Blob Storage usage across multiple Event Hubs. Advanced configuration is not necessary or recommended for most use cases. Use it only if it is required for your deployment scenario.
You must define the header array with name in the first position. You can define other options in any order. The per Event Hub configuration takes precedence. Any values not defined per Event Hub use the global config value.
In this example threads, consumer_group, and storage_connection will be applied to each of the configured Event Hubs. Note that decorate_events is defined in both the global and per Event Hub configuration. The per Event Hub configuration takes precedence, and the global configuration is effectively ignored when the per Event Hub setting is present.
modules:
  - name: azure
    var.elasticsearch.hosts: localhost:9200
    var.kibana.host: localhost:5601
    var.input.azure_event_hubs.decorate_events: true
    var.input.azure_event_hubs.threads: 9
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=instance1..."
    var.input.azure_event_hubs.event_hubs:
      - ["name", "initial_position", "storage_container", "decorate_events", "event_hub_connection"]
      - ["insights-operational-logs", "TAIL", "activity-logs1", "true", "Endpoint=sb://...EntityPath=insights-operational-logs"]
      - ["insights-operational-logs", "TAIL", "activity-logs2", "true", "Endpoint=sb://...EntityPath=insights-operational-logs"]
      - ["insights-metrics-pt1m", "TAIL", "dbmetrics", "true", "Endpoint=sb://...EntityPath=insights-metrics-pt1m"]
      - ["insights-logs-blocks", "TAIL", "dbblocks", "true", "Endpoint=sb://...EntityPath=insights-logs-blocks"]
      - ["insights-logs-databasewaitstatistics", "TAIL", "dbwaitstats", "false", "Endpoint=sb://...EntityPath=insights-logs-databasewaitstatistics"]
      - ["insights-logs-errors", "HEAD", "dberrors", "true", "Endpoint=sb://...EntityPath=insights-logs-errors"]
      - ["insights-logs-querystoreruntimestatistics", "TAIL", "dbstoreruntime", "true", "Endpoint=sb://...EntityPath=insights-logs-querystoreruntimestatistics"]
      - ["insights-logs-querystorewaitstatistics", "TAIL", "dbstorewaitstats", "true", "Endpoint=sb://...EntityPath=insights-logs-querystorewaitstatistics"]
      - ["insights-logs-timeouts", "TAIL", "dbtimeouts", "true", "Endpoint=sb://...EntityPath=insights-logs-timeouts"]
1. You can specify global Event Hub options. They will be overridden by any configuration specified in the event_hubs option.
2. See Best practices for guidelines on choosing an appropriate number of threads.
3. The header array must be defined with name in the first position. Other options can be defined in any order. The per Event Hub configuration takes precedence. Any values not defined per Event Hub use the global config value.
4. This enables consuming from a second Activity Logs Event Hub that uses a different Blob Storage container. This is necessary to prevent the offsets from the first insights-operational-logs from overwriting the offsets for the second insights-operational-logs.
The advanced configuration doesn’t require a prefix before a per Event Hub configuration option. Notice the notation for the initial_position option.
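For instance (a sketch), per Event Hub options such as initial_position appear as bare column names in the header row, without the var.input.azure_event_hubs. prefix:

```yaml
var.input.azure_event_hubs.event_hubs:
  - ["name", "initial_position", "event_hub_connection"]
  - ["insights-operational-logs", "TAIL", "Endpoint=sb://...EntityPath=insights-operational-logs"]
```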
An Azure Blob Storage account is an essential part of Azure-to-Logstash configuration. It is required for users who want to scale out multiple Logstash instances to consume from Event Hubs.
A Blob Storage account is a central location that enables multiple instances of Logstash to work together to process events. It records the offset (location) of processed events. On restart, Logstash resumes processing exactly where it left off.
Configuration notes:

- The storage_connection option passes the blob storage connection string.
- Configure all Logstash instances to use the same storage_connection to get the benefits of shared processing.

Sample Blob Storage connection string:
DefaultEndpointsProtocol=https;AccountName=logstash;AccountKey=ETOPnkd/hDAWidkEpPZDiXffQPku/SZdXhPSLnfqdRTalssdEuPkZwIcouzXjCLb/xPZjzhmHfwRCGo0SBSw==;EndpointSuffix=core.windows.net
Find the connection string to Blob Storage here: Azure Portal -> Blob Storage account -> Access keys.
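Putting this together, here is a sketch (connection strings are placeholders) of pointing every Logstash instance at the same Blob Storage account:

```yaml
modules:
  - name: azure
    # identical on every Logstash instance, so offsets are shared
    var.input.azure_event_hubs.storage_connection: "DefaultEndpointsProtocol=https;AccountName=logstash;AccountKey=...;EndpointSuffix=core.windows.net"
    var.input.azure_event_hubs.event_hub_connections:
      - "Endpoint=sb://...EntityPath=insights-operational-logs"
```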
Here are some guidelines to help you achieve a successful deployment, and avoid data conflicts that can cause lost events.
Avoid overwriting offsets with multiple Event Hubs. The offsets (position) of the Event Hubs are stored in the configured Azure Blob store. The Azure Blob store uses paths like a file system to store the offsets. If the paths between multiple Event Hubs overlap, then the offsets may be stored incorrectly. To avoid duplicate file paths, use the advanced configuration model and make sure that at least one of these options is different per Event Hub:

- storage_connection
- storage_container
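For example, a sketch of an advanced event_hubs table (container names are illustrative) where two same-named Event Hubs get distinct storage containers:

```yaml
var.input.azure_event_hubs.event_hubs:
  - ["name", "storage_container", "event_hub_connection"]
  - ["insights-operational-logs", "activity-logs-sub1", "Endpoint=sb://...EntityPath=insights-operational-logs"]
  - ["insights-operational-logs", "activity-logs-sub2", "Endpoint=sb://...EntityPath=insights-operational-logs"]
```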
Set the number of threads correctly. The number of threads should equal the number of Event Hubs plus one (or more). Each Event Hub needs at least one thread. An additional thread is needed to help coordinate the other threads. The number of threads should not exceed the number of Event Hubs multiplied by the number of partitions per Event Hub, plus one. Threads are currently available only as a global setting.
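As a worked example of these bounds (the Event Hub and partition counts here are hypothetical):

```python
# Hypothetical deployment: 8 Event Hubs with 4 partitions each.
event_hubs = 8
partitions_per_hub = 4

# Lower bound: one thread per Event Hub, plus one coordinator thread.
min_threads = event_hubs + 1
# Upper bound: threads beyond the total partition count (plus the
# coordinator) add no parallelism.
max_threads = event_hubs * partitions_per_hub + 1

print(min_threads, max_threads)  # 9 33
```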
Be sure that the logstash.yml file is configured correctly.
Run this command from the Logstash directory:
bin/logstash --modules azure --setup
The --modules azure option starts a Logstash pipeline for ingestion from Azure Event Hubs. The --setup option creates an azure-* index pattern in Elasticsearch and imports Kibana dashboards and visualizations.
When the Logstash Azure module starts receiving events, you can begin using the packaged Kibana dashboards to explore and visualize your data.
To explore your data, open the Azure dashboards in Kibana.
All Event Hubs options are common to both basic and advanced configurations, with the following exceptions. The basic configuration uses event_hub_connections to support multiple connections. The advanced configuration uses event_hubs and event_hub_connection (singular).
event_hubs

Defines the per Event Hub configuration for the advanced configuration model.
event_hub_connection

The advanced configuration uses event_hub_connection instead of event_hub_connections. The event_hub_connection option is defined per Event Hub.
event_hub_connections

List of connection strings that identify the Event Hubs to be read. Connection strings include the EntityPath for the Event Hub.
checkpoint_interval

Default: 5 seconds. Set to 0 to disable.

Interval in seconds to write checkpoints during batch processing. Checkpoints tell Logstash where to resume processing after a restart. Checkpoints are automatically written at the end of each batch, regardless of this setting.

Writing checkpoints too frequently can slow down processing unnecessarily.
consumer_group

Default: $Default

Consumer group used to read the Event Hub(s). Create a consumer group specifically for Logstash. Then ensure that all instances of Logstash use that consumer group so that they can work together properly.
decorate_events

Default: false

Adds metadata about the Event Hub, including Event Hub name, consumer_group, processor_host, partition, offset, sequence, timestamp, and event_size.
initial_position

Valid values: beginning, end, look_back. Default: beginning.

When first reading from an Event Hub, start from this position:

- beginning reads all pre-existing events in the Event Hub.
- end does not read any pre-existing events in the Event Hub.
- look_back reads end minus a number of seconds' worth of pre-existing events. You control the number of seconds using the initial_position_look_back option.

If storage_connection is set, the initial_position value is used only the first time Logstash reads from the Event Hub.
initial_position_look_back

Default: 86400. Used only if initial_position is set to look_back.

Number of seconds to look back to find the initial position for pre-existing events. This option is used only if initial_position is set to look_back. If storage_connection is set, this configuration applies only the first time Logstash reads from the Event Hub.
max_batch_size

Default: 125

Maximum number of events retrieved and processed together. A checkpoint is created after each batch. Increasing this value may help with performance, but requires more memory.
storage_connection

Connection string for blob account storage. Blob account storage persists the offsets between restarts, and ensures that multiple instances of Logstash process different partitions. When this value is set, restarts resume where processing left off. When this value is not set, the initial_position value is used on every restart.

We strongly recommend that you define this value for production environments.
storage_container

Name of the storage container used to persist offsets and allow multiple instances of Logstash to work together.

To avoid overwriting offsets, you can use different storage containers. This is particularly important if you are monitoring two Event Hubs with the same name. You can use the advanced configuration model to configure different storage containers.
threads

Minimum: 2. Default: 4.

Total number of threads used to process events. The value you set here applies to all Event Hubs. Even with advanced configuration, this value is a global setting, and can’t be set per Event Hub.

The number of threads should be the number of Event Hubs plus one or more. See Best practices for more information.
Common options
The following configuration options are supported by all modules:
var.elasticsearch.hosts
Sets the host(s) of the Elasticsearch cluster. For each host, you must specify the hostname and port. For example, "myhost:9200". If given an array, Logstash will load balance requests across the hosts specified in the hosts parameter. It is important to exclude dedicated master nodes from the hosts list to prevent Logstash from sending bulk requests to the master nodes. So this parameter should only reference either data or client nodes in Elasticsearch.
Any special characters present in the URLs here MUST be URL escaped! This means #should be put in as %23 for instance.
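For example, a password containing # can be escaped with Python's standard library before it is embedded in a host URL (the credentials here are hypothetical):

```python
from urllib.parse import quote

password = "p#ssword"  # hypothetical password containing '#'
escaped = quote(password, safe="")  # percent-encode every reserved character
host = f"https://elastic:{escaped}@myhost:9200"

print(escaped)  # p%23ssword
```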
var.elasticsearch.username
The username to authenticate to a secure Elasticsearch cluster.
var.elasticsearch.password
The password to authenticate to a secure Elasticsearch cluster.
var.elasticsearch.ssl.enabled
Enable SSL/TLS secured communication to the Elasticsearch cluster. Leaving this unspecified will use whatever scheme is specified in the URLs listed in hosts. If no explicit protocol is specified, plain HTTP will be used. If SSL is explicitly disabled here, the plugin will refuse to start if an HTTPS URL is given in hosts.
var.elasticsearch.ssl.verification_mode
The hostname verification setting when communicating with Elasticsearch. Set to disable to turn off hostname verification. Disabling this has serious security concerns.
var.elasticsearch.ssl.certificate_authority
The path to an X.509 certificate to use to validate SSL certificates when communicating with Elasticsearch.
var.elasticsearch.ssl.certificate
The path to an X.509 certificate to use for client authentication when communicating with Elasticsearch.
var.elasticsearch.ssl.key
The path to the certificate key for client authentication when communicating with Elasticsearch.
var.kibana.host
Sets the hostname and port of the Kibana instance to use for importing dashboards and visualizations. For example: "myhost:5601".
var.kibana.scheme
Sets the protocol to use for reaching the Kibana instance. The options are:"http" or "https". The default is "http".
var.kibana.username
The username to authenticate to a secured Kibana instance.
var.kibana.password
The password to authenticate to a secure Kibana instance.
var.kibana.ssl.enabled
Enable SSL/TLS secured communication to the Kibana instance.
var.kibana.ssl.verification_mode
The hostname verification setting when communicating with Kibana. Set to disable to turn off hostname verification. Disabling this has serious security concerns.
var.kibana.ssl.certificate_authority
The path to an X.509 certificate to use to validate SSL certificates when communicating with Kibana.
var.kibana.ssl.certificate
The path to an X.509 certificate to use for client authentication when communicating with Kibana.
var.kibana.ssl.key
The path to the certificate key for client authentication when communicating with Kibana.
This module reads data from the Azure Event Hub and adds some additional structure to the data for Activity Logs and SQL Diagnostics. The original data is always preserved, and any data added or parsed is namespaced under azure. For example, azure.subscription may have been parsed from a longer, more complex URN.
| Name | Description | Notes |
|---|---|---|
| azure.subscription | Azure subscription from which this data originates. | Some Activity Log events may not be associated with a subscription. |
| azure.group | Primary type of data. | Current values are either activity_log or sql_diagnostics. |
| azure.category* | Secondary type of data specific to the group from which the data originated. | |
| azure.provider | Azure provider. | |
| azure.resource_group | Azure resource group. | |
| azure.resource_type | Azure resource type. | |
| azure.resource_name | Azure resource name. | |
| azure.database | Azure database name, for display purposes. | SQL Diagnostics only. |
| azure.db_unique_id | Azure database name that is guaranteed to be unique. | SQL Diagnostics only. |
| azure.server | Azure server for the database. | SQL Diagnostics only. |
| azure.server_and_database | Azure server and database combined. | SQL Diagnostics only. |
Notes:

- Microsoft documents the Activity log schema here.
- The SQL Diagnostics data is documented here.
- Elastic does not own these data models, and as such, cannot make any assurances of information accuracy or passivity.
Many of the logs contain a properties top-level field. This is often where the most interesting data lives. There is not a fixed schema between log types for properties fields coming from different sources.

For example, one log may have properties.type where one log sets this as a String type and another sets this as an Integer type. To avoid mapping errors, the original properties field is moved to <azure.group>_<azure_category>_properties.<original_key>. For example, properties.type may end up as sql_diagnostics_Errors_properties.type or activity_log_Security_properties.type, depending on the group/category where the event originated.
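The move described above can be sketched as follows; this is an illustration of the renaming scheme, not the module's actual implementation:

```python
def namespace_properties(event: dict) -> dict:
    """Move the top-level 'properties' field under a group/category-specific
    key, so that differently typed 'properties' fields from different sources
    do not collide in a single Elasticsearch mapping."""
    out = dict(event)
    props = out.pop("properties", None)
    if props is not None:
        group = out["azure"]["group"]        # e.g. "sql_diagnostics"
        category = out["azure"]["category"]  # e.g. "Errors"
        out[f"{group}_{category}_properties"] = props
    return out

# A toy event with an Integer-typed properties.type:
event = {"azure": {"group": "sql_diagnostics", "category": "Errors"},
         "properties": {"type": 42}}
renamed = namespace_properties(event)
print(sorted(renamed))  # ['azure', 'sql_diagnostics_Errors_properties']
```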
Use security best practices to secure your configuration. See Securing the Elastic Stack for details and recommendations.
Microsoft is the best source for the most up-to-date Azure information.