Dead Letter Queues

Note

The dead letter queue feature is currently supported for the elasticsearch output only. Additionally, the dead letter queue is only used when the response code is either 400 or 404, both of which indicate an event that cannot be retried. Support for additional outputs will be available in future releases of the Logstash plugins. Before configuring Logstash to use this feature, refer to the output plugin documentation to verify that the plugin supports the dead letter queue feature.

By default, when Logstash encounters an event that it cannot process because the data contains a mapping error or some other issue, the Logstash pipeline either hangs or drops the unsuccessful event. In order to protect against data loss in this situation, you can configure Logstash to write unsuccessful events to a dead letter queue instead of dropping them.

Each event written to the dead letter queue includes the original event, along with metadata that describes the reason the event could not be processed, information about the plugin that wrote the event, and the timestamp for when the event entered the dead letter queue.

To process events in the dead letter queue, you create a Logstash pipeline configuration that uses the dead_letter_queue input plugin to read from the queue.

[Diagram: a Logstash pipeline reading events from the dead letter queue]

See Processing Events in the Dead Letter Queue for more information.

Configuring Logstash to Use Dead Letter Queues

Dead letter queues are disabled by default. To enable dead letter queues, set the dead_letter_queue.enable option in the logstash.yml settings file:

dead_letter_queue.enable: true

Dead letter queues are stored as files in the local directory of the Logstash instance. By default, the dead letter queue files are stored in path.data/dead_letter_queue. Each pipeline has a separate queue. For example, the dead letter queue for the main pipeline is stored in LOGSTASH_HOME/data/dead_letter_queue/main by default. The queue files are numbered sequentially: 1.log, 2.log, and so on.
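As an illustration, an instance running the default main pipeline plus a second pipeline with the hypothetical ID apache, both writing to the dead letter queue, would produce a layout along these lines (paths assume the default path.data):

LOGSTASH_HOME/data/dead_letter_queue/main/1.log
LOGSTASH_HOME/data/dead_letter_queue/main/2.log
LOGSTASH_HOME/data/dead_letter_queue/apache/1.log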

You can set path.dead_letter_queue in the logstash.yml file to specify a different path for the files:

path.dead_letter_queue: "path/to/data/dead_letter_queue"
Note

You may not use the same dead_letter_queue path for two different Logstash instances.
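For example, two Logstash instances on the same host could each be given their own location in their respective logstash.yml files; the directory names below are only illustrative:

# logstash.yml for instance A
path.dead_letter_queue: "/var/lib/logstash-a/dead_letter_queue"

# logstash.yml for instance B
path.dead_letter_queue: "/var/lib/logstash-b/dead_letter_queue"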

File Rotation

Dead letter queues have a built-in file rotation policy that manages the file size of the queue. When the file size reaches a preconfigured threshold, a new file is created automatically.

By default, the maximum size of each dead letter queue is set to 1024mb. To change this setting, use the dead_letter_queue.max_bytes option. Entries will be dropped if they would increase the size of the dead letter queue beyond this setting.
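For instance, to allow each dead letter queue to grow to roughly 2 GB rather than the default, you could add something like the following to logstash.yml (the 2gb value is only an example):

dead_letter_queue.max_bytes: 2gb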

Processing Events in the Dead Letter Queue

When you are ready to process events in the dead letter queue, you create a pipeline that uses the dead_letter_queue input plugin to read from the dead letter queue. The pipeline configuration that you use depends, of course, on what you need to do. For example, if the dead letter queue contains events that resulted from a mapping error in Elasticsearch, you can create a pipeline that reads the "dead" events, removes the field that caused the mapping issue, and re-indexes the clean events into Elasticsearch.

The following example shows a simple pipeline that reads events from the dead letter queue and writes the events, including metadata, to standard output:

input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue"
    commit_offsets => true
    pipeline_id => "main"
  }
}
output {
  stdout {
    codec => rubydebug { metadata => true }
  }
}

path points to the top-level directory containing the dead letter queue. This directory contains a separate folder for each pipeline that writes to the dead letter queue. To find the path to this directory, look at the logstash.yml settings file. By default, Logstash creates the dead_letter_queue directory under the location used for persistent storage (path.data), for example, LOGSTASH_HOME/data/dead_letter_queue. However, if path.dead_letter_queue is set, it uses that location instead.

commit_offsets, when set to true, saves the offset. When the pipeline restarts, it will continue reading from the position where it left off rather than reprocessing all the items in the queue. You can set commit_offsets to false when you are exploring events in the dead letter queue and want to iterate over the events multiple times, as in the sketch after these descriptions.

pipeline_id is the ID of the pipeline that’s writing to the dead letter queue. The default is "main".
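As a minimal sketch of the exploratory case mentioned above, the same pipeline can be run with commit_offsets set to false so each run replays the queue from the beginning (the path shown is the same placeholder used earlier):

input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue"
    commit_offsets => false
    pipeline_id => "main"
  }
}
output {
  stdout {
    codec => rubydebug { metadata => true }
  }
}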

For another example, see Example: Processing Data That Has Mapping Errors.

When the pipeline has finished processing all the events in the dead letter queue, it will continue to run and process new events as they stream into the queue. This means that you do not need to stop your production system to handle events in the dead letter queue.

Note

Events emitted from the dead_letter_queue input plugin will not be resubmitted to the dead letter queue if they cannot be processed correctly.

Reading From a Timestamp

When you read from the dead letter queue, you might not want to process all the events in the queue, especially if there are a lot of old events in the queue. You can start processing events at a specific point in the queue by using the start_timestamp option. This option configures the pipeline to start processing events based on the timestamp of when they entered the queue:

input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue"
    start_timestamp => "2017-06-06T23:40:37"
    pipeline_id => "main"
  }
}

For this example, the pipeline starts reading all events that were delivered to the dead letter queue on or after June 6, 2017, at 23:40:37.

Example: Processing Data That Has Mapping Errors

In this example, the user attempts to index a document that includes geo_ip data, but the data cannot be processed because it contains a mapping error:

{"geoip":{"location":"home"}}

Indexing fails because the Logstash output plugin expects a geo_point object in the location field, but the value is a string. The failed event is written to the dead letter queue, along with metadata about the error that caused the failure:

{   "@metadata" => {    "dead_letter_queue" => {       "entry_time" => #<Java::OrgLogstash::Timestamp:0x5b5dacd5>,        "plugin_id" => "fb80f1925088497215b8d037e622dec5819b503e-4",      "plugin_type" => "elasticsearch",           "reason" => "Could not index event to Elasticsearch. status: 400, action: [\"index\", {:_id=>nil, :_index=>\"logstash-2017.06.22\", :_type=>\"doc\", :_routing=>nil}, 2017-06-22T01:29:29.804Z My-MacBook-Pro-2.local {\"geoip\":{\"location\":\"home\"}}], response: {\"index\"=>{\"_index\"=>\"logstash-2017.06.22\", \"_type\"=>\"doc\", \"_id\"=>\"AVzNayPze1iR9yDdI2MD\", \"status\"=>400, \"error\"=>{\"type\"=>\"mapper_parsing_exception\", \"reason\"=>\"failed to parse\", \"caused_by\"=>{\"type\"=>\"illegal_argument_exception\", \"reason\"=>\"illegal latitude value [266.30859375] for geoip.location\"}}}}"    }  },  "@timestamp" => 2017-06-22T01:29:29.804Z,    "@version" => "1",       "geoip" => {    "location" => "home"  },        "host" => "My-MacBook-Pro-2.local",     "message" => "{\"geoip\":{\"location\":\"home\"}}"}

To process the failed event, you create the following pipeline that reads from the dead letter queue and removes the mapping problem:

input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue/"
  }
}
filter {
  mutate {
    remove_field => "[geoip][location]"
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}

The dead_letter_queue input reads from the dead letter queue.

The mutate filter removes the problem field called location.

The clean event is sent to Elasticsearch, where it can be indexed because the mapping issue is resolved.