Translate filter plugin

  • Plugin version: v3.2.3
  • Released on: 2018-09-05
  • Changelog

For other versions, see the Versioned plugin docs.

Getting Help

For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.

Description

A general search and replace tool that uses a configured hash or a file to determine replacement values. Currently supported are YAML, JSON, and CSV files. Each dictionary item is a key-value pair.

You can specify dictionary entries in one of two ways:

  • The dictionary configuration item can contain a hash representing the mapping.
  • An external file (readable by logstash) may be specified in the dictionary_path configuration item.

These two methods cannot be combined; specifying both produces an error.

Operationally, for each event, the value from the field setting is tested against the dictionary. If it matches exactly (or matches a regex when the regex configuration item has been enabled), the matched value is put in the destination field; on no match, the fallback setting string is used instead.

Example:

    filter {
      translate {
        field => "[http_status]"
        destination => "[http_status_description]"
        dictionary => {
          "100" => "Continue"
          "101" => "Switching Protocols"
          "200" => "OK"
          "500" => "Server Error"
        }
        fallback => "I'm a teapot"
      }
    }

Occasionally, people find that they have a field with a variable-sized array of values or objects that need some enrichment. The iterate_on setting helps in these cases.

Alternatively, for simple string search and replacements for just a few values, you might consider using the gsub function of the mutate filter.

It is possible to provide multi-valued dictionary values. When using a YAML or JSON dictionary, you can have the value as a hash (map) or an array datatype. When using a CSV dictionary, multiple values in the translation must be extracted with another filter, e.g. Dissect or KV. Note that the fallback is a string, so on no match the fallback setting needs to be formatted so that a filter can extract the multiple values to the correct fields.
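
As a rough sketch of the CSV case, the configuration below assumes a hypothetical dictionary file whose second column packs two values separated by a pipe character. The path and the field names user_id, user_info, user_name and user_dept are illustrative only and do not come from the plugin. The fallback string uses the same delimiter so that the dissect filter can split it in the same way.

    filter {
      translate {
        field => "[user_id]"
        destination => "[user_info]"
        # Hypothetical path; each CSV line is assumed to look like: 100,Yuki|Engineering
        dictionary_path => "/etc/logstash/users.csv"
        # Same name|department shape as the dictionary values
        fallback => "Unknown|Unknown"
      }
      dissect {
        # Split the packed translation into two separate fields
        mapping => {
          "user_info" => "%{user_name}|%{user_dept}"
        }
      }
    }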

File-based dictionaries are loaded in a separate thread using a scheduler. If you set a refresh_interval of 300 seconds (5 minutes) or less, then the modified time of the file is checked before reloading. Very large dictionaries are supported (internally tested at 100 000 key/values), and we minimise the impact on throughput by having the refresh in the scheduler thread. Any ongoing modification of the dictionary file should be done using a copy/edit/rename or create/rename mechanism to prevent the refresh code from processing half-baked dictionary content.

Translate Filter Configuration Options

This plugin supports the following configuration options plus the Common Options described later.

Also see Common Options for a list of options supported by all filter plugins.

 

destination

  • Value type is string
  • Default value is "translation"

The destination field you wish to populate with the translated code. The default is a field named translation. Set this to the same value as the field setting if you want to do a substitution; in this case the filter will always succeed. This will clobber the old value of the source field!
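
A minimal sketch of an in-place substitution, assuming an event field named http_status (a name borrowed from the earlier example). Setting override => true here is an assumption based on the description of the override option: the destination field already exists, so translation would otherwise be skipped.

    filter {
      translate {
        field => "[http_status]"
        # Same field as the source: the code is replaced by its description
        destination => "[http_status]"
        # Assumption: required because the destination field already exists
        override => true
        dictionary => {
          "200" => "OK"
          "500" => "Server Error"
        }
      }
    }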

dictionary

  • Value type is hash
  • Default value is {}

The dictionary to use for translation, when specified directly in the logstash filter configuration (i.e. not loaded from the dictionary_path file).

Example:

    filter {
      translate {
        dictionary => {
          "100"         => "Continue"
          "101"         => "Switching Protocols"
          "merci"       => "thank you"
          "old version" => "new version"
        }
      }
    }

Note

It is an error to specify both dictionary and dictionary_path.

dictionary_path

  • Value type is path
  • There is no default value for this setting.

The full path of the external dictionary file. The format of the table should be standard YAML, JSON, or CSV.

Specify any integer-based keys in quotes. The value taken from the event’s field setting is converted to a string. The lookup dictionary keys must also be strings, and the quotes make the integer-based keys function as a string. For example, the YAML file should look something like this:

    "100": Continue
    "101": Switching Protocols
    merci: gracias
    old version: new version

Note

It is an error to specify both dictionary and dictionary_path.

The currently supported formats are YAML, JSON, and CSV. Format selection is based on the file extension: json for JSON, yaml or yml for YAML, and csv for CSV. The CSV format expects exactly two columns, with the first serving as the original text (lookup key), and the second column as the translation.
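
As an illustration of extension-based format selection, the sketch below points dictionary_path at a hypothetical CSV file. The path and the field names are assumptions made for this example; the file contents shown in the comments follow the two-column layout described above.

    filter {
      translate {
        field => "[http_status]"
        destination => "[http_status_description]"
        # Hypothetical file; the csv extension selects the CSV format.
        # Its contents would be two columns, key then translation:
        #   100,Continue
        #   101,Switching Protocols
        #   200,OK
        dictionary_path => "/etc/logstash/http_status.csv"
      }
    }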

exact

  • Value type is boolean
  • Default value is true

When exact => true, the translate filter will populate the destination field with the exact contents of the dictionary value. When exact => false, the filter will populate the destination field with the result of any existing destination field’s data, with the translated value substituted in-place.

For example, consider this simple translation.yml, configured to check the data field:

    foo: bar

If logstash receives an event with the data field set to foo, and exact => true, the destination field will be populated with the string bar. If exact => false, and logstash receives the same event, the destination field will also be set to bar. However, if logstash receives an event with the data field set to foofing, the destination field will be set to barfing.
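
A configuration for the exact => false case might look like the following sketch. The dictionary path and the destination field name data_translated are hypothetical; the file is assumed to be the translation.yml shown above.

    filter {
      translate {
        field => "[data]"
        destination => "[data_translated]"
        # Hypothetical location of the translation.yml shown above (foo: bar)
        dictionary_path => "/etc/logstash/translation.yml"
        # Substitute matching parts in place rather than requiring
        # the whole value to equal a dictionary key
        exact => false
      }
    }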

Set both exact => true AND regex => true if you would like to match using dictionary keys as regular expressions. A large dictionary could be expensive to match in this case.

fallback

  • Value type is string
  • There is no default value for this setting.

In case no translation occurs in the event (no matches), this setting adds a default translation string, which will always populate the destination field if the match failed.

For example, if we have configured fallback => "no match", using this dictionary:

    foo: bar

Then, if logstash received an event with the field set to foo, the destination field would be set to bar. However, if logstash received an event with the field set to nope, then the destination field would still be populated, but with the value of no match. This configuration can be dynamic and include parts of the event using the %{field} syntax.
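
A dynamic fallback could be sketched as follows. The field names data and data_translated are assumptions for illustration; the %{data} reference is filled in from the event when no dictionary key matches.

    filter {
      translate {
        field => "[data]"
        destination => "[data_translated]"
        dictionary => {
          "foo" => "bar"
        }
        # %{data} is replaced with the event's own value of the data field
        fallback => "no match for %{data}"
      }
    }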

field

  • This is a required setting.
  • Value type is string
  • There is no default value for this setting.

The name of the logstash event field containing the value to be compared for a match by the translate filter (e.g. message, host, response_code).

If this field is an array, only the first value will be used.

iterate_on

  • Value type is string
  • There is no default value for this setting.

When the value that you need to perform enrichment on is a variable-sized array, specify the field name in this setting. This setting introduces two modes: 1) when the value is an array of strings and 2) when the value is an array of objects (as in a JSON object).

In the first mode, you should have the same field name in both field and iterate_on. The result will be an array added to the field specified in the destination setting. This array will have the looked-up value (or the fallback value or nil) in the same ordinal position as each sought value.

In the second mode, specify the field that has the array of objects in iterate_on, then specify the field in each object that provides the sought value with field, and the field to write the looked-up value (or the fallback value) to with destination.

For a dictionary of:

    100,Yuki
    101,Rupert
    102,Ahmed
    103,Kwame

Example of Mode 1

    filter {
      translate {
        iterate_on => "[collaborator_ids]"
        field      => "[collaborator_ids]"
        destination => "[collaborator_names]"
        fallback => "Unknown"
      }
    }

Before

    {
      "collaborator_ids": [100,103,110,102]
    }

After

    {
      "collaborator_ids": [100,103,110,102],
      "collaborator_names": ["Yuki","Kwame","Unknown","Ahmed"]
    }

Example of Mode 2

    filter {
      translate {
        iterate_on => "[collaborators]"
        field      => "[id]"
        destination => "[name]"
        fallback => "Unknown"
      }
    }

Before

    {
      "collaborators": [
        {
          "id": 100
        },
        {
          "id": 103
        },
        {
          "id": 110
        },
        {
          "id": 101
        }
      ]
    }

After

    {
      "collaborators": [
        {
          "id": 100,
          "name": "Yuki"
        },
        {
          "id": 103,
          "name": "Kwame"
        },
        {
          "id": 110,
          "name": "Unknown"
        },
        {
          "id": 101,
          "name": "Rupert"
        }
      ]
    }

override

  • Value type is boolean
  • Default value is false

If the destination (or target) field already exists, this configuration item specifies whether the filter should skip translation (default) or overwrite the target field value with the new translation value.

refresh_interval

  • Value type is number
  • Default value is 300

When using a dictionary file, this setting will indicate how frequently (in seconds) logstash will check the dictionary file for updates. A value of zero or less will disable refresh.

regex

  • Value type is boolean
  • Default value is false

To treat dictionary keys as regular expressions, set regex => true.

Be sure to escape dictionary key strings for use with regex. Resources on regex formatting are available online.
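
As a sketch of regex keys, the configuration below groups HTTP status codes into classes. The field names and the class labels are assumptions made for this example, and character classes are used instead of backslash escapes to keep the keys easy to read.

    filter {
      translate {
        field => "[http_status]"
        destination => "[http_status_class]"
        # Treat each dictionary key as a regular expression
        regex => true
        exact => true
        dictionary => {
          "^2[0-9][0-9]$" => "Success"
          "^4[0-9][0-9]$" => "Client Error"
          "^5[0-9][0-9]$" => "Server Error"
        }
        fallback => "Other"
      }
    }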

refresh_behaviour

  • Value type is string
  • Default value is merge

When using a dictionary file, this setting indicates how the update will be executed. Setting this to merge causes the new dictionary to be merged into the old one. This means that entries present in both are updated, but entries that existed before and are not in the new dictionary remain after the merge. Setting this to replace causes the whole dictionary to be replaced with the new one (deleting all entries of the old one on update).
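
The refresh settings might be combined as in the following sketch. The dictionary path, the field names and the 60-second interval are assumptions chosen for illustration.

    filter {
      translate {
        field => "[http_status]"
        destination => "[http_status_description]"
        # Hypothetical path to a YAML dictionary
        dictionary_path => "/etc/logstash/http_status.yml"
        # Check the file for updates every 60 seconds
        refresh_interval => 60
        # Discard entries that are no longer present in the file
        refresh_behaviour => "replace"
      }
    }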

Common Options

The following configuration options are supported by all filter plugins:

add_field

  • Value type is hash
  • Default value is {}

If this filter is successful, add any arbitrary fields to this event. Field names can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      translate {
        add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
      }
    }

    # You can also add multiple fields at once:
    filter {
      translate {
        add_field => {
          "foo_%{somefield}" => "Hello world, from %{host}"
          "new_field" => "new_static_value"
        }
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would add the field foo_hello, with the value above and the %{host} piece replaced with that value from the event. The second example would also add a hardcoded field.

add_tag

  • Value type is array
  • Default value is []

If this filter is successful, add arbitrary tags to the event. Tags can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      translate {
        add_tag => [ "foo_%{somefield}" ]
      }
    }

    # You can also add multiple tags at once:
    filter {
      translate {
        add_tag => [ "foo_%{somefield}", "taggedy_tag"]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would add a tag foo_hello (and the second example would of course add a taggedy_tag tag).

enable_metric

  • Value type is boolean
  • Default value is true

Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
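
For instance, metrics collection for a single translate filter could be switched off with a fragment along these lines (a minimal sketch that leaves out the other settings):

    filter {
      translate {
        # Stop recording metrics for this plugin instance only
        enable_metric => false
      }
    }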

id

  • Value type is string
  • There is no default value for this setting.

Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, if you have 2 translate filters. Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.

    filter {
      translate {
        id => "ABC"
      }
    }

periodic_flush

  • Value type is boolean
  • Default value is false

Call the filter flush method at a regular interval. Optional.

remove_field

  • Value type is array
  • Default value is []

If this filter is successful, remove arbitrary fields from this event.

Example:

    filter {
      translate {
        remove_field => [ "foo_%{somefield}" ]
      }
    }

    # You can also remove multiple fields at once:
    filter {
      translate {
        remove_field => [ "foo_%{somefield}", "my_extraneous_field" ]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would remove the field with name foo_hello if it is present. The second example would remove an additional, non-dynamic field.

remove_tag

  • Value type is array
  • Default value is []

If this filter is successful, remove arbitrary tags from the event. Tags can be dynamic and include parts of the event using the %{field} syntax.

Example:

    filter {
      translate {
        remove_tag => [ "foo_%{somefield}" ]
      }
    }

    # You can also remove multiple tags at once:
    filter {
      translate {
        remove_tag => [ "foo_%{somefield}", "sad_unwanted_tag"]
      }
    }

If the event has field "somefield" == "hello" this filter, on success, would remove the tag foo_hello if it is present. The second example would remove a sad, unwanted tag as well.