Usefulness of an MQ layer in an ELK stack
An ELK stack is composed of Elasticsearch, Logstash, and Kibana; these three components are developed by Elastic (elastic.co) and are particularly useful for handling data: shipping, enriching, indexing, and searching it.
In computer science, the message queue paradigm is a sibling of the publish/subscribe pattern.
You can think of an MQ component as a set of mailboxes: publishers and subscribers do not interact with a message at the same time, many publishers can post messages for one subscriber, and vice versa.
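To make the mailbox analogy concrete, here is a minimal sketch in Python, using the standard library's `queue.Queue` as a stand-in for a real broker (the producer names and message counts are made up for illustration): producers drop messages into the queue and return immediately, and the subscriber reads them later, at its own pace.

```python
import queue
import threading

# A stand-in for a real broker (Redis list, Kafka topic, RabbitMQ queue):
# producers and the consumer never talk to each other directly.
mailbox = queue.Queue()

def producer(name, count):
    # Each producer posts its messages and returns immediately,
    # without waiting for any consumer to read them.
    for i in range(count):
        mailbox.put(f"{name}: message {i}")

# Two producers post messages for a single subscriber.
threads = [threading.Thread(target=producer, args=(n, 3))
           for n in ("web-01", "web-02")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The subscriber drains the mailbox afterwards, at its own pace.
received = []
while not mailbox.empty():
    received.append(mailbox.get())

print(len(received))  # 6: two producers, three messages each, one consumer
```

The point of the sketch is the decoupling: neither side blocks on the other, which is exactly what the MQ layer brings to the ELK pipeline below.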
➡ Redis, Kafka, or RabbitMQ can be used as a buffer in the ELK stack.
- Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.
- RabbitMQ is open source message broker software (sometimes called message-oriented middleware) that implements the Advanced Message Queuing Protocol (AMQP). The RabbitMQ server is written in the Erlang programming language and is built on the Open Telecom Platform framework for clustering and failover.
- Apache Kafka is a robust open-source, scalable pub/sub message queue system originally developed by LinkedIn.
When it comes to the Elastic Stack ecosystem, there’s no “one and only one best approach”, it always depends on your precise use case and requirements. You need to ask yourself what is important to you or to your end-users and then design your solution accordingly.
To give you a quick answer, just remember that your MQ cluster will protect your ELK stack against event spikes!
Data spikes are very common in production environments. An example?
You're indexing your web server (Apache) logs in real time; your company becomes the target of an IP flood (a type of denial-of-service attack); the logs grow exponentially, and Logstash (your filtering element) will soon be overloaded. With an MQ system you'll prevent your logging chain from failing (at least for a while, depending on your MQ cluster setup).
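A toy sketch of how the buffer absorbs such a spike, again with a Python in-memory queue standing in for the broker (the capacity, burst size, and batch size are arbitrary numbers for illustration):

```python
import queue

# A bounded buffer standing in for the MQ layer (Kafka/Redis/RabbitMQ).
# Its capacity is what buys you time during a spike.
buffer = queue.Queue(maxsize=1000)

# Burst: log lines flood in much faster than the filtering tier can handle.
burst = ['10.0.0.1 - - "GET / HTTP/1.1" 200' for _ in range(500)]
for line in burst:
    buffer.put_nowait(line)  # the shipper persists immediately, nothing is lost

# The filtering tier drains at its own throttled speed,
# e.g. 100 events per cycle, instead of being overwhelmed.
indexed = []
while not buffer.empty():
    batch = [buffer.get_nowait() for _ in range(min(100, buffer.qsize()))]
    indexed.extend(batch)  # filter + index each batch downstream

print(len(indexed))  # all 500 spiked events survive; none were dropped
```

In a real deployment, of course, a Kafka topic is persisted to disk and replicated, so the buffer survives a lot longer than an in-memory queue would.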
In an ELK architecture, processing is typically split into different stages: shipping, filtering (dropping and enriching data), and indexing.
A Logstash (or Beats, or any other shipper) instance that receives data from different data sources is responsible for immediately persisting it to a Kafka topic, a Redis key, or a RabbitMQ queue; it is a producer. On the other side, a (filtering) Logstash instance consumes the data at its own throttled speed, performs transformations (enriching the data), and indexes it into Elasticsearch.
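With Kafka as the broker, that split could look like the following two Logstash pipeline fragments (the topic name `weblogs`, the host names, and the file path are assumptions for this sketch, not values from the original setup):

```conf
# Shipper side (producer): read Apache logs and persist them to Kafka.
input  { file { path => "/var/log/apache2/access.log" } }
output { kafka { bootstrap_servers => "kafka:9092" topic_id => "weblogs" } }

# Filtering side (consumer): read from Kafka at its own pace,
# enrich the events, and index them into Elasticsearch.
input  { kafka { bootstrap_servers => "kafka:9092" topics => ["weblogs"] } }
filter { grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } }
output { elasticsearch { hosts => ["http://elasticsearch:9200"] } }
```

Each fragment runs in its own Logstash instance, so the consumer side can lag behind during a spike without the shipper ever noticing.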
From an administrator's point of view, it can also make your life easier when applying release upgrades. As you may know, Elastic publishes new versions of its components very often, and to work together efficiently they need to be aligned on the same version. Splitting your stack with an MQ layer allows you to upgrade your Elasticsearch cluster independently, then plan to upgrade the other components later, say next week, etc.
An MQ cluster makes your ELK stack stronger!
But of course, don't forget that an MQ cluster is yet another piece of software you need to tend to in your production environment (monitoring it, keeping it up to date). You also need good knowledge of your MQ software to be able to configure it properly, secure it, and troubleshoot it when something bad happens.
See you folks, and let's start indexing stuff with ELK ッ