Demystifying the ELK Stack
Let’s assume that your system consists of a few microservices. Everything must be highly available, so each microservice has at least two active instances on separate machines, and all of that is multiplied by the number of testing and production environments. When a situation calls for log analysis, you have to hop from server to server looking for the file with the desired information. You browse each file in some kind of notepad-style editor, and if the files weigh hundreds of megabytes, that is quite a challenge. If this sounds like your current job, you should definitely adopt the ELK stack.
ELK stands for ElasticSearch-Logstash-Kibana, and it’s a set of services that helps improve productivity in the area of logging, covering the collection, processing, storage, and presentation of log data. Because of its modular nature, even for a basic solution you have to install and configure four applications: Filebeat, Logstash, ElasticSearch, and Kibana. At first, it may sound overwhelming. The first time I tried to do it, it took me a few days; in my latest project (the fourth time), I was able to make it work in three hours. My application is built with the ASP.NET framework, hosted on IIS, and logs with log4net, but that doesn’t really matter, because ELK can collect, process, store, and present logs that come in any format and from any source. The purpose of this blog post is to summarize my knowledge about setting up the ELK stack. I hope you find this note valuable and that it saves you some time. I’m going to show you, step by step, how to implement ELK in your existing project without making any changes to your code base.
ELK Stack quick overview
- Filebeat - responsible for collecting logs from files and forwarding them to Logstash
- Logstash - parses and transforms log data that comes from different sources in different formats
- ElasticSearch - storage for log data
- Kibana - web application that presents log data (searching and visualization)
Setting up ELK Stack server
First, you need a Linux server that will be responsible for processing and storing log data. You have to install ElasticSearch, Logstash, and Kibana there. You can do it all manually, or you can save a lot of time and use preconfigured Docker images. On the Docker path we have two options:
- Single image (https://elk-docker.readthedocs.io/) - all services are packed into a single image. This is not an official release, but the documentation is excellent.
- Docker Compose (https://github.com/deviantony/docker-elk) - every service comes as a separate official Docker image, combined together with a Docker Compose file.
Because both solutions are very well documented, there is no need to duplicate their instructions here.
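Just to give a sense of the effort involved, the Docker Compose route boils down to roughly the following (assuming Docker and Docker Compose are already installed on the server; the repository’s README remains the authoritative reference):

git clone https://github.com/deviantony/docker-elk.git
cd docker-elk
docker-compose up -d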
Collecting logs with Filebeat
Filebeat is responsible for collecting log data from files and sending it to Logstash (it watches designated files for changes and sends new entries forward). Thanks to this tool, you can add the ELK stack to your existing project without making any changes to your code base. In order to install Filebeat, download the appropriate archive with binaries from https://www.elastic.co/downloads/beats/filebeat and extract it on the server where your logs are stored. I don’t recommend choosing the c:\Program Files\ or c:\Program Files (x86)\ paths, because User Account Control makes it hard to update the configuration file there. After extracting the archive, open a PowerShell console, go to the directory with the Filebeat binaries, and execute the following script:
./install-service-filebeat.ps1
This should install Filebeat as a Windows service. Use Get-Service filebeat to verify the current status of the Filebeat service. In the next step, you have to configure Filebeat to harvest the log data produced by your application. The harvesting configuration is located in the filebeat.yml file, and a minimal configuration that works for me looks as follows:
filebeat.prospectors:
- input_type: log
  paths:
    - c:\inetpub\wwwroot\MyApp\logs\
  scan_frequency: 10
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
    app_env: test
    app_name: client
    type: web

output.logstash:
  hosts: ["10.0.2.12:5044"]
  bulk_max_size: 1024
To make it work with your log data, you should modify the following options:
- paths - should point to the location where your app produces its log files. Directory paths are accepted as well as concrete files (wildcards are accepted too).
- multiline.pattern - a regex pattern that matches the beginning of a new log entry inside the log file. In my case, I expect a line that starts with a date in the format yyyy-MM-dd.
- fields - a set of additional attributes that will be added to each log entry. I use them later to build the ElasticSearch index name and to identify the source of the logs.
- output.logstash.hosts - the IP address and port where Logstash is installed and listening.
The Filebeat configuration is in the YAML format, which is sensitive to whitespace. I used Visual Studio Code with a YAML plugin to avoid potential problems caused by invalid indentation.
After updating the Filebeat configuration, restart the service using the Restart-Service filebeat PowerShell command. If you are not sure that Filebeat is working as expected, stop the Filebeat service with Stop-Service filebeat and run it in debug mode using the command filebeat -e -d "publish", which prints all events to the console. Here you can read more about Filebeat debugging.
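If Filebeat runs but nothing arrives on the ELK side, it is also worth checking basic connectivity to Logstash from the application server; a quick sanity check in PowerShell, assuming the Logstash host and port from the configuration above (10.0.2.12:5044):

# Verify that the Logstash beats port is reachable from the machine running Filebeat
Test-NetConnection -ComputerName 10.0.2.12 -Port 5044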
Until you install ELK in the production environment, you can set up Filebeat to watch one extra directory where you drop log files shipped from production. With this simple trick you get the benefits immediately; a sketch of such an extra prospector is shown below.
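This is a minimal sketch of that extra prospector, appended to the filebeat.prospectors list shown earlier (the drop-folder path is a made-up placeholder, and the fields are adjusted so production entries can be told apart):

- input_type: log
  paths:
    - c:\elk\prod-log-drop\*.log   # hypothetical folder where production log files are dropped
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
    app_env: prod
    app_name: client
    type: web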
Processing logs with Logstash
Another piece of our logging stack is Logstash. This service is responsible for processing log entries. Its configuration consists of three parts:
- input
- filter
- output
In the input section, we have to configure a plugin that allows us to receive data from Filebeat. The filter section is responsible for parsing and transforming log entries. The output section sets up a plugin that sends the structured logs to the target storage (ElasticSearch in our case).
In order to parse logs, you have to use the Grok filter. Grok is a DSL that can be described as regular expressions on steroids. It allows using standard regexp syntax as well as predefined patterns (there is even an option to create your own patterns). A list of default patterns is available here. A pattern that handles multiline entries should start with (?m). A sample multiline pattern can look as follows:
(?m)%{TIMESTAMP_ISO8601:timestamp}~~\[%{NUMBER:thread}\]~~\[%{USERNAME:user}\]~~\[%{DATA:ipAddress}\]~~\[%{DATA:requestUrl}\]~~\[%{DATA:requestId}\]~~%{DATA:level}~~%{DATA:logger}~~%{DATA:message}~~%{GREEDYDATA:exception}\|\|
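For illustration, a made-up log entry that this pattern would match could look like this (fields separated by ~~ and the entry terminated by ||):

2018-03-01 10:15:30,123~~[42]~~[jdoe]~~[10.0.2.15]~~[/api/orders]~~[8f3c2a]~~INFO~~MyApp.OrderService~~Order created~~||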
To test your Grok pattern you can use the following online Grok debuggers:
- http://grokdebug.herokuapp.com/ - works with multiline patterns but handles only a single entry
- http://grokconstructor.appspot.com/do/match - works with multiple log entries but unfortunately doesn’t accept (?m) at the beginning (the multiline switch can be used for subpatterns; check out this example)
A Grok debugger is also a part of Kibana X-Pack (Grok debugger in X-Pack).
A sample Logstash configuration with the input listening for Filebeat and the output set to ElasticSearch:
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "(?m)^%{TIMESTAMP_ISO8601:timestamp}~~\[%{DATA:thread}\]~~\[%{DATA:user}\]~~\[%{DATA:requestId}\]~~\[%{DATA:userHost}\]~~\[%{DATA:requestUrl}\]~~%{DATA:level}~~%{DATA:logger}~~%{DATA:logmessage}~~%{DATA:exception}\|\|" }
    add_field => {
      "received_at" => "%{@timestamp}"
      "received_from" => "%{host}"
    }
    remove_field => ["message"]
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss:SSS" ]
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
    manage_template => false
    index => "%{app_name}_%{app_env}_%{type}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
Please note that besides the grok filter, I’ve also used the date filter to set the date type for the field containing our timestamp (thanks to that, Kibana will be able to use it for the time filter).
Save your Logstash config in a MyApp.conf file and put it under the /etc/logstash/conf.d path (if you are using Docker, copy it to the directory that is mapped to this volume). To copy files between Windows and Linux machines, I use WinSCP.
After updating the Logstash configuration, you have to restart the service with the command systemctl restart logstash. If there is a problem with restarting Logstash, you can check its logs in the /var/log/logstash directory.
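It can also be worth validating the pipeline syntax before restarting; a quick check (assuming a standard package installation of Logstash; the paths will differ inside Docker):

/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/MyApp.conf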
It’s a good practice to keep ELK config files (Filebeat and Logstash) under version control.
Presenting logs with Kibana
The last thing left to do is to configure the log presentation in Kibana. First, we have to configure an index pattern. Open Kibana in a web browser (type your ELK server address with port 5601) and go to Management -> Index Patterns -> Create Index Pattern. In step 1, provide your index name with the date replaced by a wildcard (this is the value defined in the Logstash configuration under output.elasticsearch.index). In step 2, select the @timestamp field as the Time Filter field name. After successfully creating the index pattern, you can go to the Discover tab and start querying your new index. Note that you need to get some data into ElasticSearch before the index pattern can be created; if there is no index matching your pattern, make sure that Filebeat and Logstash are working correctly.
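In that case it can also help to ask ElasticSearch directly whether any data has arrived at all; a quick check with PowerShell (using the same _cat API as in the Maintenance section below; replace the host with your ELK server address):

# Lists all indices together with their document counts and sizes
Invoke-RestMethod -Method Get -Uri "http://your-elk.domain.com:9200/_cat/indices?v"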
Brief overview of Kibana Discovery
- Selected index
- List of available fields from your log entries
- List of selected fields
- Create a filter using the visual editor
- Create a filter using the Lucene query syntax (e.g. level:ERROR AND app_name:client)
- Switch Kibana Discover into Auto-Refresh mode (live monitoring of your logs)
- Time filter - this is very important; it defines the time period of the displayed log data and is set to the last 15 minutes by default! (If you don’t see any log entries, you probably have an inappropriate time filter.)
- View a single log entry. You can copy a link to this specific log entry
- Create a link to the current filter results (always remember the time filter; it’s best to switch it to absolute mode)
- Manage your filter configuration (you can save or load a predefined Discover configuration)
Maintenance
ElasticSearch requires a certain amount of free disk space. If this limit is exceeded, ElasticSearch stops working and you get an error screen in Kibana.
To prevent this situation, you have to regularly remove old indices (be careful not to drop the .kibana index). You can manage existing indices through the Dev Tools module in Kibana (a simple REST client).
Unfortunately, when the ElasticSearch plugin status is red, Kibana goes down too and the Dev Tools tab is not available. In this situation, you have to interact with the ElasticSearch API directly, using a REST client such as Postman or even the PowerShell Invoke-RestMethod cmdlet.
Getting a list of all indices:
Invoke-RestMethod -Method Get -Uri http://your-elk.domain.com:9200/_cat/indices
Deleting indices that match a given pattern:
Invoke-RestMethod -Method Delete -Uri http://your-elk.domain.com:9200/my_index_pattern-*
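If you prefer to automate this housekeeping, here is a rough sketch in PowerShell (an illustration rather than a hardened script: it assumes the yyyy.MM.dd suffix produced by the index setting in the Logstash configuration above and a 30-day retention period):

$elkUri = "http://your-elk.domain.com:9200"
$cutoff = (Get-Date).AddDays(-30)

# format=json makes the _cat output easy to work with in PowerShell
$indices = Invoke-RestMethod -Method Get -Uri "$elkUri/_cat/indices?format=json"

foreach ($entry in $indices) {
    $name = $entry.index
    # Never touch internal indices such as .kibana
    if ($name.StartsWith(".")) { continue }
    # Expect names ending with a yyyy.MM.dd date suffix, e.g. client_test_web-2018.03.01
    if ($name -match '\d{4}\.\d{2}\.\d{2}$') {
        $indexDate = [datetime]::ParseExact($Matches[0], 'yyyy.MM.dd', $null)
        if ($indexDate -lt $cutoff) {
            Invoke-RestMethod -Method Delete -Uri "$elkUri/$name"
        }
    }
}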
To bring Kibana and ElasticSearch back to life after removing the redundant indices, you have to restart both services:
/etc/init.d# ./elasticsearch restart
/etc/init.d# ./kibana restart
The key to success
The number-one rule for a successful implementation of ELK in your project is:
Make sure that everyone in your team knows that you have ELK, where it's accessible and HOW TO USE IT.
This sounds very obvious, but I’ve met a team where somebody devoted a lot of time to configuring ELK, yet nobody benefited from it because he forgot to teach his colleagues how to use it. So after you get ELK running in your team, hold a meeting and show everybody how to access and use it. I can guarantee that once they know how to use Kibana effectively, they will quickly become addicted and never want to go back to manually searching through log files.