Tips and Tricks for managing ELK configuration
A few months ago I published the “Demystifying ELK stack” article, which summarizes my knowledge about setting up and configuring a system for collecting, processing and presenting logs, based on Filebeat, Logstash, Kibana, and Elasticsearch. Since then I’ve learned a few new DevOps things which help me and my teammates work more effectively with ELK. I think they’re worth sharing.
Shrinking Filebeat configuration 🔗︎
I use Filebeat to collect data from log files and send them to Logstash for further processing and analysis. Filebeat configuration is in YAML format, and its most important part is the filebeat.prospectors section, which is responsible for configuring data harvesting. A sample configuration looks as follows:
filebeat.prospectors:
- input_type: log
  paths:
    - c:\inetpub\wwwroot\MyApp\logs\Client.Web.log
  scan_frequency: 10
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
    app_env: test
    app_name: client
    type: web
Currently, I have 10 internal environments for different purposes (manual testing, automated UI testing, load testing, presentations for different customers, etc.). Each environment consists of two web applications, and each of them produces 2 log files (diagnostic and performance data). That gives me 10 * 2 * 2 = 40 log files, and for every single one of them I have to configure a separate prospector. I can’t use a single prospector with a wildcard in the path attribute because I need to add extra metadata, such as the environment name, app name and log type (the attributes defined in the fields node). However, some of the attributes are the same for every prospector, which causes massive configuration duplication and makes it harder to modify those common values. I was even thinking about preparing some kind of template for the prospector configuration with a custom PowerShell script that could facilitate creating a config for the entire environment. Instead of rushing to develop an in-house solution, though, I started by browsing the YAML specification and found the Merge Key Language-Independent Type, which seemed to be a solution to my problem. The !!merge feature, together with Anchors and Aliases, allows you to define and reuse keys of a mapping - simply put, it gives you a kind of variable in YAML files. In the following example, I’ve defined a common configuration with the anchor &PROSPECTOR_COMMON_OPTIONS and merge it into every prospector configuration with the << : operator.
PROSPECTOR_COMMON_OPTIONS: &PROSPECTOR_COMMON_OPTIONS
  scan_frequency: 10
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true

filebeat.prospectors:
- input_type: log
  paths:
    - C:\logs\manualtest\Client.Web.log
  << : *PROSPECTOR_COMMON_OPTIONS
  fields:
    app_env: manualtest
    app_name: client
    type: web
- input_type: log
  paths:
    - C:\logs\automatedui\Office.Web.log
  << : *PROSPECTOR_COMMON_OPTIONS
  fields:
    app_env: automatedui
    app_name: office
    type: web
Thanks to this little trick I was able to reduce the number of entries in my Filebeat configuration and improve its maintainability. A lot of people criticize YAML for being hard to read and edit - in comparison to the JSON format - but it has some lesser-known features which give it more possibilities than JSON has.
Temporary variables in Logstash configuration 🔗︎
In the “Be the first to know of the bug” article I described how we can easily integrate Logstash with Microsoft Teams to create a kind of early-warning system. In the proposed solution I used the mutate filter to create extra fields which hold additional data consumed only by the output section, for example a URL for a Kibana filter, a Jira create-issue link or a Microsoft Teams webhook. After a while, I realized that this additional data was unnecessarily stored in the Elasticsearch index and consumed a lot of space. Thankfully, the authors of Logstash foresaw the need for temporary variables and introduced Logstash Metadata. Now, instead of adding fields to events only for processing purposes, we can store them in the dedicated @metadata field:
mutate {
  add_field => {
    "[@metadata][webhookUrl]" => "https://outlook.office.com/webhook/0c744aca-7d19-4556/IncomingWebhook/ceb6ba15106147a57e14e03d662de6/86aafacf-4c13-9780-5d9063b10fb6"
  }
}
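A field stored under @metadata can be referenced in the output section like any other field, yet it never ends up in the index. A minimal sketch of the consuming side could look like the snippet below; the http output and its wiring are only an illustration of the idea, not necessarily the exact configuration from the original article:

output {
  # Post the alert to the webhook URL kept in @metadata; the field itself is not indexed
  http {
    url => "%{[@metadata][webhookUrl]}"
    http_method => "post"
    format => "json"
  }
}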
Automating the configuration update 🔗︎
Every time I changed the Logstash or Filebeat configuration, I had to log in to the appropriate server, replace the old config with the new one, restart the service and examine the service log file to check whether the whole operation had succeeded. If something failed, I needed to correct the config file and repeat the whole routine. It was a very tedious process and nobody in the team besides me knew how to do it. I even wrote the whole procedure down, but the number of steps and the need to log into a Linux server put others off. A better solution than writing manual instructions is to automate the process. The easiest part was creating the script that updates the Filebeat configuration, because it resides on a Windows server:
function Update-FilebeatConfig
{
    [CmdletBinding()]
    param(
        $ComputerName,
        $Credentials,
        $FilebeatSrcFile
    )
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials
    # Copy the new config to the Filebeat folder on the remote machine
    Copy-Item $FilebeatSrcFile -Destination "C:\Tools\Filebeat\" -ToSession $session -Force
    Invoke-Command -Session $session -ScriptBlock {
        Write-Verbose "Restarting filebeat..."
        Restart-Service filebeat
        Get-Service filebeat
        Write-Verbose "Filebeat restart finished."
    }
    Remove-PSSession -Session $session
}
Now everybody can easily update the Filebeat configuration by invoking this function as follows:
Update-FilebeatConfig -ComputerName "app.server.lan" -Credentials (Get-Credential) -FilebeatSrcFile "./filebeat.yml" -Verbose
PowerShell Core on Linux 🔗︎
However, the real challenge for me was to automate the same process for the Logstash configuration that lives on the Linux server. I started by installing PowerShell Core on my Linux server with the following commands:
sudo apt-get install libunwind8 libicu55 liblttng-ust0
wget https://github.com/PowerShell/PowerShell/releases/download/v6.1.0/powershell_6.1.0-1.ubuntu.16.04_amd64.deb
sudo dpkg -i powershell_6.1.0-1.ubuntu.16.04_amd64.deb
sudo apt-get install -f
Depending on your Linux distribution and version, you might need to use a different package of PowerShell Core. If you are working on Ubuntu, you can check your current version with the lsb_release -a command. If everything went well, you should be able to enter the PowerShell console with the pwsh command.
Besides installing PowerShell, I also needed to enable PowerShell Remoting. This can be accomplished by installing the OMI and PSRP packages:
wget https://github.com/PowerShell/psl-omi-provider/releases/download/v1.4.1-28/psrp-1.4.1-28.universal.x64.deb
wget https://github.com/Microsoft/omi/releases/download/v1.4.2-5/omi-1.4.2-5.ssl_100.ulinux.x64.deb
sudo dpkg -i omi-1.4.2-5.ssl_100.ulinux.x64.deb
sudo dpkg -i psrp-1.4.1-28.universal.x64.deb
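Before wiring this into a script, it is worth checking that remoting actually works. A minimal smoke test from a Windows workstation could look like the sketch below; the host name and credentials are placeholders and the session options mirror the ones used later in the deployment script:

# Hypothetical connectivity test against the Linux server running OMI/PSRP
$options = New-PSSessionOption -SkipCACheck -SkipRevocationCheck -SkipCNCheck
Invoke-Command -ComputerName "elk.server.lan" `
               -Credential (Get-Credential) `
               -Authentication Basic `
               -UseSSL `
               -SessionOption $options `
               -ScriptBlock { uname -a; $PSVersionTable.PSVersion }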
Despite all my concerns, the installation went pretty smoothly (I only needed to adjust the version of the package responsible for SSL) and I was able to remotely invoke commands on the Linux server from my Windows workstation with the Invoke-Command cmdlet. Then I could easily automate the process of updating the Logstash config:
$sharedContext = {
    function Watch-File {
        [CmdletBinding()]
        param (
            [string] $FileName,
            [string] $StopContentPositive,
            [string] $StopContentNegative
        )
        $ErrorActionPreference = "Stop"
        # Open the log file for shared read so Logstash can keep writing to it
        $stream = New-Object System.IO.FileStream $FileName, ([System.IO.FileMode]::Open), ([System.IO.FileAccess]::Read), ([System.IO.FileShare]::ReadWrite)
        $tries = 0
        $maxTries = 100
        try {
            try {
                # Start reading from the end of the file - only new entries are interesting
                $null = $stream.Seek(0, [IO.SeekOrigin]::End)
                $streamReader = New-Object System.IO.StreamReader $stream, ([System.Text.Encoding]::UTF8)
                do {
                    $line = $streamReader.ReadLine()
                    $tries = $tries + 1
                    if ($null -eq $line) {
                        Start-Sleep -Seconds 2
                    } else {
                        Write-Verbose $line
                    }
                    if ($line -like "*$StopContentNegative*") {
                        Write-Error "Cannot restart logstash"
                        break
                    }
                } while ((-not ($line -like "*$StopContentPositive*")) -and ($tries -lt $maxTries))
                if ($tries -eq $maxTries) {
                    Write-Error "Cannot restart logstash"
                }
            } finally {
                $streamReader.Dispose()
            }
        } finally {
            $stream.Dispose()
        }
    }
}
function Update-LogstashConfig
{
    [CmdletBinding()]
    param(
        $ComputerName,
        $Credentials,
        $LogstashSrcFile
    )
    $sessionOptions = New-PSSessionOption -SkipCACheck -SkipRevocationCheck -SkipCNCheck
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials -Authentication basic -UseSSL -SessionOption $sessionOptions
    Copy-Item $LogstashSrcFile -Destination /etc/logstash/conf.d/ -ToSession $session -Force
    Invoke-Command -Session $session -ScriptBlock {
        # Make the Watch-File helper available inside the remote session
        . ([scriptblock]::Create($using:sharedContext))
        Write-Verbose "Restarting logstash..."
        systemctl restart logstash
        Watch-File -FileName "/var/log/logstash/logstash-plain.log" -StopContentPositive "Pipelines running" -StopContentNegative "Failed to execute action"
    }
    Remove-PSSession -Session $session
}
I’ve enriched my script with the Watch-File function, which forwards Logstash logs and blocks the process until the restart is finished. Thanks to that, we have a live stream of what is going on during the restart. The Logstash config update can be performed with the command:
Update-LogstashConfig -ComputerName 'elk.server.lan' -Credentials (Get-Credential) -LogstashSrcFile "./logstash/App.conf" -Verbose
I’ve put all of the above scripts, together with the config files, in source control so everybody in the team can easily modify and deploy a new ELK configuration.
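As a finishing touch, the two functions can be combined into a single entry point so one call refreshes the whole stack. The wrapper below is only a sketch built on top of the functions above; the Deploy-ElkConfig name is hypothetical, and it assumes the same credentials work for both servers:

# Hypothetical wrapper around the two update functions shown earlier
function Deploy-ElkConfig
{
    [CmdletBinding()]
    param(
        $Credentials = (Get-Credential)
    )
    # Push the Logstash pipeline first, then the Filebeat shipper config
    Update-LogstashConfig -ComputerName "elk.server.lan" -Credentials $Credentials -LogstashSrcFile "./logstash/App.conf" -Verbose
    Update-FilebeatConfig -ComputerName "app.server.lan" -Credentials $Credentials -FilebeatSrcFile "./filebeat.yml" -Verbose
}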
TL;DR 🔗︎
Thanks to !!merge, Anchors and Aliases, I can simulate variables in YAML and create reusable parts of the Filebeat configuration. The Logstash @metadata field allows me to create variables needed only for processing logic without polluting Elasticsearch indices. With PowerShell Core I can easily manage Linux servers directly from my Windows workstation and automatically deploy the ELK configuration.