Tips and Tricks for managing ELK configuration
A few months ago I published the “Demystifying ELK stack” article, which summarizes my knowledge about setting up and configuring a system for collecting, processing and presenting logs, based on Filebeat, Logstash, Kibana, and Elasticsearch. Since then I’ve learned a few new DevOps things which help me and my teammates work more effectively with ELK. I think they’re worth sharing.
Shrinking Filebeat configuration 🔗︎
I use Filebeat to collect data from log files and send them to Logstash for further processing and analysis. Filebeat configuration is in YAML format, and its most important part is the filebeat.prospectors section, which is responsible for configuring data harvesting. A sample configuration looks as follows:
filebeat.prospectors:
- input_type: log
  paths:
    - c:\inetpub\wwwroot\MyApp\logs\Client.Web.log
  scan_frequency: 10
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
    app_env: test
    app_name: client
    type: web
Currently, I have 10 internal environments for different purposes (manual testing, automated UI testing, load testing, presentations for different customers, etc.). Each environment consists of two web applications, and each of them produces 2 log files (diagnostic and performance data). That gives me 10 * 2 * 2 = 40 log files, and for every single one of them I have to configure a separate prospector. I can’t use a single prospector with a wildcard in the path attribute because I need to add extra metadata, such as the environment name, app name and log type (the attributes defined in the fields node). However, some of the attributes are the same for every prospector, which causes massive configuration duplication and makes it harder to modify those common values. I was even thinking about preparing some kind of template for the prospector configuration with a custom PowerShell script that could facilitate creating a config for the entire environment. Instead of rushing to develop an in-house solution, though, I started by browsing the YAML specification and found the Merge Key Language-Independent Type, which seemed to be a solution to my problem. The !!merge feature, together with Anchors and Aliases, allows you to define and reuse keys of a mapping - simply put, it gives you a kind of variable in YAML files. In the following example, I’ve defined a common configuration with the anchor &PROSPECTOR_COMMON_OPTIONS and merge it into every prospector configuration with the << : operator.
PROSPECTOR_COMMON_OPTIONS: &PROSPECTOR_COMMON_OPTIONS
  scan_frequency: 10
  encoding: utf-8
  multiline.pattern: '^(\d{4}-\d{2}-\d{2}\s)'
  multiline.negate: true
  multiline.match: after
  fields_under_root: true

filebeat.prospectors:
- input_type: log
  paths:
    - C:\logs\manualtest\Client.Web.log
  << : *PROSPECTOR_COMMON_OPTIONS
  fields:
    app_env: manualtest
    app_name: client
    type: web
- input_type: log
  paths:
    - C:\logs\automatedui\Office.Web.log
  << : *PROSPECTOR_COMMON_OPTIONS
  fields:
    app_env: automatedui
    app_name: office
    type: web
Thanks to this little trick I was able to reduce the number of entries in my Filebeat configuration and improve its maintainability. A lot of people criticize YAML for being hard to read and edit - in comparison to the JSON format - but it has some lesser-known features which give it more possibilities than JSON has.
Temporary variables in Logstash configuration 🔗︎
In the “Be the first to know of the bug” article I described how we can easily integrate Logstash with Microsoft Teams to create a kind of early-warning system. In the proposed solution I used the mutate filter to create extra fields which hold additional data consumed only by the output section, for example a URL for a Kibana filter, a Jira create-issue link or a Microsoft Teams webhook. After a while, I realized that this additional data was unnecessarily stored in the Elasticsearch index and consumed a lot of space. Thankfully, the authors of Logstash foresaw the need for temporary variables and introduced Logstash Metadata. Now, instead of adding fields to events only for processing purposes, we can store them in the dedicated @metadata field:
mutate {
  add_field => {
    "[@metadata][webhookUrl]" => "https://outlook.office.com/webhook/0c744aca-7d19-4556/IncomingWebhook/ceb6ba15106147a57e14e03d662de6/86aafacf-4c13-9780-5d9063b10fb6"
  }
}
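A field stored under @metadata can be referenced in the output section like any other field, yet it never ends up in the index. A minimal sketch of the consuming side could look like the snippet below; the http output and its wiring are only an illustration of the idea, not necessarily the exact configuration from the original article:

output {
  # Post the alert to the webhook URL kept in @metadata; the field itself is not indexed
  http {
    url => "%{[@metadata][webhookUrl]}"
    http_method => "post"
    format => "json"
  }
}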
Automating the configuration update 🔗︎
Every time I changed the Logstash or Filebeat configuration, I had to log in to the appropriate server, replace the old config with the new one, restart the service and examine the service log file to check whether the whole operation had succeeded. If something failed, I needed to correct the config file and repeat the whole routine. It was a very tedious process and nobody in the team besides me knew how to do it. I even wrote the whole procedure down, but the number of steps and the need to log into a Linux server put others off. A better solution than writing manual instructions is to automate the process. The easiest part was creating the script that updates the Filebeat configuration, because it resides on a Windows server:
function Update-FilebeatConfig
{
    [CmdletBinding()]
    param(
        $ComputerName,
        $Credentials,
        $FilebeatSrcFile
    )
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials
    # Copy the new config to the Filebeat folder on the remote machine
    Copy-Item $FilebeatSrcFile -Destination "C:\Tools\Filebeat\" -ToSession $session -Force
    Invoke-Command -Session $session -ScriptBlock {
        Write-Verbose "Restarting filebeat..."
        Restart-Service filebeat
        Get-Service filebeat
        Write-Verbose "Filebeat restart finished."
    }
    Remove-PSSession -Session $session
}
Now everybody can easily update the Filebeat configuration by invoking this function as follows:
Update-FilebeatConfig -ComputerName "app.server.lan" -Credentials (Get-Credential) -FilebeatSrcFile "./filebeat.yml" -Verbose
PowerShell Core on Linux 🔗︎
However, the real challenge for me was to automate the same process for the Logstash configuration that lives on the Linux server. I started by installing PowerShell Core on my Linux server with the following commands:
sudo apt-get install libunwind8 libicu55 liblttng-ust0
wget https://github.com/PowerShell/PowerShell/releases/download/v6.1.0/powershell_6.1.0-1.ubuntu.16.04_amd64.deb
sudo dpkg -i powershell_6.1.0-1.ubuntu.16.04_amd64.deb
sudo apt-get install -f
Depending on your Linux distribution and version, you might need to use a different package of PowerShell Core. If you are working on Ubuntu, you can check your current version with the lsb_release -a command. If everything went well, you should be able to enter the PowerShell console with the pwsh command.
Besides installing PowerShell, I also needed to enable PowerShell Remoting. This can be accomplished by installing the OMI and PSRP packages:
wget https://github.com/PowerShell/psl-omi-provider/releases/download/v1.4.1-28/psrp-1.4.1-28.universal.x64.deb
wget https://github.com/Microsoft/omi/releases/download/v1.4.2-5/omi-1.4.2-5.ssl_100.ulinux.x64.deb
sudo dpkg -i omi-1.4.2-5.ssl_100.ulinux.x64.deb
sudo dpkg -i psrp-1.4.1-28.universal.x64.deb
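Before wiring this into a script, it is worth checking that remoting actually works. A minimal smoke test from a Windows workstation could look like the sketch below; the host name and credentials are placeholders and the session options mirror the ones used later in the deployment script:

# Hypothetical connectivity test against the Linux server running OMI/PSRP
$options = New-PSSessionOption -SkipCACheck -SkipRevocationCheck -SkipCNCheck
Invoke-Command -ComputerName "elk.server.lan" `
               -Credential (Get-Credential) `
               -Authentication Basic `
               -UseSSL `
               -SessionOption $options `
               -ScriptBlock { uname -a; $PSVersionTable.PSVersion }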
Despite all my concerns, the installation went pretty smoothly (I only needed to adjust the version of the package responsible for SSL) and I was able to remotely invoke commands on the Linux server from my Windows workstation with the Invoke-Command cmdlet. Then I could easily automate the process of updating the Logstash config:
$sharedContext = {
    function Watch-File {
        [CmdletBinding()]
        param (
            [string] $FileName,
            [string] $StopContentPositive,
            [string] $StopContentNegative
        )
        $ErrorActionPreference = "Stop"
        # Open the log file for shared read so Logstash can keep writing to it
        $stream = New-Object System.IO.FileStream $FileName, ([System.IO.FileMode]::Open), ([System.IO.FileAccess]::Read), ([System.IO.FileShare]::ReadWrite)
        $tries = 0
        $maxTries = 100
        try {
            try {
                # Start reading from the end of the file - only new entries are interesting
                $null = $stream.Seek(0, [IO.SeekOrigin]::End)
                $streamReader = New-Object System.IO.StreamReader $stream, ([System.Text.Encoding]::UTF8)
                do {
                    $line = $streamReader.ReadLine()
                    $tries = $tries + 1
                    if ($null -eq $line) {
                        Start-Sleep -Seconds 2
                    } else {
                        Write-Verbose $line
                    }
                    if ($line -like "*$StopContentNegative*") {
                        Write-Error "Cannot restart logstash"
                        break
                    }
                } while ((-not ($line -like "*$StopContentPositive*")) -and ($tries -lt $maxTries))
                if ($tries -eq $maxTries) {
                    Write-Error "Cannot restart logstash"
                }
            } finally {
                $streamReader.Dispose()
            }
        } finally {
            $stream.Dispose()
        }
    }
}
function Update-LogstashConfig
{
    [CmdletBinding()]
    param(
        $ComputerName,
        $Credentials,
        $LogstashSrcFile
    )
    $sessionOptions = New-PSSessionOption -SkipCACheck -SkipRevocationCheck -SkipCNCheck
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials -Authentication basic -UseSSL -SessionOption $sessionOptions
    Copy-Item $LogstashSrcFile -Destination /etc/logstash/conf.d/ -ToSession $session -Force
    Invoke-Command -Session $session -ScriptBlock {
        # Make the Watch-File helper available inside the remote session
        . ([scriptblock]::Create($using:sharedContext))
        Write-Verbose "Restarting logstash..."
        systemctl restart logstash
        Watch-File -FileName "/var/log/logstash/logstash-plain.log" -StopContentPositive "Pipelines running" -StopContentNegative "Failed to execute action"
    }
    Remove-PSSession -Session $session
}
I’ve enriched my script with the Watch-File function, which forwards Logstash logs and blocks the process until the restart is finished. Thanks to that, we have a live stream of what is going on during the restart. The Logstash config update can be performed with the command:
Update-LogstashConfig -ComputerName 'elk.server.lan' -Credentials (Get-Credential) -LogstashSrcFile "./logstash/App.conf" -Verbose
I’ve put all of the above scripts, together with the config files, in source control so everybody in the team can easily modify and deploy a new ELK configuration.
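As a finishing touch, the two functions can be combined into a single entry point so one call refreshes the whole stack. The wrapper below is only a sketch built on top of the functions above; the Deploy-ElkConfig name is hypothetical, and it assumes the same credentials work for both servers:

# Hypothetical wrapper around the two update functions shown earlier
function Deploy-ElkConfig
{
    [CmdletBinding()]
    param(
        $Credentials = (Get-Credential)
    )
    # Push the Logstash pipeline first, then the Filebeat shipper config
    Update-LogstashConfig -ComputerName "elk.server.lan" -Credentials $Credentials -LogstashSrcFile "./logstash/App.conf" -Verbose
    Update-FilebeatConfig -ComputerName "app.server.lan" -Credentials $Credentials -FilebeatSrcFile "./filebeat.yml" -Verbose
}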
TL;DR 🔗︎
Thanks to !!merge, Anchors and Aliases, I can simulate variables in YAML and create reusable parts of the Filebeat configuration. The Logstash @metadata field allows me to create variables needed only for processing logic without polluting Elasticsearch indices. With PowerShell Core I can easily manage Linux servers directly from my Windows workstation and automatically deploy the ELK configuration.