.. _tornado_components_logs:

===============================
Split Tornado logs by component
===============================

DIRAC offers the ability to write logs for each component. Logs can be found in ``<DIRAC FOLDER>/startup/<DIRAC COMPONENT>/log/current``.

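For example, assuming DIRAC is installed under ``/opt/dirac`` (a hypothetical path), the current log file of every locally installed component can be listed with::

  ls /opt/dirac/startup/*/log/current
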
In the case of Tornado, logs come from many components and can be hard to sort.

Using Fluent-bit allows collecting logs from files, rearranging their content, and then sending them elsewhere, for example to an ELK instance or simply to other files.
Thus, with ELK, it becomes possible to monitor and display information through Kibana and Grafana, using filters to sort logs; alternatively, one can simply read the split log files, one per component.

The idea behind this is to handle logs independently from DIRAC. It is also possible to collect server metrics such as CPU, memory and disk usage, giving the opportunity to correlate logs with server usage.

DIRAC Configuration
-------------------

First of all, you should configure a JSON log backend in your ``Resources`` and ``Operations`` sections, like::

  Resources
  {
    LogBackends
    {
      StdoutJson
      {
        Plugin = StdoutJson
      }
    }
  }

  Operations
  {
    Defaults
    {
      Logging
      {
        DefaultBackends = StdoutJson
      }
    }
  }

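With this backend enabled, each DIRAC log line is written to the component's ``log/current`` file as a JSON document. The record below is only an illustrative sketch with made-up values; the field names are the ones consumed by the Fluent-bit pipeline described next::

  {"asctime": "2023-10-09 12:00:00,123", "levelname": "INFO", "componentname": "Framework/Tornado", "customname": "", "message": "Listening on port", "varmessage": "8443", "tornadoComponent": "Framework/ProxyManager"}
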
Fluent-bit Installation
-----------------------

On each DIRAC server, install Fluent-bit (https://docs.fluentbit.io)::

  curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

Fluent-bit Configuration
------------------------

Edit ``/etc/fluent-bit/fluent-bit.conf`` and add::

  @INCLUDE dirac-json.conf

Create the following files in ``/etc/fluent-bit``.

``dirac-json.conf`` (add all needed components and choose the outputs you want)::

  [SERVICE]
      flush        1
      log_level    info
      parsers_file dirac-parsers.conf

  [INPUT]
      name         cpu
      tag          metric
      Interval_Sec 10

  [INPUT]
      name         mem
      tag          metric
      Interval_Sec 10

  [INPUT]
      name         disk
      tag          metric
      Interval_Sec 10

  [INPUT]
      name          tail
      parser        dirac_parser_json
      path          <DIRAC FOLDER>/startup/<DIRAC_COMPONENT>/log/current
      Tag           log.<DIRAC_COMPONENT>.log
      Mem_Buf_Limit 50MB

  [INPUT]
      name          tail
      parser        dirac_parser_json
      path          <DIRAC FOLDER>/startup/<ANOTHER_DIRAC_COMPONENT>/log/current
      Tag           log.<ANOTHER_DIRAC_COMPONENT>.log
      Mem_Buf_Limit 50MB

  [FILTER]
      Name   modify
      Match  log.*
      Rename log message
      Add    levelname DEV

  [FILTER]
      Name  modify
      Match *
      Add   hostname ${HOSTNAME}

  [FILTER]
      Name   Lua
      Match  log.*
      script dirac.lua
      call   add_raw

  [FILTER]
      Name         rewrite_tag
      Match        log.tornado
      Rule         $tornadoComponent .$ $TAG.$tornadoComponentclean.log false
      Emitter_Name re_emitted

  #[OUTPUT]
  #    name  stdout
  #    match *

  [OUTPUT]
      Name     file
      Match    log.*
      Path     /vo/dirac/logs
      Mkdir    true
      Format   template
      Template {raw}

  [OUTPUT]
      name            es
      host            <host>
      port            <port>
      logstash_format true
      logstash_prefix <index prefix>
      tls             on
      tls.verify      off
      tls.ca_file     <path_to_ca_file>
      tls.crt_file    <path_to_crt_file>
      tls.key_file    <path_to_key_file>
      match           log.*

  [OUTPUT]
      name            es
      host            <host>
      port            <port>
      logstash_format true
      logstash_prefix <index prefix>
      tls             on
      tls.verify      off
      tls.ca_file     <path_to_ca_file>
      tls.crt_file    <path_to_crt_file>
      tls.key_file    <path_to_key_file>
      match           metric

``dirac-json.conf`` is the main file; it defines the different steps:

- ``[SERVICE]``, where we declare the parsers file containing our JSON parser (matching the DIRAC JSON log backend)
- ``[INPUT]``, where we describe the DIRAC component log files and the way they will be parsed (JSON)
- ``[FILTER]``, where we apply modifications to the parsed data: for example adding a ``levelname`` of ``DEV`` whenever a log line is not well formatted (typically a ``print`` in the code), adding fields like the hostname to know which host the logs come from, or applying more complex treatments as in the ``dirac.lua`` script (described later)
- ``[OUTPUT]``, where we describe the destinations of the formatted logs; here we have stdout, files on disk and Elasticsearch (the resulting per-component file split is illustrated below)

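With the ``rewrite_tag`` filter, each Tornado record is re-emitted under a tag built from its cleaned component name: assuming the Tornado log is tailed with the tag ``log.tornado`` and a record carries the hypothetical ``tornadoComponent`` value ``Framework/ProxyManager``, the new tag becomes ``log.tornado.Framework_ProxyManager.log``. Since the ``file`` output uses the tag as the file name when no ``File`` parameter is set, logs end up split on disk, one file per component::

  /vo/dirac/logs/log.tornado.Framework_ProxyManager.log
  /vo/dirac/logs/log.tornado.WorkloadManagement_SandboxStore.log
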
``dirac-parsers.conf``::

  [PARSER]
      Name        dirac_parser_json
      Format      json
      Time_Key    asctime
      Time_Format %Y-%m-%d %H:%M:%S,%L
      Time_Keep   On

``dirac-parsers.conf`` describes the source format that will be parsed, and the field used as the time reference (here ``asctime``).

``dirac.lua``::

  -- Build a human-readable "raw" line from the parsed JSON record.
  -- Called by the Lua filter for every record matching log.*
  function add_raw(tag, timestamp, record)
      local new_record = record

      if record["asctime"] ~= nil then
          local raw = record["asctime"] .. " [" .. record["levelname"] .. "] [" .. record["componentname"] .. "] "
          if record["tornadoComponent"] ~= nil then
              -- Sanitize the component name: "/" is not wanted in tags and file names
              local patterns = {"/"}
              local str = record["tornadoComponent"]
              for i, v in ipairs(patterns) do
                  str = string.gsub(str, v, "_")
              end
              new_record["tornadoComponentclean"] = str
              raw = raw .. "[" .. record["tornadoComponent"] .. "] "
          else
              raw = raw .. "[] "
          end
          raw = raw .. "[" .. record["customname"] .. "] " .. record["message"] .. " " .. record["varmessage"] .. " [" .. record["hostname"] .. "]"
          new_record["raw"] = raw
      else
          -- Not a JSON-formatted DIRAC log line: fall back to the current time
          new_record["raw"] = os.date("%Y-%m-%d %H:%M:%S %Z") .. " [" .. record["levelname"] .. "] " .. record["message"] .. " [" .. record["hostname"] .. "]"
      end

      -- 2 tells Fluent-bit the record was modified but the timestamp was kept
      return 2, timestamp, new_record
  end

``dirac.lua`` is the most important transformation performed on the incoming logs: it builds a new ``raw`` field depending on whether the record contains the special ``tornadoComponent`` field, and cleans and formats the component name before the record is sent to the outputs.

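For the illustrative record shown earlier (with a hypothetical ``hostname`` of ``dirac01.example.org`` added by the ``modify`` filter), the resulting ``raw`` line would look like::

  2023-10-09 12:00:00,123 [INFO] [Framework/Tornado] [Framework/ProxyManager] [] Listening on port 8443 [dirac01.example.org]
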
Testing
-------

Before sending logs to Elasticsearch, the configuration can be tested on standard output by uncommenting::

  [OUTPUT]
      name  stdout
      match *

...and commenting out the Elasticsearch outputs.

Then run::

  /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf

NOTE: When everything is OK, uncomment the Elasticsearch outputs and comment out the stdout output.

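Depending on the Fluent-bit version, the configuration syntax can also be validated without processing any data, using the dry-run option::

  /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf --dry-run
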
Service
-------

Fluent-bit is managed as a regular systemd service:

``sudo systemctl start/stop fluent-bit.service``

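To start it automatically at boot and check that it is running, the standard systemd commands apply::

  sudo systemctl enable fluent-bit.service
  sudo systemctl status fluent-bit.service
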
Dashboards
----------

If logs are sent to an ELK instance, dashboards are available here: ???

On disk
-------

If logs are written to local files, Logrotate is mandatory.

For a one-week log retention, the Logrotate config file
``/etc/logrotate.d/diraclogs`` should look like::

  /vo/dirac/logs/* {
      rotate 7
      daily
      missingok
      notifempty
      compress
      delaycompress
      create 0644 diracsgm dirac
      sharedscripts
      postrotate
          /bin/kill -HUP `cat /var/run/syslogd.pid 2>/dev/null` 2>/dev/null || true
      endscript
  }

along with a crontab line like:

``0 0 * * * logrotate /etc/logrotate.d/diraclogs``