[O11y][Nginx] Rally benchmark nginx.error #8762

Merged 10 commits on Jan 16, 2024
14 changes: 14 additions & 0 deletions packages/nginx/_dev/benchmark/rally/error-benchmark.yml
@@ -0,0 +1,14 @@
---
description: Benchmark 20000 nginx.error events ingested
data_stream:
  name: error
corpora:
  generator:
    total_events: 20000
    template:
      type: gotext
      path: ./error-benchmark/template.ndjson
    config:
      path: ./error-benchmark/config.yml
    fields:
      path: ./error-benchmark/fields.yml
23 changes: 23 additions & 0 deletions packages/nginx/_dev/benchmark/rally/error-benchmark/config.yml
@@ -0,0 +1,23 @@
fields:
  - name: 'timestamp'
    period: -24h # one day
  - name: agent.id
    value: "ef5e274d-4b53-45e6-943a-a5bcf1a6f523"
  - name: host.name
    cardinality: 100
  - name: log.level
    enum: ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]
  - name: process.pid
    range:
      min: 1
      max: 100000
  - name: thread.id
    range:
      min: 1
      max: 100000
  - name: connection_id
    range:
      min: 1
      max: 100000
  - name: timezone
    enum: ["+01:00", "+02:00", "+03:00", "+03:30", "+04:00", "+04:30", "+05:30"]
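To make the constraints in config.yml easier to reason about, here is a minimal Python sketch of what `value`, `cardinality`, `enum`, and `range` could mean for generated events. It is not the corpus generator's implementation: `make_generators`, `generate_event`, the `host-N` naming, and the inclusive treatment of `min`/`max` are all assumptions made for illustration, and the `timestamp` field with its `period` setting is left out.

```python
import random

# Values copied from config.yml; everything else in this sketch is hypothetical.
LOG_LEVELS = ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]
TIMEZONES = ["+01:00", "+02:00", "+03:00", "+03:30", "+04:00", "+04:30", "+05:30"]
AGENT_ID = "ef5e274d-4b53-45e6-943a-a5bcf1a6f523"


def make_generators(seed=0):
    """Return one zero-argument callable per configured field (illustrative only)."""
    rng = random.Random(seed)
    # cardinality: 100 -> draw from a pool of 100 distinct values.
    # The "host-N" naming is made up for this sketch.
    host_pool = [f"host-{i}" for i in range(100)]

    def in_range(lo, hi):
        # range: min/max, treated here as inclusive on both ends (an assumption).
        return lambda: rng.randint(lo, hi)

    return {
        "agent.id": lambda: AGENT_ID,  # value: the same constant in every event
        "host.name": lambda: rng.choice(host_pool),
        "log.level": lambda: rng.choice(LOG_LEVELS),  # enum: pick one listed value
        "process.pid": in_range(1, 100000),
        "thread.id": in_range(1, 100000),
        "connection_id": in_range(1, 100000),
        "timezone": lambda: rng.choice(TIMEZONES),
    }


def generate_event(generators):
    """Produce one flat event by calling every field generator once."""
    return {name: gen() for name, gen in generators.items()}
```

Calling `generate_event(make_generators(seed=42))` repeatedly yields events whose `host.name` never takes more than 100 distinct values, while `agent.id` stays constant.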
34 changes: 34 additions & 0 deletions packages/nginx/_dev/benchmark/rally/error-benchmark/fields.yml
@@ -0,0 +1,34 @@
- name: agent.ephemeral_id
  type: keyword
- name: agent.ephemeral_id
  type: keyword
- name: agent.id
  type: keyword
- name: agent.name
  type: keyword
- name: agent.version
  type: keyword
- name: agent.version
  type: keyword
- name: connection_id
  type: long
- name: event.created
  type: date
- name: event.ingested
  type: date
- name: host.ip
  type: ip
- name: host.name
  type: keyword
- name: log.level
  type: keyword
- name: message
  type: text
- name: process.pid
  type: long
- name: thread.id
  type: long
- name: timestamp
  type: date
- name: timezone
  type: keyword
@@ -0,0 +1,89 @@
{{- $timestamp := generate "timestamp" }}
{{- $agentId := generate "agent.id" }}
{{- $agentVersion := generate "agent.version" }}
{{- $agentName := generate "agent.name" }}
{{- $agentEphemeralid := generate "agent.ephemeral_id" }}
{{- $logLevel := generate "log.level" }}
{{- $pid := generate "process.pid" }}
{{- $threadId := generate "thread.id" }}
{{- $connectionId := generate "connection_id" }}
{{- $eventTimezone := generate "timezone" }}
{{- $hostname := generate "host.name" }}
{
"@timestamp": "{{ $timestamp.Format "2006-01-02T15:04:05.000Z" }}",
"agent": {
"ephemeral_id": "{{ $agentEphemeralid }}",
"id": "{{ $agentId }}",
"name": "{{ $agentName }}",
"type": "filebeat",
"version": "8.8.0"
},
"data_stream": {
"dataset": "nginx.error",
"namespace": "ep",
"type": "logs"
},
"ecs": {
"version": "8.5.1"
},
"elastic_agent": {
"id": "{{ $agentEphemeralid }}",
"snapshot": false,
"version": "8.8.0"
},
"event": {
"agent_id_status": "verified",
"created": "{{ generate "event.created" | date "2006-01-02T15:04:05.000Z" }}",
"dataset": "nginx.error",
"ingested": "{{ generate "event.ingested" | date "2006-01-02T15:04:05.000Z" }}",
"timezone": "{{ $eventTimezone }}"
Contributor

How will this map to the timestamp in the message? Because there are multiple time zones, I expect events to be ingested in different hours. See also #8777 (comment)

Contributor Author

@ruflin I looked at the error logs. The log timestamp doesn't contain a timezone, so the timezone is set according to the elastic-agent's timezone. Please refer to the sample log:
2019/11/05 14:50:44 [error] 54053#0: *3 open() "/usr/local/Cellar/nginx/1.10.2_1/html/adsasd" failed (2: No such file or directory), client: 127.0.0.1, server: localhost, request: "GET /pysio HTTP/1.1", host: "localhost:8080"

For now we can set the timezone to a static value (+00:00). I also saw your comment about adding a timezone variable, which would be good to implement. Let me know if this is the right thing to do for now.

Contributor

All the timestamps generated in the log appear to be in UTC ('Z')? Perhaps a static +0000 would be best here.
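To illustrate the concern in this thread, the sketch below interprets the same timezone-less nginx error timestamp under different offsets from the `timezone` enum and converts each result to UTC. The helpers `parse_offset` and `to_utc` are hypothetical names used only for this example; they are not part of the integration or its ingest pipeline.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(offset: str) -> timezone:
    """Turn an offset string such as '+03:30' into a fixed-offset tzinfo."""
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def to_utc(log_time: str, offset: str) -> datetime:
    """Interpret a naive nginx error timestamp in the given offset, then convert to UTC."""
    naive = datetime.strptime(log_time, "%Y/%m/%d %H:%M:%S")
    return naive.replace(tzinfo=parse_offset(offset)).astimezone(timezone.utc)


if __name__ == "__main__":
    # Same log text, different assumed offsets, different resulting instants.
    for offset in ["+00:00", "+01:00", "+05:30"]:
        print(offset, to_utc("2019/11/05 14:50:44", offset).isoformat())
```

With a random offset per event, the parsed time can land hours away from the generated `@timestamp`, whereas a static `+00:00` keeps the two aligned when the generated timestamps are in UTC.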

},
"host": {
"architecture": "x86_64",
"containerized": false,
"hostname": "{{ $hostname }}",
"id": "66392b0697b84641af8006d87aeb89f1",
"ip": [
"{{ generate "host.ip" }}"
],
"mac": [
"02-42-AC-12-00-07"
],
"name": "{{ $hostname }}",
"os": {
"codename": "focal",
"family": "debian",
"kernel": "5.15.49-linuxkit",
"name": "Ubuntu",
"platform": "ubuntu",
"type": "linux",
"version": "20.04.5 LTS (Focal Fossa)"
}
},
"input": {
"type": "log"
},
"log": {
"file": {
"path": "/tmp/service_logs/error.log"
},
"level": "{{ $logLevel }}",
"offset": 0
},
"message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} {{generate "message"}}",
Contributor

While reviewing this in the UI I stumbled over the {{generate "message"}} part, as it generates a random message. I was a bit surprised by this, but it is something we can tune as a follow-up.

Contributor Author

I agree that it generates random messages that don't look realistic. I took this approach from this PR, in which the same field is generated the same way.

Contributor

We can always follow up on this one. Let's keep it as is for now.
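For reference, the `message` line in the template follows the shape `<time> [<level>] <pid>#<tid>: *<connection_id> <text>`, so only the trailing text is random. The Python sketch below splits such a line back into its parts, which can help check that generated messages stay parseable. The regular expression and the `parse_error_line` helper are assumptions written for this example, not the pattern used by the nginx integration.

```python
import re

# Hypothetical pattern for lines shaped like the template's "message" field.
ERROR_LINE = re.compile(
    r"^(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] "
    r"(?P<pid>\d+)#(?P<tid>\d+): "
    r"(?:\*(?P<connection_id>\d+) )?"  # the connection id is treated as optional
    r"(?P<message>.*)$"
)


def parse_error_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = ERROR_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric parts so they compare as integers.
    for key in ("pid", "tid", "connection_id"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields
```

Running `parse_error_line` on the sample log line quoted earlier in this review gives `pid` 54053, `tid` 0, and `connection_id` 3, with everything from `open()` onward in `message`.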

"nginx": {
"error": {
"time": "{{$timestamp.Format "2006/01/02 15:04:05"}}",
"connection_id": {{ $connectionId }}
}
},
"process": {
"pid": {{ $pid }},
"thread": {
"id": {{ $threadId }}
}
},
"tags": [
Contributor

Is this tag shipped by the agent normally?

Contributor Author

This is the user-provided value, or the data stream's default if the user leaves it unchanged. It stays the same in every event for as long as the integration keeps that configuration. The user can set it when configuring the integration.

"nginx-error"
]
}