[O11y][Nginx] Rally benchmark nginx.error #8762

Merged
merged 10 commits into elastic:main on Jan 16, 2024

Conversation

ali786XI (Contributor)

Proposed commit message

  • This PR adds benchmarking templates to the error data stream of Nginx

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.

How to test this PR locally

Run this command from the package root (a sketch of the benchmark scenario it refers to follows the command):

  • elastic-package benchmark rally --benchmark error-benchmark -v
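
For context, the error-benchmark scenario referenced by this command is defined in the package, typically under _dev/benchmark/rally/error-benchmark.yml. A minimal sketch, reconstructed from the flattened parameter names reported in the results below (the exact nesting and file location are assumptions, not copied from this PR):

  ---
  description: Benchmark 20000 nginx.error events ingested
  data_stream:
    name: error
  corpora:
    generator:
      total_events: 20000
      template:
        type: gotext
        path: ./error-benchmark/template.ndjson
      config:
        path: ./error-benchmark/config.yml
      fields:
        path: ./error-benchmark/fields.yml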

Related issues

Screenshots

--- Benchmark results for package: nginx - START ---
╭────────────────────────────────────────────────────────────────────────────────────╮
│ info                                                                               │
├────────────────────────┬───────────────────────────────────────────────────────────┤
│ benchmark              │                                           error-benchmark │
│ description            │               Benchmark 20000 nginx.error events ingested │
│ run ID                 │                      4f67f3dd-aee1-4c22-995c-917710510349 │
│ package                │                                                     nginx │
│ start ts (s)           │                                                1702990796 │
│ end ts (s)             │                                                1702990829 │
│ duration               │                                                       33s │
│ generated corpora file │ /root/.elastic-package/tmp/rally_corpus/corpus-4152939849 │
╰────────────────────────┴───────────────────────────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────╮
│ parameters                                                          │
├─────────────────────────────────┬───────────────────────────────────┤
│ package version                 │                            1.17.0 │
│ data_stream.name                │                             error │
│ corpora.generator.total_events  │                             20000 │
│ corpora.generator.template.path │ ./error-benchmark/template.ndjson │
│ corpora.generator.template.raw  │                                   │
│ corpora.generator.template.type │                            gotext │
│ corpora.generator.config.path   │      ./error-benchmark/config.yml │
│ corpora.generator.config.raw    │                             map[] │
│ corpora.generator.fields.path   │      ./error-benchmark/fields.yml │
│ corpora.generator.fields.raw    │                             map[] │
╰─────────────────────────────────┴───────────────────────────────────╯
╭───────────────────────╮
│ cluster info          │
├───────┬───────────────┤
│ name  │ elasticsearch │
│ nodes │             1 │
╰───────┴───────────────╯
╭──────────────────────────────────────────────────╮
│ data stream stats                                │
├────────────────────────────┬─────────────────────┤
│ data stream                │ logs-nginx.error-ep │
│ approx total docs ingested │               20000 │
│ backing indices            │                   1 │
│ store size bytes           │             7262099 │
│ maximum ts (ms)            │       1703006970000 │
╰────────────────────────────┴─────────────────────╯
╭───────────────────────────────────────╮
│ disk usage for index .ds-logs-nginx.e │
│ rror-ep-2023.12.19-000001 (for all fi │
│ elds)                                 │
├──────────────────────────────┬────────┤
│ total                        │ 5.5 MB │
│ inverted_index.total         │ 1.4 MB │
│ inverted_index.stored_fields │ 2.7 MB │
│ inverted_index.doc_values    │ 966 kB │
│ inverted_index.points        │ 463 kB │
│ inverted_index.norms         │    0 B │
│ inverted_index.term_vectors  │    0 B │
│ inverted_index.knn_vectors   │    0 B │
╰──────────────────────────────┴────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────╮
│ pipeline logs-nginx.error-1.17.0 stats in node UhUgzz1BQ5OJkbI7NZbkyw                   │
├────────────────────────────────────────────────┬────────────────────────────────────────┤
│ Totals                                         │ Count: 20000 | Failed: 0 | Time: 587ms │
│ pipeline (logs-nginx.error-1.17.0-third-party) │        Count: 0 | Failed: 0 | Time: 0s │
│ set ()                                         │  Count: 20000 | Failed: 0 | Time: 70ms │
│ set ()                                         │   Count: 20000 | Failed: 0 | Time: 7ms │
│ rename ()                                      │  Count: 20000 | Failed: 0 | Time: 19ms │
│ grok ()                                        │ Count: 20000 | Failed: 0 | Time: 231ms │
│ remove ()                                      │  Count: 20000 | Failed: 0 | Time: 13ms │
│ rename ()                                      │  Count: 20000 | Failed: 0 | Time: 17ms │
│ date ()                                        │        Count: 0 | Failed: 0 | Time: 0s │
│ date ()                                        │ Count: 20000 | Failed: 0 | Time: 106ms │
│ remove ()                                      │  Count: 20000 | Failed: 0 | Time: 10ms │
│ set ()                                         │   Count: 20000 | Failed: 0 | Time: 7ms │
│ append ()                                      │   Count: 20000 | Failed: 0 | Time: 9ms │
│ append ()                                      │   Count: 20000 | Failed: 0 | Time: 6ms │
│ remove ()                                      │  Count: 20000 | Failed: 0 | Time: 11ms │
│ pipeline (logs-nginx.error@custom)             │   Count: 20000 | Failed: 0 | Time: 2ms │
╰────────────────────────────────────────────────┴────────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────╮
│ rally stats                                                                                 │
├────────────────────────────────────────────────────────────────┬────────────────────────────┤
│ Cumulative indexing time of primary shards                     │      3.405933333333333 min │
│ Min cumulative indexing time across primary shards             │                      0 min │
│ Median cumulative indexing time across primary shards          │   0.005033333333333333 min │
│ Max cumulative indexing time across primary shards             │      2.669366666666667 min │
│ Cumulative indexing throttle time of primary shards            │                      0 min │
│ Min cumulative indexing throttle time across primary shards    │                      0 min │
│ Median cumulative indexing throttle time across primary shards │                      0 min │
│ Max cumulative indexing throttle time across primary shards    │                      0 min │
│ Cumulative merge time of primary shards                        │    0.29873333333333335 min │
│ Cumulative merge count of primary shards                       │                        157 │
│ Min cumulative merge time across primary shards                │                      0 min │
│ Median cumulative merge time across primary shards             │  0.0033833333333333337 min │
│ Max cumulative merge time across primary shards                │    0.11781666666666667 min │
│ Cumulative merge throttle time of primary shards               │                      0 min │
│ Min cumulative merge throttle time across primary shards       │                      0 min │
│ Median cumulative merge throttle time across primary shards    │                      0 min │
│ Max cumulative merge throttle time across primary shards       │                      0 min │
│ Cumulative refresh time of primary shards                      │                0.81165 min │
│ Cumulative refresh count of primary shards                     │                       5449 │
│ Min cumulative refresh time across primary shards              │                      0 min │
│ Median cumulative refresh time across primary shards           │   0.004083333333333333 min │
│ Max cumulative refresh time across primary shards              │     0.5929166666666668 min │
│ Cumulative flush time of primary shards                        │                 2.4038 min │
│ Cumulative flush count of primary shards                       │                       5143 │
│ Min cumulative flush time across primary shards                │ 1.6666666666666667e-05 min │
│ Median cumulative flush time across primary shards             │                 0.0785 min │
│ Max cumulative flush time across primary shards                │                0.19245 min │
│ Total Young Gen GC time                                        │                    0.017 s │
│ Total Young Gen GC count                                       │                          3 │
│ Total Old Gen GC time                                          │                        0 s │
│ Total Old Gen GC count                                         │                          0 │
│ Store size                                                     │     0.02909163571894169 GB │
│ Translog size                                                  │    0.023729493841528893 GB │
│ Heap used for segments                                         │                       0 MB │
│ Heap used for doc values                                       │                       0 MB │
│ Heap used for terms                                            │                       0 MB │
│ Heap used for norms                                            │                       0 MB │
│ Heap used for points                                           │                       0 MB │
│ Heap used for stored fields                                    │                       0 MB │
│ Segment count                                                  │                        401 │
│ Total Ingest Pipeline count                                    │                      20009 │
│ Total Ingest Pipeline time                                     │                    1.368 s │
│ Total Ingest Pipeline failed                                   │                          0 │
│ Min Throughput                                                 │             4443.61 docs/s │
│ Mean Throughput                                                │             4744.89 docs/s │
│ Median Throughput                                              │             4825.61 docs/s │
│ Max Throughput                                                 │             4884.75 docs/s │
│ 50th percentile latency                                        │        92.6364534534514 ms │
│ 90th percentile latency                                        │      106.48578570690006 ms │
│ 100th percentile latency                                       │       114.9584490340203 ms │
│ 50th percentile service time                                   │        92.6364534534514 ms │
│ 90th percentile service time                                   │      106.48578570690006 ms │
│ 100th percentile service time                                  │       114.9584490340203 ms │
│ error rate                                                     │                     0.00 % │
╰────────────────────────────────────────────────────────────────┴────────────────────────────╯

--- Benchmark results for package: nginx - END   ---
Done

@elasticmachine

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded



Build stats

  • Start Time: 2023-12-20T09:20:49.139+0000

  • Duration: 17 min 54 sec

Test stats 🧪

Test Results: Failed 0 | Passed 22 | Skipped 0 | Total 22

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

🌐 Coverage report

Name          Metrics % (covered/total)  Diff
Packages      100.0% (2/2)               💚
Files         50.0% (2/4)                👎 -50.0
Classes       50.0% (2/4)                👎 -50.0
Methods       69.231% (27/39)            👎 -21.678
Lines         70.552% (230/326)          👍 20.892
Conditionals  100.0% (0/0)               💚

@ali786XI marked this pull request as ready for review December 20, 2023 12:51
@ali786XI requested a review from a team as a code owner December 20, 2023 12:51
@niraj-elastic (Contributor) left a comment

LGTM!

@mrodm (Contributor) commented Dec 21, 2023

Hi @aliabbas-elastic, please update your branch with the latest contents from the main branch. An important PR updating the CI pipelines was merged. Thanks!

@ruflin (Member) commented Dec 21, 2023

Hi @aliabbas-elastic, good to see this. Experimenting with some templates, I also started with #8777 and #8781 for nginx data. Please keep going with your approach, as yours is more complete, but take inspiration from them in case there is something additional you can use.

"created": "{{ generate "event.created" | date "2006-01-02T15:04:05.000Z" }}",
"dataset": "nginx.error",
"ingested": "{{ generate "event.ingested" | date "2006-01-02T15:04:05.000Z" }}",
"timezone": "{{ $eventTimezone }}"
Member

How will this map to the timestamp in the message? Since there are multiple time zones, I expect the effect will be that events are ingested at different hours. See also #8777 (comment)

Contributor Author

@ruflin I saw the error logs. They don't contain the timezone in the log timestamp, so the timezone would be set according to the elastic-agent's timezone. Please refer to this sample log:
2019/11/05 14:50:44 [error] 54053#0: *3 open() "/usr/local/Cellar/nginx/1.10.2_1/html/adsasd" failed (2: No such file or directory), client: 127.0.0.1, server: localhost, request: "GET /pysio HTTP/1.1", host: "localhost:8080"

Currently we can set the timezone to a static value (+00:00). I also saw your comment about setting a timezone variable, which would be a good thing to implement. Let me know if this is the correct thing to do right now.

Contributor

All the timestamps generated in the log appear to be in UTC ('Z'), so perhaps a static +0000 would be best here.


cla-checker-service bot commented Dec 22, 2023

💚 CLA has been signed

@ali786XI force-pushed the nginx_benchmark_error branch from e8c94cc to 51a224d on December 22, 2023 09:07
"created": "{{ generate "event.created" | date "2006-01-02T15:04:05.000Z" }}",
"dataset": "nginx.error",
"ingested": "{{ generate "event.ingested" | date "2006-01-02T15:04:05.000Z" }}",
"timezone": "{{ $eventTimezone }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the timestamps generated in the log appear to be in UTC ('Z')? perhaps a static +0000 would be best here.

@ruflin (Member) left a comment

I suggest getting this in and following up with @aspacca on the timestamp part.

"event": {
"agent_id_status": "verified",
"dataset": "nginx.error",
"timezone": "+00:00"
@ali786XI (Contributor Author) commented Jan 9, 2024

@ruflin Thanks for your comment here. Just wanted to clarify: since we are giving a static value here, would this become $timestamp.Format "MST"?

Contributor Author

@ruflin I wanted to share some points here, as I am not able to use the config $timestamp.Format "MST". First, I looked into the generated documents. Here is one such document:

Sample Event
   {
    "@timestamp": "2024-01-11T12:26:38.253Z",
    "agent": {
        "ephemeral_id": "palethief",
        "id": "ef5e274d-4b53-45e6-943a-a5bcf1a6f523",
        "name": "denimwhimsey",
        "type": "filebeat",
        "version": "8.8.0"
    },
    "data_stream": {
        "dataset": "nginx.error",
        "namespace": "ep",
        "type": "logs"
    },
    "ecs": {
        "version": "8.5.1"
    },
    "elastic_agent": {
        "id": "palethief",
        "snapshot": false,
        "version": "8.8.0"
    },
    "event": {
        "agent_id_status": "verified",
        "dataset": "nginx.error",
        "timezone": "IST"
    },
    "host": {
        "architecture": "x86_64",
        "containerized": false,
        "hostname": "docker-fleet-agent",
        "id": "66392b0697b84641af8006d87aeb89f1",
        "ip": [
            "19.203.220.253"
        ],
        "mac": [
            "02-42-AC-12-00-07"
        ],
        "name": "docker-fleet-agent",
        "os": {
            "codename": "focal",
            "family": "debian",
            "kernel": "5.15.49-linuxkit",
            "name": "Ubuntu",
            "platform": "ubuntu",
            "type": "linux",
            "version": "20.04.5 LTS (Focal Fossa)"
        }
    },
    "input": {
        "type": "log"
    },
    "log": {
        "file": {
            "path": "/var/log/nginx/error.log"
        },
        "offset": 0
    },
    "message": "2024/01/11 12:26:38 [notice] 47858#85131: *33397 twister antelope cougar saver mark salmon gorilla thorn fang fly voice wasp carver elk spirit viper carp venomking",
    "tags": [
        "nginx-error"
    ]
}
Now, if I pass this event through the ingest pipeline, I get this error message:
Unknown time-zone ID: IST

However, the timezone value +00:00 works absolutely fine. Any suggestions on whether anything is wrong in the config? Attaching the failing date processor as well.

  - date:
      if: ctx.event.timezone != null
      field: nginx.error.time
      target_field: '@timestamp'
      formats:
      - yyyy/MM/dd H:m:s
      timezone: '{{ event.timezone }}'
      on_failure:
      - append:
          field: error.message
          value: '{{ _ingest.on_failure_message }}'

cc: @aspacca
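
To reproduce this outside a full benchmark run, the quoted processor can be exercised directly with Elasticsearch's simulate pipeline API. A minimal sketch (the document values are made up for illustration; only the timezone differs between the two docs):

  POST _ingest/pipeline/_simulate
  {
    "pipeline": {
      "processors": [
        {
          "date": {
            "if": "ctx.event.timezone != null",
            "field": "nginx.error.time",
            "target_field": "@timestamp",
            "formats": ["yyyy/MM/dd H:m:s"],
            "timezone": "{{ event.timezone }}",
            "on_failure": [
              {
                "append": {
                  "field": "error.message",
                  "value": "{{ _ingest.on_failure_message }}"
                }
              }
            ]
          }
        }
      ]
    },
    "docs": [
      {
        "_source": {
          "nginx": { "error": { "time": "2024/01/11 12:26:38" } },
          "event": { "timezone": "IST" }
        }
      },
      {
        "_source": {
          "nginx": { "error": { "time": "2024/01/11 12:26:38" } },
          "event": { "timezone": "+05:30" }
        }
      }
    ]
  }

The first document should surface the same "Unknown time-zone ID: IST" error reported above (appended to error.message by the on_failure handler), while the second should resolve to a proper @timestamp.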

Member

Hi @aliabbas-elastic, thanks for playing with it. To be honest, to dig more into the error I would need to play around with it myself; I'm hoping for some help / guidance from @aspacca on this one. Based on the error, it seems IST as a time zone ID is not supported :-( Unfortunately the docs do not state which time zone abbreviations are supported: https://www.elastic.co/guide/en/elasticsearch/reference/current/date-processor.html Here is the code that Beats uses to add the timezone: https://github.com/elastic/beats/blob/main/libbeat/processors/add_locale/add_locale.go#L88

Around the timestamp and timezone, my mental model is as follows: as long as the timestamp in the message and the timezone are aligned, Elasticsearch should do the right thing. Meaning, if you generate data on your laptop and are based in Sydney, your timestamp is local to your time in Sydney, and that is the time zone that will be added. If this is persisted, it will work on any other machine that replays this data, no matter which time zone that machine is in. It also means that if you look at your local Kibana instance while ingesting data right now, it will show up under "Now". Does this make sense?

@ali786XI (Contributor Author) commented Jan 16, 2024

@ruflin As per the suggestion from @aspacca, $timestamp.Format "-07:00" works fine here, and I am getting the following value in the timezone field. Reference: https://gosamples.dev/date-time-format-cheatsheet/#:~:text=trailing%20zeros%20omitted)-,Time%20zone%20format,-Time%20zone%20format

"timezone": "+05:30"

Updated in the PR as well
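
For reference, a minimal sketch of what the resulting line in template.ndjson looks like with that layout string, assuming the $timestamp variable already used elsewhere in the template (the exact placement may differ from the merged change):

  "timezone": "{{ $timestamp.Format "-07:00" }}"

On a host in IST this renders as "+05:30", which the date processor accepts as a fixed offset.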

Member

Did a quick test on my end with this PR and it works as expected!

Member

@aspacca It would be nice to "document" somewhere how to do this, as I expect we will need it in quite a few places.

Contributor

@ruflin I agree.

There are already a few topics we should document in some kind of knowledge base.
Where do you suggest we place such content? I'm keen to have it in the corpus generator repo and link it from elastic-package, but I'm open to suggestions.

Member

I'm OK with having it in the corpus tool for now. I think eventually we will need a simple getting-started guide in elastic-package, but for the details it can still refer to the corpus tool.

Most important is that we have docs.

@elasticmachine

💚 Build Succeeded

cc @aliabbas-elastic

},
"offset": 0
},
"message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} {{generate "message"}}",
Member

Reviewing this, I stumbled over the {{generate "message"}} part in the UI, as it generates a random message. I was a bit surprised by this, but it is something we can tune as a follow-up.

Contributor Author

I agree with your point that it generates random messages that don't look realistic. But I took the approach from this PR, in which the same field is generated the same way.

Member

We can always follow up on this one. Let's keep it as is for now.
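
As a rough illustration of the kind of follow-up discussed here (a sketch only, not the change that was later merged; url.path and source.ip are hypothetical generated fields that would need matching entries in fields.yml), the message line could be shaped after the sample error log quoted earlier in the thread:

  "message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} open() \"{{generate "url.path"}}\" failed (2: No such file or directory), client: {{generate "source.ip"}}, server: localhost, request: \"GET {{generate "url.path"}} HTTP/1.1\", host: \"localhost:8080\"",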

"offset": 0
},
"message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} {{generate "message"}}",
"tags": [
Member

Is this tag shipped by the agent normally?

Contributor Author

This is basically the user-provided value, or the default value for the data stream, and it remains static across all events if left unchanged for as long as the integration is configured. It can be set by the user while configuring the integration.

@ruflin (Member) left a comment

I'm good with getting this PR in and following up with more fine-tuning.

@ali786XI merged commit bc6b4cd into elastic:main Jan 16, 2024
3 checks passed
ruflin added a commit to ruflin/integrations that referenced this pull request Jan 17, 2024
In elastic#8762, error log generation was added with a randomly generated message. This changes it to generate a partially more realistic message based on some actual logs. The enum values for the different fields are hardcoded; it would be nice to have some partially random generation for host, IP, and paths that could be used as a function with cardinality.
ruflin added a commit that referenced this pull request Feb 14, 2024
In #8762, error log generation was added with a randomly generated message. This changes it to generate a partially more realistic message based on some actual logs. The enum values for the different fields are hardcoded; it would be nice to have some partially random generation for host, IP, and paths that could be used as a function with cardinality.
Labels: enhancement (New feature or request), Integration:nginx (Nginx)

7 participants