[O11y][Kubernetes] Rally benchmark kubernetes.state_container #9106

Merged: 8 commits, Feb 15, 2024

@ali786XI ali786XI commented Feb 9, 2024

Proposed commit message

  • This PR adds benchmarking templates to the state_container data stream of Kubernetes

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.

How to test this PR locally

Run these commands from the package root:

  • elastic-package benchmark rally --benchmark state_container-benchmark -v
  • elastic-package benchmark stream --benchmark state_container-benchmark -v

Related issues

Screenshots

--- Benchmark results for package: kubernetes - START ---
╭─────────────────────────────────────────────────────────────────────────────────────╮
│ info                                                                                │
├────────────────────────┬────────────────────────────────────────────────────────────┤
│ benchmark              │                                   statecontainer-benchmark │
│ description            │ Benchmark 20000 kubernetes.state_container events ingested │
│ run ID                 │                       0b22e23e-d5f3-4055-91c9-17441ec98e37 │
│ package                │                                                 kubernetes │
│ start ts (s)           │                                                 1707473600 │
│ end ts (s)             │                                                 1707473651 │
│ duration               │                                                        51s │
│ generated corpora file │  /root/.elastic-package/tmp/rally_corpus/corpus-1816164413 │
╰────────────────────────┴────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────╮
│ parameters                                                                   │
├─────────────────────────────────┬────────────────────────────────────────────┤
│ package version                 │                                     1.56.0 │
│ data_stream.name                │                            state_container │
│ corpora.generator.total_events  │                                      20000 │
│ corpora.generator.template.path │ ./statecontainer-benchmark/template.ndjson │
│ corpora.generator.template.raw  │                                            │
│ corpora.generator.template.type │                                     gotext │
│ corpora.generator.config.path   │      ./statecontainer-benchmark/config.yml │
│ corpora.generator.config.raw    │                                      map[] │
│ corpora.generator.fields.path   │      ./statecontainer-benchmark/fields.yml │
│ corpora.generator.fields.raw    │                                      map[] │
╰─────────────────────────────────┴────────────────────────────────────────────╯
╭───────────────────────╮
│ cluster info          │
├───────┬───────────────┤
│ name  │ elasticsearch │
│ nodes │             1 │
╰───────┴───────────────╯
╭───────────────────────────────────────╮
│ disk usage for index metrics-kubernet │
│ es.state_container-ep (for all fields │
│ )                                     │
├──────────────────────────────┬────────┤
│ total                        │  12 MB │
│ inverted_index.total         │ 1.9 MB │
│ inverted_index.stored_fields │ 7.4 MB │
│ inverted_index.doc_values    │ 2.5 MB │
│ inverted_index.points        │ 345 kB │
│ inverted_index.norms         │    0 B │
│ inverted_index.term_vectors  │    0 B │
│ inverted_index.knn_vectors   │    0 B │
╰──────────────────────────────┴────────╯
╭──────────────────────────────────────────────────────────────────────────────────────────────╮
│ pipeline metrics-kubernetes.state_container-1.56.0 stats in node OQo_iVeVRSeYfWHrhZJ5_g      │
├──────────────────────────────────────────────────────┬───────────────────────────────────────┤
│ Totals                                               │ Count: 20000 | Failed: 0 | Time: 27ms │
│ pipeline (global@custom)                             │  Count: 20000 | Failed: 0 | Time: 2ms │
│ pipeline (metrics@custom)                            │  Count: 20000 | Failed: 0 | Time: 2ms │
│ pipeline (metrics-kubernetes@custom)                 │  Count: 20000 | Failed: 0 | Time: 2ms │
│ pipeline (metrics-kubernetes.state_container@custom) │  Count: 20000 | Failed: 0 | Time: 2ms │
╰──────────────────────────────────────────────────────┴───────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────╮
│ rally stats                                                                                 │
├────────────────────────────────────────────────────────────────┬────────────────────────────┤
│ Cumulative indexing time of primary shards                     │     0.2747833333333333 min │
│ Min cumulative indexing time across primary shards             │                      0 min │
│ Median cumulative indexing time across primary shards          │  0.0005750000000000001 min │
│ Max cumulative indexing time across primary shards             │    0.19011666666666666 min │
│ Cumulative indexing throttle time of primary shards            │                      0 min │
│ Min cumulative indexing throttle time across primary shards    │                      0 min │
│ Median cumulative indexing throttle time across primary shards │                    0.0 min │
│ Max cumulative indexing throttle time across primary shards    │                      0 min │
│ Cumulative merge time of primary shards                        │                      0 min │
│ Cumulative merge count of primary shards                       │                          0 │
│ Min cumulative merge time across primary shards                │                      0 min │
│ Median cumulative merge time across primary shards             │                    0.0 min │
│ Max cumulative merge time across primary shards                │                      0 min │
│ Cumulative merge throttle time of primary shards               │                      0 min │
│ Min cumulative merge throttle time across primary shards       │                      0 min │
│ Median cumulative merge throttle time across primary shards    │                    0.0 min │
│ Max cumulative merge throttle time across primary shards       │                      0 min │
│ Cumulative refresh time of primary shards                      │    0.05616666666666667 min │
│ Cumulative refresh count of primary shards                     │                        543 │
│ Min cumulative refresh time across primary shards              │                      0 min │
│ Median cumulative refresh time across primary shards           │                 0.0005 min │
│ Max cumulative refresh time across primary shards              │   0.016483333333333332 min │
│ Cumulative flush time of primary shards                        │    0.16481666666666667 min │
│ Cumulative flush count of primary shards                       │                        362 │
│ Min cumulative flush time across primary shards                │ 1.6666666666666667e-05 min │
│ Median cumulative flush time across primary shards             │               0.005125 min │
│ Max cumulative flush time across primary shards                │                0.02075 min │
│ Total Young Gen GC time                                        │                    0.067 s │
│ Total Young Gen GC count                                       │                          5 │
│ Total Old Gen GC time                                          │                        0 s │
│ Total Old Gen GC count                                         │                          0 │
│ Store size                                                     │    0.022678245790302753 GB │
│ Translog size                                                  │   0.0001719137653708458 GB │
│ Heap used for segments                                         │                       0 MB │
│ Heap used for doc values                                       │                       0 MB │
│ Heap used for terms                                            │                       0 MB │
│ Heap used for norms                                            │                       0 MB │
│ Heap used for points                                           │                       0 MB │
│ Heap used for stored fields                                    │                       0 MB │
│ Segment count                                                  │                        395 │
│ Total Ingest Pipeline count                                    │                      20068 │
│ Total Ingest Pipeline time                                     │                    1.629 s │
│ Total Ingest Pipeline failed                                   │                          0 │
│ Min Throughput                                                 │            21020.27 docs/s │
│ Mean Throughput                                                │            21020.27 docs/s │
│ Median Throughput                                              │            21020.27 docs/s │
│ Max Throughput                                                 │            21020.27 docs/s │
│ 50th percentile latency                                        │       884.0546769788489 ms │
│ 100th percentile latency                                       │       894.0900478046387 ms │
│ 50th percentile service time                                   │       884.0546769788489 ms │
│ 100th percentile service time                                  │       894.0900478046387 ms │
│ error rate                                                     │                     0.00 % │
╰────────────────────────────────────────────────────────────────┴────────────────────────────╯

--- Benchmark results for package: kubernetes - END   ---
Done

@ali786XI ali786XI added the enhancement and Integration:kubernetes labels Feb 9, 2024
@ali786XI ali786XI self-assigned this Feb 9, 2024
@ali786XI ali786XI changed the title from "rally benchmark kubernetes.state_container" to "[O11y][Kubernetes] Rally benchmark kubernetes.state_container" Feb 9, 2024

elasticmachine commented Feb 9, 2024

🚀 Benchmarks report

Package aws 👍(11) 💚(3) 💔(3)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| apigateway_logs | 13888.89 | 11627.91 | -2260.98 (-16.28%) | 💔 |
| elb_logs | 5464.48 | 3021.15 | -2443.33 (-44.71%) | 💔 |
| emr_logs | 25641.03 | 18867.92 | -6773.11 (-26.42%) | 💔 |

Package mysql 👍(0) 💚(0) 💔(2)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| error | 14925.37 | 7092.2 | -7833.17 (-52.48%) | 💔 |
| slowlog | 20833.33 | 13333.33 | -7500 (-36%) | 💔 |

To see the full report comment with /test benchmark fullreport
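
The Diff (%) column in the report above appears to combine the absolute EPS change with the change relative to the previous run. A minimal sketch of that arithmetic, assuming the formula is (new - previous) / previous * 100; this is inferred from the reported values, not taken from the bot's source, and `eps_diff` is a hypothetical helper name:

```python
def eps_diff(previous_eps: float, new_eps: float) -> tuple[float, float]:
    """Return (absolute change, percentage change) between two EPS values.

    Assumption: the percentage is relative to the previous EPS value.
    """
    absolute = new_eps - previous_eps
    percent = absolute / previous_eps * 100
    return round(absolute, 2), round(percent, 2)


if __name__ == "__main__":
    # Values copied from the benchmark report above.
    rows = {
        "apigateway_logs": (13888.89, 11627.91),
        "elb_logs": (5464.48, 3021.15),
        "error": (14925.37, 7092.2),
        "slowlog": (20833.33, 13333.33),
    }
    for name, (previous, new) in rows.items():
        absolute, percent = eps_diff(previous, new)
        print(f"{name}: {absolute} ({percent}%)")
```

Under this assumption the computed values match the rows shown in the report, which suggests the drops are measured against the previous run of each data stream.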

@ali786XI ali786XI marked this pull request as ready for review February 12, 2024 05:52
@ali786XI ali786XI requested a review from a team as a code owner February 12, 2024 05:52
"name": "state_container"
},
"event": {
"duration": {{ $event_duration }},
Contributor

The event.ingested field is missing,
e.g. event.ingested: "2024-02-08T12:18:41Z"

Contributor Author

Please refer here; I had the same doubt. The field actually takes the time at which the event is ingested.

"phase": "{{ $status_phase }}",
"ready": {{ $status_ready }},
"restarts": {{ $restarts }},
"reason": "{{ $reason }}"
Contributor

Suggested change
- "reason": "{{ $reason }}"
+ "last_terminated_reason": "{{ $reason }}"

Contributor Author

I took the reference from here. I found the field you are mentioning as kubernetes.container.status.last_terminated_reason.

Contributor

I am testing with a real cluster and just spotted that as well.
Screenshot 2024-02-12 at 3 11 00 PM

In beats: https://github.com/elastic/beats/blob/main/metricbeat/module/kubernetes/state_container/state_container.go#L69

I will take a second look in the background to explain the difference.

Contributor

When the container is in the terminated or waiting state, status.reason is populated.
When the container is in the running state, kubernetes.container.status.last_terminated_reason is populated instead.

See info here and here

So you probably need another if case: when the phase is running, populate kubernetes.container.status.last_terminated_reason.
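
The rule described in this comment can be sketched as follows. This is only an illustration in Python of the branching the template would need; the function name `status_reason_fields` is hypothetical, and the actual change in this PR is made in the gotext template rather than in Python:

```python
def status_reason_fields(phase: str, reason: str) -> dict[str, str]:
    """Pick the status field that carries the reason for a given phase.

    Assumption (from the review comment): a running container reports
    last_terminated_reason, while waiting/terminated containers report reason.
    """
    if phase == "running":
        return {"last_terminated_reason": reason}
    return {"reason": reason}


if __name__ == "__main__":
    # The reason string is an arbitrary sample value for illustration.
    for phase in ("running", "waiting", "terminated"):
        print(phase, status_reason_fields(phase, "Error"))
```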

Contributor Author

@gizas Thanks for this. Added the same.

},
"id": "container-{{ $rangeofid }}",
"status": {
"phase": "{{ $status_phase }}",
Contributor

The only values that could cause us problems in the future are the status values. As currently implemented, a 'running' phase can have ready: false, because those values are assigned randomly.

Can you consider an if case like https://github.com/elastic/elastic-integration-corpus-generator-tool/blob/main/assets/templates/aws.billing/schema-b/gotext.tpl#L64 ?
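
To illustrate the concern, here is a minimal Python sketch that derives `ready` from the phase instead of drawing both values independently. It assumes the intended rule is that only a running container is reported as ready; the phase list and the `generate_status` name are hypothetical, and the real fix belongs in the gotext template:

```python
import random

# Hypothetical set of phases used only for this illustration.
PHASES = ("running", "waiting", "terminated")


def generate_status(rng: random.Random) -> dict:
    """Generate a status whose 'ready' flag is consistent with its phase."""
    phase = rng.choice(PHASES)
    # Derive 'ready' from the phase rather than picking it at random,
    # so contradictory combinations cannot be generated.
    ready = phase == "running"
    return {"phase": phase, "ready": ready}


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        print(generate_status(rng))
```

Deriving the dependent value from the independent one keeps the randomness in a single place, which is the same idea as the if case in the linked template.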

Contributor Author

Sure. Updated the same. Please have a look.

gizas commented Feb 12, 2024

@aliabbas-elastic I have tested them and they work fine. Thanks for this. I added some comments.

Also, because the dataset is named state_container, I would advise renaming all folders/paths to state_container-benchmark and the file to state_container-benchmark.yml

ali786XI commented Feb 12, 2024

> Also, because the dataset is named state_container, I would advise renaming all folders/paths to state_container-benchmark and the file to state_container-benchmark.yml

@gizas Yes, I initially kept the name as you mentioned, but I am facing the error below in linting:

Lint the package
2024/02/12 18:54:13 Warning: references found in dashboard kibana/dashboard/kubernetes-f4dc26db-1b53-4ea2-a78b-1bfab8ea267c.json: kubernetes-ee55101a-9f62-44da-b64c-ffa1eb5abad8 (search) (SVR00004)
Error: checking package failed: linting package failed: found 1 validation error:
   1. item [state_container-benchmark] is not allowed in folder [/aliabbas-elastic/integrations/packages/kubernetes/_dev/benchmark/rally]

I saw a couple of other data streams using the present naming convention as well. Keeping the naming as you suggested is definitely better. Is this a bug in the check command?
cc: @aspacca

gizas commented Feb 12, 2024

> > Also, because the dataset is named state_container, I would advise renaming all folders/paths to state_container-benchmark and the file to state_container-benchmark.yml

> @gizas Yes, I initially kept the name as you mentioned, but I am facing the error below in linting:

You are right. I could not find where this is done in elastic-package. In my tests:

Changing the folder to state_container-benchmark:

elastic-package lint
Error: linting package failed: found 1 validation error:
   1. item [state_container-benchmark] is not allowed in folder [/Users/andreasgkizas/elastic/integrations/packages/kubernetes/_dev/benchmark/rally]

Changing only the file, and NOT the folder, to state_container-benchmark.yml:

❯ elastic-package lint
2024/02/12 16:13:18  INFO New version is available - v0.97.0. Download from: https://github.com/elastic/elastic-package/releases/tag/v0.97.0
Lint the package
2024/02/12 16:13:19 Warning: references found in dashboard kibana/dashboard/kubernetes-f4dc26db-1b53-4ea2-a78b-1bfab8ea267c.json: kubernetes-ee55101a-9f62-44da-b64c-ffa1eb5abad8 (search) (SVR00004)
Done

Then the command becomes:
❯ elastic-package benchmark rally --benchmark state_container-benchmark

Don't you think this is better? Or does it create more confusion?

@ali786XI
Contributor Author

> Don't you think this is better? Or does it create more confusion?

That sounds better than the previous naming: there will be no ambiguity when running a benchmark, even without looking at the benchmark structure.

@ali786XI ali786XI force-pushed the kubernetes_benchmark_statecontainer branch from a4ff42c to 27d02ea Compare February 12, 2024 15:02
@ali786XI ali786XI requested review from a team as code owners February 12, 2024 15:04
aspacca commented Feb 13, 2024

@aliabbas-elastic, @gizas

The limit on folder naming is imposed by package-spec:
https://github.com/elastic/package-spec/blob/main/spec/integration/_dev/benchmark/spec.yml#L26-L62

folder: pattern: '^[a-z0-9]+-benchmark$'

There is no such limit on the scenario name: pattern: '^.+\.yml$'

It's just a matter of changing the spec; please proceed as you see fit :)
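
The two patterns quoted above explain the lint results seen earlier in this thread. A minimal sketch that checks the names against them, using Python's `re` module purely for illustration (the real validation is performed by package-spec; the helper names `is_valid_folder` and `is_valid_scenario` are hypothetical):

```python
import re

# Patterns quoted in the comment above.
FOLDER_PATTERN = re.compile(r"^[a-z0-9]+-benchmark$")
SCENARIO_PATTERN = re.compile(r"^.+\.yml$")


def is_valid_folder(name: str) -> bool:
    """Return True if a benchmark folder name matches the folder pattern."""
    return FOLDER_PATTERN.match(name) is not None


def is_valid_scenario(name: str) -> bool:
    """Return True if a scenario file name matches the scenario pattern."""
    return SCENARIO_PATTERN.match(name) is not None


if __name__ == "__main__":
    for folder in ("statecontainer-benchmark", "state_container-benchmark"):
        print(folder, is_valid_folder(folder))
    print("state_container-benchmark.yml",
          is_valid_scenario("state_container-benchmark.yml"))
```

The underscore is not part of the `[a-z0-9]` character class, so the folder name state_container-benchmark is rejected, while the scenario file state_container-benchmark.yml is accepted.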

@elasticmachine

💚 Build Succeeded

History

cc @aliabbas-elastic

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

@ali786XI ali786XI merged commit 2829b1c into elastic:main Feb 15, 2024
5 checks passed