Benchmark stream command #1584

Merged 13 commits on Dec 8, 2023

Contributor

@aspacca aspacca commented Dec 5, 2023

Similarly to the benchmark rally command, we want to generate schema-b documents for a given integration.
Instead of creating a rally track out of them, we will stream them at a configurable rate directly to an ES cluster, using bulk requests.

see #1541 for more context

usage (from a package root):

elastic-package benchmark stream -v --events-per-period 10 --period-duration 1s

or

elastic-package benchmark stream -v --events-per-period 10 --period-duration 1s --backfill -15m

or

elastic-package benchmark stream -v --benchmark container-benchmark --events-per-period 10 --period-duration 1s --backfill -15m

flags:

  • --benchmark: run a specific benchmark; if not present, all benchmarks for the package will be run
  • --backfill: negative duration to backfill event ingestion for; if not present, events will be ingested starting from now
  • --period-duration: time between each bulk request
  • --events-per-period: number of events in each bulk request
  • --timestamp-field: field from the generator config used for the event's @timestamp field (default "timestamp"; it is required for backfill and for overriding period settings)
  • --perform-cleanup: passing this flag deletes the documents in the streaming data streams before and after the streaming, and uninstalls the integration at the end

@aspacca aspacca requested review from jsoriano and ruflin December 5, 2023 08:51
@aspacca aspacca self-assigned this Dec 5, 2023
@aspacca aspacca mentioned this pull request Dec 5, 2023
@ruflin
Member

ruflin commented Dec 5, 2023

--benchmark: run a specific benchmark, if not present all benchmarks for a package will be run

For now, it is really nice that it just runs all tracks. I expect that eventually we will need something like "default" tracks, but we can leave that for later.

--ticker-duration: time between each bulk request

ticker seems to be very Golang specific. Ideas for alternative names?

I did a quick run of the code, so far all looks good 🎉

@aspacca
Contributor Author

aspacca commented Dec 5, 2023

ticker seems to be very Golang specific. Ideas for alternative names?

--bulk-request-interval: time between each bulk request
--events-per-bulk-request: events on each bulk request

?

@ruflin
Member

ruflin commented Dec 5, 2023

Is it possible to configure the stream command to ship data to a cluster not started with elastic-package? I assume the env variables could be adjusted and it would just work 🤔

@aspacca
Contributor Author

aspacca commented Dec 5, 2023

I assume the env variables could be adjusted and it would just work 🤔

indeed:

ELASTIC_PACKAGE_ELASTICSEARCH_HOST
ELASTIC_PACKAGE_ELASTICSEARCH_PASSWORD
ELASTIC_PACKAGE_ELASTICSEARCH_USERNAME
ELASTIC_PACKAGE_KIBANA_HOST
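
As a rough illustration of what pointing the command at another cluster involves, the sketch below collects the four variables listed above and reports which ones are unset. It is not how elastic-package reads them; `remoteStack` and `loadRemoteStack` are hypothetical names, and only the variable names come from this thread.

```go
package main

import (
	"fmt"
	"os"
)

// remoteStack groups the connection settings named in the thread (hypothetical type).
type remoteStack struct {
	ElasticsearchHost string
	Username          string
	Password          string
	KibanaHost        string
}

// loadRemoteStack reads the settings through lookup (os.LookupEnv in real use)
// and returns the names of the variables that are unset or empty.
func loadRemoteStack(lookup func(key string) (string, bool)) (remoteStack, []string) {
	var missing []string
	get := func(key string) string {
		v, ok := lookup(key)
		if !ok || v == "" {
			missing = append(missing, key)
		}
		return v
	}
	cfg := remoteStack{
		ElasticsearchHost: get("ELASTIC_PACKAGE_ELASTICSEARCH_HOST"),
		Username:          get("ELASTIC_PACKAGE_ELASTICSEARCH_USERNAME"),
		Password:          get("ELASTIC_PACKAGE_ELASTICSEARCH_PASSWORD"),
		KibanaHost:        get("ELASTIC_PACKAGE_KIBANA_HOST"),
	}
	return cfg, missing
}

func main() {
	cfg, missing := loadRemoteStack(os.LookupEnv)
	if len(missing) > 0 {
		fmt.Println("unset variables:", missing)
		return
	}
	fmt.Println("streaming target:", cfg.ElasticsearchHost)
}
```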

Member

@ruflin ruflin left a comment

I have been thinking more about our conversation around the naming. One thing I realised, which might be obvious, is that if multiple benchmarks are run in parallel, all of them will have the same ticker period. And the config is per benchmark. Let's assume I have the following config and 2 benchmarks exist:

--ticker-duration: 10
--events-per-ticker: 1

I assume this sends 2 events, 1 for each benchmark, every 10 seconds. The ticker duration reminds me a lot of the period config in Metricbeat. And the second is then events-per-period? I would stay away from bulk-request as it would also be ok, if we don't use bulk requests :-) Maybe exchange period with duration?

An alternative config would be --events-per-second=0.1. This would ship one event every 10 seconds. Is this more consumable?

Also we should have defaults for all configs, so if they are skipped it just works. For example if we stick to the previous configs, have 10s period as default and 1 event per 10s.

cmd/benchmark.go (resolved)
cmd/benchmark.go (resolved)
internal/benchrunner/runners/stream/runner.go (resolved)
internal/cobraext/flags.go (outdated, resolved)
@aspacca
Contributor Author

aspacca commented Dec 5, 2023

An alternative config would be --events-per-second=0.1. This would ship one event every 10 seconds. Is this more consumable?

it's better because we have only one flag, but it's rather tricky to express the period duration when you want a resolution of minutes. I would keep the two different flags, and just rename them

@ruflin
Member

ruflin commented Dec 6, 2023

I would keep the two different flags, and just rename them

Ok, let's go with this for now.

We now have quite a list of commands and flags. As soon as things settle down a bit more, there is an opportunity to look at all flags together and unify / standardise / clean up where we can.

@ruflin
Member

ruflin commented Dec 7, 2023

I did a quick test with the most recent AWS package by running elastic-package benchmark stream -v, which uses all the defaults. Opening Kibana with Logs Explorer shows the following result, which is great!

Screenshot 2023-12-07 at 09 29 50

You can stream data to a remote ES cluster setting the following environment variables:

ELASTIC_PACKAGE_ELASTICSEARCH_HOST=https://my-deployment.es.eu-central-1.aws.foundit.no
ELASTIC_PACKAGE_ELASTICSEARCH_USERNAME=elastic
Member

Do API keys work here? Serverless has mostly API keys

Contributor Author

no, it doesn't https://github.com/elastic/elastic-package/blob/main/internal/stack/clients.go#L22-L25

but it should be possible to run something like:

elastic-package stack up --provider serverless
$(elastic-package stack shellinit)

Member

@jsoriano Seems like a missing feature in elastic-package? Should we open an issue?

Member

Yes, this is missing, please create an issue.

BenchStreamPeriodDurationFlagName = "period-duration"
BenchStreamPeriodDurationFlagDescription = "duration of the period between each ingestion cycle: expressed as a positive duration"

BenchStreamPerformCleanupFlagName = "perform-cleanup"
Member

The part I stumbled over: this does not only clean up at the end, it also cleans up on start. Is this expected?

Contributor Author

yes, it is intended: I thought it might be useful if I changed the template and wanted to compare data before and after

That said, since backfill now has a default value, we will have duplicated data for the last 15 minutes.
I can change it to skip the cleanup only at the end, but still do it on start

Member

On the fence about this one. As perform-cleanup is not the default anymore, I think it is less of an issue. Let's see how it is used and come back to this, but leave it for now.

@@ -65,6 +65,21 @@ const (
BenchCorpusRallyUseCorpusAtPathFlagName = "use-corpus-at-path"
BenchCorpusRallyUseCorpusAtPathFlagDescription = "path of the corpus to use for the benchmark: if present no new corpus will be generated"

BenchStreamBackFillFlagName = "backfill"
BenchStreamBackFillFlagDescription = "amount of time to ingest events for since starting from now: expressed as a negative duration"
Member

The description is very clear, but of course I put in 1m at first.

Member

It is indeed a bit weird to have a parameter that only accepts negative numbers. Could we revert it, so it accepts only positive numbers and we negate it in code?

Member

@ruflin ruflin left a comment

Code LGTM

As a follow-up, I think there is some potential to refactor / clean up, but we can handle this separately.

Generator *generator `config:"generator" json:"generator"`
}

type generator struct {
Member

This seems to be almost the same as https://github.com/elastic/elastic-package/blob/main/internal/benchrunner/runners/rally/scenario.go, same for other objects. Is the duplication on purpose?

Member

+1, we should probably try to reduce duplication, there is quite a lot between benchmark runners. This would also help to evaluate better how much logic to maintain we are adding with each runner.

Contributor Author

yes, let's plan for another PR moving repeated code to some common package under benchmark

@ruflin
Member

ruflin commented Dec 7, 2023

@aspacca Is there a way we could have tests for these features?

Member

@jsoriano jsoriano left a comment

Mostly looks good to me. Added a comment about ignoring the error in one flag, and some other questions that would not be blockers before merging.

cmd/benchmark.go (outdated, resolved)
internal/benchrunner/runners/stream/runner.go (resolved)
Member

@jsoriano jsoriano left a comment

I think we need to handle the error in the flag, as we do with all other flags. For the rest it LGTM.

cmd/benchmark.go (outdated, resolved)
@aspacca
Contributor Author

aspacca commented Dec 8, 2023

@jsoriano

The error here can happen if the flag is not defined or if there is some type conflict, it should not happen if the user does not provide the flag. We should report the error if it happens.

I don't want to return an error if there is some type conflict, I want the consumer of the command to be able to see the command running with the default value without any error. :)

but we can warn the user with a log entry :)

@aspacca
Contributor Author

aspacca commented Dec 8, 2023

@aspacca Is there a way we could have tests for these features

@ruflin we can have some end2end tests: I will add them in a separate PR

cmd/benchmark.go (outdated, resolved)
@elasticmachine
Collaborator

💚 Build Succeeded

History

cc @aspacca

@jsoriano jsoriano merged commit f075590 into elastic:main Dec 8, 2023
4 checks passed