[O11y][MySQL] Rally benchmark mysql.performance
#8761
Hi @aliabbas-elastic, please update your branch with the latest contents from the main branch. An important PR updating the CI pipelines was merged. Thanks!
… into mysql_benchmark_performance
LGTM!
min: 0
max: 200
fuzziness: 0.2
cardinality: 100
I'm still a little confused about how the cardinality of fields 'links' them. Does this mean that for a given host id, the fetch_count will always be the same?
For this, let's say we generate 20,000 events: across those events there would be 100 unique values of fetch_count. In other words, I would expect a single value to repeat around 200 times, as per the docs.
I don't think this metric value is linked to the host id field, since the generated values are totally random.
are we on the right track here @aspacca ?
> For this, let's say we generate 20,000 events: across those events there would be 100 unique values of fetch_count. In other words, I would expect a single value to repeat around 200 times, as per the docs.

@aliabbas-elastic, @tommyers-elastic this is indeed correct.
But given that the cardinality of host.name is 100 too, a single host.name value will be repeated 200 times as well. More specifically, the Nth value of fetch_count will always appear in the same events as the Nth value of host.name.
If you did this on purpose, or you don't care about this detail, there is no need to change anything.
If instead you explicitly want to avoid such a "link", you should set different cardinalities for the two fields.
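To make the "link" concrete, here is a minimal sketch of the behaviour described above. It assumes, purely for illustration, that event i takes the (i mod cardinality)-th value of each field; the `simulate` helper and the `host-N` names are hypothetical and are not part of elastic-package, and fuzziness is not modelled.

```python
from collections import defaultdict


def simulate(num_events, host_cardinality, metric_cardinality):
    """Return {host name: set of fetch_count indices seen with that host}.

    Assumed model (not the real generator): event i uses the
    (i mod cardinality)-th value of each field.
    """
    seen = defaultdict(set)
    for i in range(num_events):
        host = f"host-{i % host_cardinality}"    # hypothetical host.name value
        fetch_count = i % metric_cardinality     # index of the fetch_count value
        seen[host].add(fetch_count)
    return seen


# Equal cardinalities: every host only ever sees one fetch_count value.
linked = simulate(20_000, 100, 100)
max_values_per_host_linked = max(len(values) for values in linked.values())

# Cardinalities 101 and 99: every host sees all 99 fetch_count values.
unlinked = simulate(20_000, 101, 99)
min_values_per_host_unlinked = min(len(values) for values in unlinked.values())

print(max_values_per_host_linked, min_values_per_host_unlinked)
```

Under this assumed model, equal cardinalities pin each host to a single fetch_count value, which is the unrealistic pattern raised in this thread.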
@aspacca Thanks for the correction. I don't think there is any value in generating different host.name values, so keeping them as they are would be better.
I think that, as-is, we generate unrealistic data here, since the fetch_count will always be identical for each host.
If we wanted to keep each field (host.name, fetch_count) at a cardinality of approximately 100, what configuration would you suggest here, @aspacca?
@tommyers-elastic it depends on how many events we want to generate.
Imagine we chose 101 and 99: the first 9999 events (101 * 99) will have non-linked host.name and fetch_count values, but then the exact same series of 9999 non-linked value pairs will be repeated.
There are no exact cardinality values to set, except that one must not be a factor of the other (so: no 100 and 50, which would just link the two fields 1:2 instead of 1:1 :)).
Once you follow this rule, the "best" values are simply those that, multiplied together, produce a series long enough to be considered "realistic". That's domain knowledge that @aliabbas-elastic is better suited than me to apply :)
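The arithmetic behind this rule can be sketched as follows, under the same assumption that each field cycles through its values in a fixed order. `pair_period` and `values_per_host` are hypothetical helper names introduced only for this illustration.

```python
from math import gcd


def pair_period(card_a, card_b):
    """Events generated before the (a, b) value pairs start repeating.

    Under the assumed cycling model this is the least common multiple
    of the two cardinalities.
    """
    return card_a * card_b // gcd(card_a, card_b)


def values_per_host(host_cardinality, metric_cardinality):
    """Distinct metric values a single host value is ever paired with."""
    return metric_cardinality // gcd(host_cardinality, metric_cardinality)


for cards in [(100, 100), (100, 50), (101, 99)]:
    print(cards, pair_period(*cards), values_per_host(*cards))
```

With 100 and 100, or 100 and 50, the pairs repeat after only 100 events and each host keeps a single fetch_count value. With 101 and 99 the pairs repeat after 101 * 99 = 9999 events, matching the figure above. Strictly speaking, the period only equals the plain product when the two cardinalities share no common factor at all.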
packages/mysql/_dev/benchmark/rally/performance-benchmark/template.ndjson
LGTM
packages/mysql/_dev/benchmark/rally/performance-benchmark/template.ndjson
"{{ generate "host_ip" }}"
],
"mac": [
"02:42:c0:a8:f4:07"
Are there any implications of having different host.ip values with the same MAC address (for dashboards or similar)?
If not, no need to change.
Keeping a static value for these fields would be fine. Let me update it.
Proposed commit message
performance data stream of MySQL
Checklist
How to test this PR locally
Run this command from the package root:
elastic-package benchmark rally --benchmark performance-benchmark -v
Related issues
Screenshots