
Use the registry for (sub-)metric validation and move data crunching out of the Engine #2426

Merged: 11 commits from cleanup-7 into master, Mar 29, 2022

Conversation

@na-- (Member) commented Mar 9, 2022:

This PR is the first part of #1889. It aims to prepare the Engine for complete annihilation 😅 I want to get rid of it completely and use its constituent parts separately instead, basically almost completely decoupling the test execution from the metrics handling.

However, the Engine has tons of tests that would have to be re-written. So this PR is an attempt to softly split the Engine by moving things around and by defining and implementing its replacement components, while still keeping them under the same Engine facade. This way we can have reasonably high assurance that we haven't broken anything, since all of the old tests still work with minimal adjustments. Hopefully, this will allow us to merge things gradually, instead of having to do it all at once.

Given the extensive refactoring here, there were also a few opportunities to fix some bugs and side-issues along the way.

Overall, I am not sure if it will be easier to review this PR commit by commit or all at once. The commits are logically separated, but because of that, a lot of the same code is changed in multiple commits 🤷‍♂️

@na-- added this to the v0.38.0 milestone Mar 9, 2022
@na-- (Member, Author) commented Mar 9, 2022:

The xk6 test is failing because of this: #2428. I'd prefer someone else to fix it in another PR, and then I'll rebase.

@na-- (comment marked as outdated)

@oleiade mentioned this pull request Mar 14, 2022
Base automatically changed from cleanup-6 to master March 18, 2022 11:23
core/engine.go (outdated):

    for metricName, thresholds := range e.options.Thresholds {
        metric, err := e.getOrInitPotentialSubmetric(metricName)

        if e.runtimeOptions.NoThresholds.Bool {
@mstoykov (Contributor) commented Mar 18, 2022:

Shouldn't this skip the whole parsing either way, regardless of the error?

@na-- (Member, Author) replied:

Hmm, I don't think so, given that thresholds are currently basically the only way to create sub-metrics for the end-of-test summary, and we can have --no-thresholds without also having --no-summary. So, when --no-thresholds is enabled, I'm following the previously established logic that we shouldn't fail the test because of wrongly configured thresholds, but I'm still trying to parse them for sub-metrics and just emitting any errors as warnings.
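A minimal sketch of that control flow, reusing the names from the snippet above (everything beyond those names is an assumption for illustration, not the actual PR code):

    for metricName, thresholds := range e.options.Thresholds {
        metric, err := e.getOrInitPotentialSubmetric(metricName)
        if e.runtimeOptions.NoThresholds.Bool {
            if err != nil { // with --no-thresholds, a bad definition is only a warning
                e.logger.WithError(err).Warnf("Invalid metric '%s' in threshold definitions", metricName)
            }
            continue // the sub-metric was still created for the summary, if possible
        }
        if err != nil { // without --no-thresholds, a bad definition fails the test
            return fmt.Errorf("invalid metric '%s' in threshold definitions: %w", metricName, err)
        }
        metric.Thresholds = thresholds
    }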

core/engine.go (outdated):

    e.Metrics[m.Name] = m
    m.Observed = true
@mstoykov (Contributor) commented:

I really dislike this new property on the Metric type. I would really prefer that we go back to the map.

I do think Metric already has too many responsibilities, and now also making it track whether it was "observed" seems counterproductive. We even still add it to the map, so... 🤷 Why not just keep this out of it?

@na-- (Member, Author) replied:

I really dislike it as well. I want to completely split apart the current Metric into at least two parts:

  • The first part, which should be the actual Metric, will be just the Name, Type and Contains. That should be the only thing that the Registry actually controls and cares about.
  • A second, completely separate struct should hold the Sink and Observed, and probably the threshold and submetric things. That will only be a concern of the MetricsRegistry. I don't think they should even be exported... In any case, the mapping from a *Metric to that second struct will happen only in the MetricsRegistry.

Unfortunately, until we have that separation, I kind of had to put Observed in the Metric struct, since that's where the Sink already was 😞 Doing otherwise without even more refactoring would cause us to have a map lookup for every sample of every metric that k6 generates... 😞 I intend to split this when I work on the time series refactoring.
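A rough sketch of that proposed split (hypothetical type layout, not actual k6 code):

    // Owned by the Registry: only the immutable identity of a metric.
    type Metric struct {
        Name     string
        Type     MetricType
        Contains ValueType
    }

    // Unexported, mapped from a *Metric outside of the Registry: all of the
    // mutable observation state that currently lives in Metric itself.
    type metricState struct {
        sink       Sink
        observed   bool
        thresholds Thresholds
        submetrics []*Submetric
    }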

@na-- (Member, Author) added:

btw, this is what I mean by this TODO:

k6/stats/stats.go, lines 449 to 450 (at f06d6f0):

    // TODO: decouple the metrics from the sinks and thresholds... have them
    // linked, but not in the same struct?

@mstoykov (Contributor) replied:

> ...a map lookup for every sample of every metric that k6 generates...

Didn't we have the same up to this point as well?

@na-- (Member, Author) replied:

Yes, we did, but we don't actually need to have it, so I guess you can consider this an optimization? 😅

I probably dislike the extra item in Metric even more than you do, but with Sink already in there, this actually seems like a better (temporary and transient) architecture to me, since all of this state is now kept in Metric and will be easy to refactor into a different struct all at the same time. Previously, the Observed state was implicitly tracked by that map in the Engine, while all of the other state was in every Metric...

If you want, I am fine with implementing my TODO above now, before we merge this PR, though I'd prefer to do it in a separate PR, after #2442 from @oleiade is also merged. It will leave the metrics.Registry as the only source of truth for metrics, so we'll easily be able to assign them sequential IDs and use those in the MetricsEngine without the need for a map.
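A sketch of how those sequential IDs could remove the per-sample map lookup (hypothetical code; it assumes the registry is the single place where metrics get created and that Metric grows an ID field):

    // In the Registry, each new metric gets the next sequential ID:
    func (r *Registry) newMetric(name string, typ MetricType) *Metric {
        m := &Metric{ID: len(r.metrics), Name: name, Type: typ}
        r.metrics = append(r.metrics, m)
        return m
    }

    // In the MetricsEngine, per-metric state can then live in a slice, so
    // handling a sample becomes an index operation instead of a map lookup:
    //
    //     state := me.metricStates[sample.Metric.ID]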

@mstoykov (Contributor) commented Mar 21, 2022:

@na-- the panic you posted is due to the test parallelization, not the code itself.

@olegbespalov (Contributor) left a comment:

Ugh, it's a big and good one 👍

In general it looks good; I left a few comments.

    }

    if err != nil {
        return fmt.Errorf("invalid metric '%s' in threshold definitions: %w", metricName, err)
@olegbespalov (Contributor) commented:

Maybe not for this PR, but what do you think about first collecting all of the invalid metrics and returning a single error that contains all of the wrong ones?
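A sketch of that suggestion (it uses errors.Join, which only landed in Go 1.20, well after this PR, so a multi-error helper would have been needed at the time; the surrounding names mirror the snippet above):

    var errs []error
    for metricName := range options.Thresholds {
        if _, err := getOrInitPotentialSubmetric(metricName); err != nil {
            errs = append(errs, fmt.Errorf("invalid metric '%s' in threshold definitions: %w", metricName, err))
        }
    }
    if len(errs) > 0 {
        return errors.Join(errs...) // a single error listing every invalid metric
    }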

@codebien (Contributor) left a comment:

Some minor things.

Comment on lines +7 to +10:

    // TODO: maybe even move the outputs to a sub-folder here? it may be worth it to
    // do a new Output v2 implementation that uses channels and is more usable and
    // easier to write? this way the old extensions can still work for a while, with
    // an adapter and a deprecation notice
@codebien (Contributor) commented:

I think we should have this comment in the issue instead of here.

@na-- (Member, Author) replied:

I kind of already have it in the issue, too; see the end of #2430 (comment).

Comment on lines +101 to +102:

    e.ingester = me.GetIngester()
    outputs = append(outputs, e.ingester)
@codebien (Contributor) commented:

Is it not possible to have this in the same place where we set up the other outputs?

@na-- (Member, Author) replied:

It may be, but I think it would actually be worse to do it that way.

The other outputs receive a whole bunch of things this one doesn't need, while this one needs the MetricsEngine, something we create a few lines above here (or in cmd/run.go in the follow-up PR without an Engine). Besides, we add this output only some of the time, based on completely different criteria (RuntimeOptions values)...
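A hypothetical sketch of that difference (the exact condition is an assumption about when the ingested data is actually consumed, based on the RuntimeOptions mentioned above):

    // The regular outputs are all created together from the output
    // configuration and its many dependencies; the ingester is appended
    // separately, and only when something will use the ingested data.
    if !runtimeOptions.NoSummary.Bool || !runtimeOptions.NoThresholds.Bool {
        e.ingester = me.GetIngester()
        outputs = append(outputs, e.ingester)
    }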

@oleiade previously approved these changes Mar 22, 2022

@oleiade (Member) left a comment:

LGTM 🎉

A bunch of minor, non-blocking comments on my side. I do agree with the argument brought up about metrics.Observed; I dislike it too, but I understand the arguments in (temporary) favor of it and can live with it for now.


    func (me *MetricsEngine) getOrInitPotentialSubmetric(name string) (*stats.Metric, error) {
        // TODO: replace with strings.Cut after Go 1.18
        nameParts := strings.SplitN(name, "{", 2)
@oleiade (Member) commented:

I ended up doing something somewhat similar to extract the metric/submetric name for thresholds validation, using strings.FieldsFunc, which, I believe, could lead to somewhat clearer code here:

    delimiterFn := func(c rune) bool {
        return c == '{' || c == '}' || c == ':'
    }

    fmt.Printf("Fields are: %q\n", strings.FieldsFunc("metric_name{foo:bar}", delimiterFn))

The resulting number of fields is either 1 (a plain metric) or an odd value >= 3, so there's no need to match the closing }, for instance. You could also drop the matching of : in your case, I believe.

Hope that's helpful 🙇🏻

@na-- (Member, Author) replied:

Wouldn't that also accept metric_name}foo{bar: as valid? 😕

@oleiade (Member) replied:

That's a fair point, I need to look into it. I'll add a test for it in #2463; commit e4e51f0 introduces a ParseMetricName function.
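For reference, a minimal sketch of the strings.Cut approach hinted at by the TODO in the snippet above (Go 1.18+; parseSubmetricName is a hypothetical helper, not the actual k6 code). Unlike FieldsFunc, it preserves the delimiter order, so a name like metric_name}foo{bar: is rejected:

    func parseSubmetricName(name string) (string, string, error) {
        metricName, rest, found := strings.Cut(name, "{")
        if !found {
            return name, "", nil // a plain metric name, no submetric part
        }
        if !strings.HasSuffix(rest, "}") {
            return "", "", fmt.Errorf("submetric %q doesn't end with '}'", name)
        }
        return metricName, strings.TrimSuffix(rest, "}"), nil
    }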

na-- added 7 commits March 29, 2022

Two of the commit messages:

  • This is a prerequisite for solving other issues, like always evaluating thresholds correctly; as a side benefit, it also allows us to validate them in the init context, before the test has started.
  • This allows us to slowly deconstruct and split apart the Engine. It also clears the way for us to have test suites, where every test has a separate pool of VUs and its own ExecutionScheduler.
@oleiade (Member) left a comment:

🍾

@na-- merged commit 7694e8d into master Mar 29, 2022
@na-- deleted the cleanup-7 branch March 29, 2022 13:26