Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the shared state to the global scope to get previous data #8447

Merged
merged 3 commits into from
Nov 30, 2020
Merged

Add the shared state to the global scope to get previous data #8447

merged 3 commits into from
Nov 30, 2020

Conversation

essobedo
Copy link
Contributor

@essobedo essobedo commented Nov 21, 2020

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Fix for #7793

Motivation

We would like to have a way to compare the current metric with the previous one

Modifications:

  • Adds the shared state to the global scope
  • Adds a new simple script to show how to use the shared state
  • Adds a link to the new script into the README.md

@sjwang90 sjwang90 added area/starlark feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Nov 23, 2020
@srebhan srebhan self-assigned this Nov 23, 2020
@srebhan
Copy link
Member

srebhan commented Nov 23, 2020

While the code looks good, I'm not sure this is actually a good thing to do. There are multiple reasons IMO:

  1. I think it is not guaranteed that the metrics are sent to apply in a time-ordered way. That means you will soon want to extend this to store N past metrics to sort them and might create all kind of corner-cases.
  2. I'd rather think of an option to make the global scope read/write to carry over information between apply calls e.g. to have a counter or similar. However, this will probably create hard-to-debug problems as the plugin becomes stateful.
  3. This circumvents the one-metric in multiple-metrics out scheme. Wouldn't it be better to add another function e.g. applyAll that you can use in cases where you need to process over multiple metrics? We could even copy the starlark processor and add a starlark aggregator...

@ssoroka what is your opinion on this?

Copy link
Contributor

@ssoroka ssoroka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review, will respond to the other comments separately.

plugins/processors/starlark/starlark.go Outdated Show resolved Hide resolved
plugins/processors/starlark/starlark.go Outdated Show resolved Hide resolved
plugins/processors/starlark/starlark.go Outdated Show resolved Hide resolved
plugins/processors/starlark/starlark.go Outdated Show resolved Hide resolved
@ssoroka
Copy link
Contributor

ssoroka commented Nov 23, 2020

@srebhan

I think it is not guaranteed that the metrics are sent to apply in a time-ordered way. That means you will soon want to extend this to store N past metrics to sort them and might create all kind of corner-cases.

Generally they are as most inputs place them in order and Telegraf keeps them in order, but you're right that it's not a guarantee. It depends on the setup. Other than the complexity, I think storing a set of metrics should still work.

I'd rather think of an option to make the global scope read/write to carry over information between apply calls e.g. to have a counter or similar. However, this will probably create hard-to-debug problems as the plugin becomes stateful.

I considered that, but it might be better for avoiding bugs to make it explicit.

This circumvents the one-metric in multiple-metrics out scheme. Wouldn't it be better to add another function e.g. applyAll that you can use in cases where you need to process over multiple metrics? We could even copy the starlark processor and add a starlark aggregator...

Processing over multiple metrics with batches doesn't really work (where do you draw that batch border, what if you need a metric outside that grouping?). In the future all aggregators are probably going to be reimplemented as processors (likely transparently), as processors + state are exactly aggregators.

@essobedo essobedo changed the title Add a shared cache to get previous data Add a shared state to get previous data Nov 23, 2020
@essobedo essobedo requested a review from ssoroka November 23, 2020 18:43

// Store the pair (key, value) into the shared state. If the value is None, the pair (key, value) will be
// removed from the shared state if it exists
func (s *State) Store(key string, value starlark.Value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a configurable capacity limit here to prevent OOMs?

I can foresee cases where the state store usage might blow up, for example if the name is defined based on tag values and the cardinality unexpectedly increases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good question. I'm not sure if we should handle this or just add a warning to watch out for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One alternative is to just the ability to check the size of the state and clear it if it exceeds a threshold. This could be implemented as additional built-in Starlark functions or just a configuration setting.

Running into an OOM is unlikely but for some of my use cases I need to trust that there is some reasonable guaranteed upper bound for memory usage when dealing with high-volume, high-cardinality data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see how it might be difficult to manage this in Starlark itself.. I'll give some thought to that.

@srebhan
Copy link
Member

srebhan commented Nov 26, 2020

@ssoroka well we currently have the global scope locked as Daniel didn't want the plugin to be stateful to avoid all those nasty hard-to-debug problems. This now adds a state to the plugin. Fair enough, there are plenty of use-cases for it.

I considered that, but it might be better for avoiding bugs to make it explicit.

I don't see how this helps. We still have a state.

We can do the same with inserting globals["state"] = starlark.NewDict(0) just before globals.Freeze() (line 80) and the have a script:

state = {
  "counter": 0,
  "cache": []
}

def apply(metric):
  metric.fields["lala"] = state["counter"]
  metric.fields["bubu"] = len(state["cache"])
  state["counter"] += 1
  state["cache"].append(deepcopy(metric))

  return metric

That's 1 line of code...

@ssoroka
Copy link
Contributor

ssoroka commented Nov 26, 2020

@ssoroka well we currently have the global scope locked as Daniel didn't want the plugin to be stateful to avoid all those nasty hard-to-debug problems. This now adds a state to the plugin. Fair enough, there are plenty of use-cases for it.

All the plugins have state, everything in the plugin struct is stateful.

We can do the same with inserting globals["state"] = starlark.NewDict(0) just before globals.Freeze() (line 80) and the have a script:

state = {
  "counter": 0,
  "cache": []
}

I like this too, but it's maybe a little less obvious that something special is happening with this variable as opposed to other global variables. There's another consideration that I want to soon be persisting plugin state externally to Telegraf, and it would be interesting if this plugin could take advantage of that. If it's a global var I need to reach into the plugin to get the values, but if it's functions they can easily hook into an external state store/load interface.

I'm not sure what you're saying with your example. Here's the same code with Store/Load:

def apply(metric):
  counter = Load("counter") || 0  # we should change this to support defaults, like Load("counter", 0)
  cache = Load("cache") || [] # prefer Load("cache", [])
  metric.fields["lala"] = counter
  metrics.fields["bubu"] = len(cache)
  Store("counter", counter + 1)
  Store("cache", cache.append(deepcopy(metric)))

  return metric

That's 1 line of code...

I count the state global version at 11 lines and the Store/Load funcs at 9; I don't see a huge difference.

@essobedo essobedo requested a review from ssoroka November 27, 2020 22:05
@essobedo essobedo changed the title Add a shared state to get previous data Add the shared state to the global scope to get previous data Nov 30, 2020
@essobedo essobedo requested a review from ssoroka November 30, 2020 20:20
@ssoroka ssoroka merged commit 01fc69d into influxdata:master Nov 30, 2020
@essobedo essobedo deleted the 7793/shared_cache branch December 5, 2020 09:38
@thatsafunnyname
Copy link
Contributor

Thanks for adding the shared state, I found it useful in #8903 .

I think this common question in the Starlark Processor README.md should be updated to reflect the newly available shared state:

https://github.com/influxdata/telegraf/blame/8ddbab47a46e256281392ad0aac876715189c117/plugins/processors/starlark/README.md#L166

How can I save values across multiple calls to the script?

Telegraf freezes the global scope, which prevents it from being modified.
Attempting to modify the global scope will fail with an error.

Maybe add:

A shared global dictionary named state exists, this can be used by the apply function.
See an example of this in plugins/processors/starlark/testdata/compare_metrics.star

Thanks.

@essobedo
Copy link
Contributor Author

@thatsafunnyname Makes sense indeed, feel free to create a dedicated ticket for it

thatsafunnyname added a commit to thatsafunnyname/telegraf that referenced this pull request Mar 1, 2021
For  influxdata#8907 .  After influxdata#8447  , this common question in the Starlark Processor README.md should be updated to reflect the newly available shared state.
arstercz pushed a commit to arstercz/telegraf that referenced this pull request Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/starlark feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants