Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[agents] definition of ACTIVE/DISABLED_INSTRUMENTATION #92

Closed
axw opened this issue May 28, 2019 · 23 comments
Closed

[agents] definition of ACTIVE/DISABLED_INSTRUMENTATION #92

axw opened this issue May 28, 2019 · 23 comments

Comments

@axw
Copy link
Member

axw commented May 28, 2019

This is an offshoot of #76.

There are some questions about whether active and disableInstrumentations config should be reloadable, or whether they should be static.

Currently, in the Node.js agent, starting the agent with active: false would skip instrumentation entirely. This means that it is not possible to dynamically "activate" the agent.

Elsewhere, users of the Java agent have assumed that setting active: false would completely disable the agent: elastic/apm-agent-java#639. Given the config's name, this is pretty a fair assumption. In that case it was due to metrics which we can fix, but still leaves a question about remote config polling in the future.

So, open questions:

  • What should active: false do? Should this just disable event generation, or should it disable all code paths, including remote config polling?
    • If active: false disables remote config polling, should we have some other config that allows the agent to run "dormant" - polling remote config, but not generating events?
    • If active: false disables only event generation, should we have some other config which disables everything, including remote config polling?
  • Should we have ternary state for disabling instrumentation? On, off (but possible to turn on), skipped entirely (never instrumented in the first place - cannot turn on).
@SergeyKleyman
Copy link
Contributor

SergeyKleyman commented May 28, 2019

I would prefer to have two separate Boolean configuration settings:

  • enabled (or any other suitable name) - this setting is read by the agent only during the startup and if it is set to false the agent should do as little as possible in order to simulate the state as if the agent is not there at all. In particular this would mean not communicating with APM server (we can discuss if it also means not writing any logs)
  • active - unlike enabled the agent should support changing this configuration setting after the startup. If it's set to false the agent should "shrink" its footprint to the "minimum" as long as the first part of this requirement is satisfied (so agent should continue polling remote config and it should be able to go back and forth between "shrunk to minimum" and "fully functioning" states). The goal in minimizing the footprint is the same as in enabled - to simulate the state as if the agent is not there at all but there might be additional development cost, performance, etc. constraints affecting how small is this minimal footprint can be so we can leave it for each agent team to decide.

@axw
Copy link
Member Author

axw commented May 28, 2019

@SergeyKleyman I was thinking the same. The main issue I have with this is the subtle difference in meaning between "enabled" and "active" - I've not had any better ideas for an alternative name to "enabled". I suppose these are not things that people will want to do every day, so maybe that's not a big issue anyway.

@SergeyKleyman
Copy link
Contributor

In addition to introduction of enabled we can also rename active setting to stress the distinction between them - maybe paused? The additional advantage of having completely new configuration settings would be easier migration for agents that had different semantics for active.

@mikker
Copy link
Contributor

mikker commented May 28, 2019

I like @SergeyKleyman's suggestion and I agree we could probably come up with more telling names.

@mikker
Copy link
Contributor

mikker commented May 28, 2019

Didn't think too long but how about keeping enabled for completely turned off or not, and recording for sending data or not.

@SergeyKleyman
Copy link
Contributor

Yes, I agree - recording is better than paused because it's aligned with enabled in that they are both "positive" :) (both set to true is the default, fully functioning state).

@watson
Copy link
Contributor

watson commented May 28, 2019

I like the names enabled and recording 👍 By not giving active a new meaning it also leaves us with the option to simply alias it to enabled, deprecate it, and remove it in a major or two.

Just to ensure there are no misunderstandings, the recording flag is not to be confused with the disableSend flag that a few of the agents have. Setting disableSend: false has a bit more overhead than recording: false will have because the latter will be as close to a no-op as possible.

Regarding disableInstrumentations, I'd still like this config option to behave in a way that's non-reversible - similar to the new enabled flag we're discussing.

@felixbarny
Copy link
Member

Will recording: false also disable metrics?

@watson
Copy link
Contributor

watson commented May 28, 2019

@felixbarny I hope so. In my mind, this is as close as we can get to enabled: false just with the ability to turn it on via remote config later. So it has to still poll /config, but that's it.

@Qard
Copy link

Qard commented Jul 17, 2019

Personally I think we should just rely on more focused config options like remoteConfig: false. We're coming into this issue because of the gradually increasing complexity of adding more and more new features. As new features are added, the meaning of our configs may change subtly, if they are not more specific. I'm sure we'd eventually find another case where enabled or recording is not nuanced enough to fully describe what it's trying to do and needs to be split up again. 🤷‍♂

@SergeyKleyman
Copy link
Contributor

SergeyKleyman commented Aug 8, 2019

@Qard I agree that it's easier to understand and support configuration options that are as orthogonal/independent in their semantics as possible. OTOH use case for making agent's presence as less felt as possible without restarting the application is IMHO so widespread that making an exception for it and introducing an "aggregate" configuration option (instead of having to remember each one of the "atomic" configuration options it's composed of) is worth it.

@axw
Copy link
Member Author

axw commented Aug 9, 2019

Agreed, we can add more focused config as well, but having the ability to disable everything or disable just data recording is user-friendly.

What we are voting on

  • Introduce enabled config: true by default, setting to false will completely disable the agent, including instrumentation and remote config polling. This can only be set locally to the agent (e.g. in code, config file, environment variable), and is interpreted once at agent initialisation time; once disabled, disabled for the process's lifetime.
  • Introduce recording config: true by default, setting to false will completely disable the recording and sending of data (transactions, spans, errors, and metrics). It must be possible to toggle recording dynamically, either locally or via remote config. Setting recording to false will therefore not disable remote config polling, config file watching, etc.
  • Deprecated active config. Its behaviour will remain the same (which varies across agents).
  • disabledInstrumentations config should not be dynamic. Like enabled, once a module is disabled it is disabled for the process's lifetime.

Vote

Agent Yes No Indifferent N/A Link to agent issue
.NET
elastic/apm-agent-dotnet#122
Go
elastic/apm-agent-go#661
Java
elastic/apm-agent-java#1086
Node.js
elastic/apm-agent-nodejs#1271
Python
elastic/apm-agent-python#575
Ruby
elastic/apm-agent-ruby#623
RUM
elastic/apm-agent-rum-js#459

@Qard
Copy link

Qard commented Aug 9, 2019

Does recording apply to errors too, or no? You make no specific mention of it there.

@axw
Copy link
Member Author

axw commented Aug 9, 2019

@Qard yeah, I just forgot about them :) Thanks, updated.

@felixbarny
Copy link
Member

felixbarny commented Aug 9, 2019 via email

@axw
Copy link
Member Author

axw commented Aug 9, 2019

@felixbarny my bad, I forgot we had discrepancies. Agree, let's leave active alone.

@watson
Copy link
Contributor

watson commented Aug 9, 2019

@axw what do you mean "leave it alone"? As in we shouldn't deprecate it after all? If so, what's the use-case of keeping it around?

@axw
Copy link
Member Author

axw commented Aug 9, 2019

@watson sorry for being unclear in my comment. I think the updated description is clearer:

Deprecated active config. Its behaviour will remain the same (which varies across agents).

i.e. just put a note in the docs saying that it's replaced by enabled/recording, and that it'll be removed in a future version.

@mikker
Copy link
Contributor

mikker commented Aug 9, 2019

And custom instrumentation with an agent with enabled: false will be noop, right? So people can leave the instrumentation code as is.

@felixbarny
Copy link
Member

And custom instrumentation with an agent with enabled: false will be noop, right? So people can leave the instrumentation code as is.

IIUC, both enabled=false and recording=false would have that effect.

So for the Java agent, instrument will be remapped to enabled and active will be remapped to recording and that's it, right?

@watson
Copy link
Contributor

watson commented Aug 9, 2019

So for the Java agent, instrument will be remapped to enabled and active will be remapped to recording and that's it, right?

@Qard we could consider doing the same. Though for historic reasons, instrument: false would leave on automatic capturing of errors and uncaught exceptions, whereas recoding: false would not 🤔 So it's a breaking change if we would do that... so probably not worth it at the moment.

@felixbarny
Copy link
Member

felixbarny commented Aug 12, 2019

Eyal made me aware that instrument actually has slightly different semantics. It would disable all auto-instrumentations but still allows to manually create transactions with the public API and the OpenTracing bridge. We wouldn't want to get rid of that functionality, so I guess we will then have enabled (new) recording (renamed from active), instrument (as-is), disabled_instrumentations (as-is)

@SergeyKleyman
Copy link
Contributor

I think we agreed on the new configuration options and an issue to implement was created for each agent so we can close this issue, any objections?

beniwohli added a commit to elastic/apm-agent-python that referenced this issue May 1, 2020
* introduce enabled/recording settings

see elastic/apm#92 (comment)

Co-authored-by: Colton Myers <[email protected]>
romulorosa pushed a commit to romulorosa/apm-agent-python that referenced this issue Oct 15, 2020
* introduce enabled/recording settings

see elastic/apm#92 (comment)

Co-authored-by: Colton Myers <[email protected]>
romulorosa pushed a commit to romulorosa/apm-agent-python that referenced this issue Oct 15, 2020
* introduce enabled/recording settings

see elastic/apm#92 (comment)

Co-authored-by: Colton Myers <[email protected]>
beniwohli added a commit to beniwohli/apm-agent-python that referenced this issue Sep 14, 2021
* introduce enabled/recording settings

see elastic/apm#92 (comment)

Co-authored-by: Colton Myers <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants