Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ingest Management] Agent supports capabilities definition #23848

Merged
merged 40 commits into from
Feb 15, 2021

Conversation

michalpristas
Copy link
Contributor

@michalpristas michalpristas commented Feb 4, 2021

What does this PR do?

This PR introduces capabilities as described in #21000
Consists of first step of implementation - bare input/output/upgrade filter without any additional filtering options such as only subset of metricsets etc.

How this works is that there is a Capability injected down to components which checks objects passed to Apply fn.
This capability has set of input/output/upgrade capabilities based on description in capabilities.yml file. This file sits next to agent config and contains a list of definitions.

Each of these capabilities decides whether input object is interesting and capability can operate on it or not.
If so it checks for whatever it is configured for and returns updated object.
It always returns object of the same type as it was passed in.
E.g input capability is able to operate on maps and on AST. if ast is passed in it can never happen map is returned.

When agent rules out inuput/output log is written with ERROR log level and health of agent is degraded as discussed in elastic/kibana#76841

Feature works for standalone and fleet managed mode.

sample capabilities.yml allowing system metrics and nothing else

version: 0.0.1
capabilities:
- rule: allow
  input: system/metrics
- rule: deny
  input: "*"

Testing

The ideal way how to play with this is to package an agent and then modify elastic-agent.yml, capabilities.yml and invoking ./elastic-agent inspect to see how the resulting config looks like

Why is it important?

#21000

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@michalpristas michalpristas added enhancement needs_backport PR is waiting to be backported to other branches. Team:Ingest Management Team:Elastic-Agent Label for the Agent team labels Feb 4, 2021
@michalpristas michalpristas self-assigned this Feb 4, 2021
@@ -0,0 +1,4 @@
- rule: allow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a top level entry capabilities instead of directly the array would allow use to extend it later with other things if needed. For example the version of the capabilities file.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on adding a version from day 1.

@michalpristas
Copy link
Contributor Author

/package

@michalpristas
Copy link
Contributor Author

/package

@@ -142,7 +154,7 @@ func (e *emitterController) update() error {
return e.router.Dispatch(ast.HashStr(), programsToRun)
}

func emitter(ctx context.Context, log *logger.Logger, agentInfo *info.AgentInfo, controller composable.Controller, router programsDispatcher, modifiers *configModifiers, reloadables ...reloadable) (emitterFunc, error) {
func emitter(ctx context.Context, log *logger.Logger, agentInfo *info.AgentInfo, controller composable.Controller, router programsDispatcher, modifiers *configModifiers, caps capabilities.Capability, reloadables ...reloadable) (emitterFunc, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The number of parameters gets a little out of hand. Do we really need this function, or can we just use the struct and have a separate "init" or "start" function/method?

Do we pass a tuple of ctx, log, agentInfo often? In the future we might even want to pass something for metrics and APM tracing. The 'tuple' could be combined into an appContext struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm +1 on this here. would be worth a small refactor across all the codebase, we have this pattern i think on multple places. these things were simple at first but got bloated over time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The emitter function is actually become more of a wrapper for the emitterController with the addition of dynamic inputs and conditions.

We should probably refactor that into a single controller.

x-pack/elastic-agent/pkg/capabilities/capabilities.go Outdated Show resolved Hide resolved
x-pack/elastic-agent/pkg/capabilities/expr.go Show resolved Hide resolved
type Capability interface {
// Apply applies capabilities on input and returns true if input should be completely blocked
// otherwise, false and updated input is returned
Apply(interface{}) (bool, interface{})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface looks hard to use correctly long term.

Some capabilities seem to accept map[string]interface{} and in some places an "AST" object is required. The return type is also not so clear.

Is the return type always map[string]interface{} or can it be an AST? Does the return type depend on the input type.

Will a capability modify the input, or will it always create a new object?

For the implementor of a capability: how do I know which input type I must accept? Does it depend on where the capability will be applied at? Do we have capability that need to implement multiple possible input types?

For example inputCapability: blocked is false if the Input gets passed the wrong type. The inputCapability even silently ignores programmer errors if I pass in the wrong type by accident in the future.

Can we settle on a single type as input to a Capability to ease maintenance in the future? In case we really need the two implementations for Apply, let's use the type system to reflect this e.g. ApplyConfig(m map[string]interface{}) ... and ApplyAST(...) (maybe even in a separate interface).
If possible I would prefer concrete types over anything interface{}.

How about returning an error instead of a boolean? e.g. common signals would be defined as constants ErrBlocked, while we still will be able to pass errors up if an instances is used wrongly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was aiming for abstract interface, for some capabilities it may be map, it may be struct, it may be both. depends on a place where this is coming from.
e.g for upgrade it is an Action coming form fleetapi or object conforming to interface, for input it can be map or AST, or if we implement it elsewhere it may be also an Action. Separate implementation would lead to more objects there, and also on the top level side we would probably need all of them methods. (at the level of top capability returned by Load whcih is then passed and injected down the stream)

it's up to implementor of capability to be aware of the hooks and places where data are coming from.

returning an error seems reasonable

x-pack/elastic-agent/pkg/capabilities/capabilities_test.go Outdated Show resolved Hide resolved
Input string `json:"input" yaml:"input"`
}

func (c *inputCapability) Apply(in interface{}) (bool, interface{}) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputCapability seems to be internal to this package only. Actually seems to be internal to the multiInputsCapability. Why not accept and return concrete types such that multiInputsCapability does not need to do special casts and runtime type checks validating the output of inputCapability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i delegated this responsibility to the latest caller. in case inputCapability is used outside of multiInput i wanted to have checks in place.
it might be redundant but it's not run that often

x-pack/elastic-agent/pkg/capabilities/input.go Outdated Show resolved Hide resolved
if inputs := inputsMap(inputsIface, c.log); inputs != nil {
renderedInputs, err := c.renderInputs(inputs)
if err != nil {
c.log.Errorf("marking inputs failed for capability '%s': %v", c.name(), err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Structure logging. If possible, let's include input id and other meta-data (if present) that allow us to identifiy the input: c.log.With(...).Errorf or c.log.Errorw.

x-pack/elastic-agent/pkg/capabilities/expr.go Show resolved Hide resolved
@michalpristas
Copy link
Contributor Author

/package

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like @urso gave a good review. I didn't catch anything that you have not already fixed. +1 from me.

@michalpristas
Copy link
Contributor Author

/package

@mdelapenya
Copy link
Contributor

mdelapenya commented Feb 15, 2021

/package

Hey @michalpristas, I'm seeing the following error while running the standalone tests (using the docker image):

[fleet_elastic-agent_1][docker.elastic.co/observability-ci/elastic-agent:pr-23848][e52698e5e9c49fd623fdffaadae8be1b883b05627aa09938f23a234b93273122][2021-02-15T09:28:28.128Z] standard_init_linux.go:219: exec user process caused: exec format error

Container logs: https://beats-ci.elastic.co/job/e2e-tests/job/e2e-testing-mbp/job/master/351/artifact/docker_logs_fleet_stand_alone_agent%20&&%[email protected]

do you think it's related?

@michalpristas
Copy link
Contributor Author

@mdelapenya exec format i dont think so, maybe arm?

@michalpristas
Copy link
Contributor Author

tried linux locally and it works. @mdelapenya is talking to the team to see what's causing failures. will not merge until i know it's not related

@ruflin
Copy link
Contributor

ruflin commented Feb 15, 2021

@michalpristas @urso I assume in any case we should not merge when CI is red? I guess this also applied for E2E?

@michalpristas
Copy link
Contributor Author

michalpristas commented Feb 15, 2021

failures don't look related, something with e2e test in general

@michalpristas michalpristas merged commit 282a7bc into elastic:master Feb 15, 2021
michalpristas added a commit to michalpristas/beats that referenced this pull request Feb 15, 2021
…3848)

[Ingest Management] Agent supports capabilities definition (elastic#23848)
)

type ruler interface {
Rule() string
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 'Rule' supposed to return? Maybe we should name the method RuleType()?

Why do we need the return type "string" here? AFAICT the only valid values are allow and deny. If valid values are restricted we should introduce an enum.

I don't see the method Rule() to be used anywhere? Do you plan to use it somewhere in the future?

type Capability interface {
// Apply applies capabilities on input and returns true if input should be completely blocked
// otherwise, false and updated input is returned
Apply(interface{}) (interface{}, error)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still would say the interface is doing to much, while not really abstracting things.

The input and output types are interface{}, that is the internal representation of the objects are not really abstracted away behind coherent interfaces. Instead we rely on internal knowledge of the potential structures to be passed from different call sides. Due to the need to downcast and analyze the internal structure, we fail to have proper abstractions here. Instead we allow for potential bugs being introduced in the future in case we change the internal representation for e.g. Inputs to a proper go-type without finding and updating the right capabilities.

The Apply method indicates that we are doing a transformation. Yet we only want to filter. Would it make sense to change the interface to act more like a predicate and have the code to move through the configured structures in some other central place?
My understanding is that we have the 'walking' on the structure already in the transpiler, why should we replicate the walking in the capabilities?

E.g.

type Capability interface {
  // TestFeature returns nil if the kind and typeName are accepted by the capability.
  // ErrDeny is returned if the capability rejects the current feature.
  TestFeature(kind ComponentKind, typeName string) (err error)
}

var ErrDeny = errors.New("deny")

type ComponentKind int

const (
  ComponentInput ComponentKind = iota
  ComponentOutput
)

Alternatively (in case we want transformations), we might want to pass the root configuration object to Apply. In that case only the implementation is abstracted away, but the input type is known. If we change the go type for inputs/outputs to proper structures, the actual implementation of the input capability should fail to compile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement needs_backport PR is waiting to be backported to other branches. Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants