[Elastic Agent] Allow beats subprocess to define custom configuration. #90

Closed
ph opened this issue Jan 14, 2022 · 19 comments
Labels
Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@ph
Contributor

ph commented Jan 14, 2022

Motivation

Use case 1: The subprocess doesn't load any Beats-specific custom configuration, so each Beat starts with the default configuration defined in the go-ucfg structs in the source code. For example, the k8s processors are enabled by default in the context of elastic-agent when they should not be.

Use case 2: Seccomp is on by default with the default options, and it's not currently possible to override these defaults and prevent forking java-attacher.

Possible solution:

Allow the subprocess to load a custom {process_name}.elastic-agent.yml when it starts. This could be added as part of the specs, and the developer would be able to define the configuration that should be used.

Something like this in the spec:

args: [
  "-c", "{process_name}.elastic-agent.yml",
  "-E", "management.enabled=true",
  "-E", "gc_percent=${APMSERVER_GOGC:100}"
]
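
To make the spec fragment above concrete, here is a minimal Python sketch of how the {process_name} placeholder and the ${VAR:default} form might be expanded before a subprocess is launched. The helper name render_args is hypothetical and is not part of Elastic Agent. In practice the Beat's own config loader may resolve the ${...} form itself; it is expanded up front here only to make the result visible.

```python
import re

# Hypothetical helper, not part of Elastic Agent. It only illustrates how the
# proposed spec args could be expanded for a given subprocess.
_ENV_WITH_DEFAULT = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def render_args(args, process_name, env):
    """Expand {process_name} and ${VAR:default} in a list of spec args."""

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        # Fall back to the default (or an empty string) when VAR is unset.
        return env.get(name, default if default is not None else "")

    rendered = []
    for arg in args:
        arg = arg.replace("{process_name}", process_name)
        rendered.append(_ENV_WITH_DEFAULT.sub(_substitute, arg))
    return rendered


spec_args = [
    "-c", "{process_name}.elastic-agent.yml",
    "-E", "management.enabled=true",
    "-E", "gc_percent=${APMSERVER_GOGC:100}",
]

# With no APMSERVER_GOGC in the environment, the default of 100 is used.
filebeat_args = render_args(spec_args, "filebeat", {})
```

Passing the environment in as a plain dict keeps the sketch easy to test; a real launcher would more likely read os.environ.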

Questions:

Does this introduce a security issue?

@ph ph added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Jan 14, 2022
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@ph
Contributor Author

ph commented Jan 14, 2022

I don't think that would introduce a bigger security issue. The files are owned by the right users; if you have already escalated privileges and are able to edit the file, I think you can cause more problems than that.

@simitt
Contributor

simitt commented Jan 17, 2022

Allow subprocess to load custom {process_name}.elastic-agent.yml when it starts, this could be added as part of the specs and the developer would be able to define a configuration that should be used.

@ph would this eventually be extensible to user configuration, and what would that look like?
elastic/apm-server#6238 is an example where the default policy of standalone beats was missing a syscall used by the latest version of glibc. If seccomp filters are applied with a default policy, we either need to ensure that default policies cover all supported architectures (hard to fully ensure) or allow user customization.

@ph
Contributor Author

ph commented Jan 17, 2022

@simitt It could eventually be extended to user configuration.

I am looking a bit more into security and how we could lock down stuff, but I presume we could add a list of seccomp rules to the integration that the agent would aggregate and restart or apply it to the corresponding input.

Whether a user should be able to change it is an open question; maybe it's an opt-in feature when managed in Fleet.
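
The aggregation idea in this comment could be sketched roughly as follows. This is an illustration only: the function aggregate_seccomp and the policy shape (a "syscalls" list of rules with "action" and "names") are assumptions made for the example, loosely modelled on how Beats seccomp settings are written, and nothing here reflects an existing agent feature.

```python
def aggregate_seccomp(policies):
    """Merge per-integration syscall rules into one policy, grouped by action.

    Hypothetical sketch: each policy is assumed to look like
    {"syscalls": [{"action": "allow", "names": ["rseq", ...]}, ...]}.
    """
    merged = {}
    for policy in policies:
        for rule in policy.get("syscalls", []):
            names = merged.setdefault(rule["action"], [])
            for name in rule["names"]:
                # Keep first-seen order and drop duplicates across integrations.
                if name not in names:
                    names.append(name)
    return {
        "syscalls": [
            {"action": action, "names": names}
            for action, names in merged.items()
        ]
    }


# Two made-up integration policies that overlap on one syscall.
integration_a = {"syscalls": [{"action": "allow", "names": ["rseq", "fork"]}]}
integration_b = {"syscalls": [{"action": "allow", "names": ["fork", "clone3"]}]}

combined = aggregate_seccomp([integration_a, integration_b])
```

The agent would then apply the combined policy to the corresponding input, or restart the subprocess with it, as the comment suggests.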

@sgtsnooky

Hi all,
I am still experiencing this issue on 8.1.2.

[elastic_agent.filebeat][error] Error extracting container id

@jlind23 jlind23 removed the v8.3.0 label May 25, 2022
@WoodyWoodsta

As per elastic/beats#27216 (comment), I was going to open a new issue about the error Error extracting container id, but after doing a bit more testing, I'm not entirely sure it's a separate case. So I thought I would add my case here:

I have kubernetes container logs enabled in the kubernetes integration. I also have the system integration. The following cases present:

  1. If I enable Collect... system syslog logs but the syslog file is not present in the configured directory, I get no errors
  2. If I disable Collect... system syslog logs but the syslog file is present in the configured directory, I get no errors
  3. If I enable Collect... system syslog logs and the syslog file is present in the configured directory, I get errors

Is this related to offending default config or is this a separate issue?

@WoodyWoodsta

WoodyWoodsta commented Aug 31, 2022

After some further experimentation, it seems that sadly the issue is caused by enabling any log ingestion (be that from the system integration or a custom log integration). At the moment this makes ingesting logs from anything other than k8s container logs impossible, given the number of errors received. Is there a way around this?

If not, we'd appreciate an update on this, as it's currently blocking a fairly common use case in my opinion!

@amagno

amagno commented Sep 23, 2022

Hello everybody.
We started trying Elastic Cloud and unfortunately we are hitting the same problem mentioned above when using elastic agent in AKS and configuring integrations to collect logs from files.
We are using managed agent version 8.4.2, and the agent logs show this:

[elastic_agent.filebeat][error] Error extracting container id - source value does not contain matcher's logs_path '/var/lib/docker/containers/

We didn't find clear solutions to the problem.
So I appreciate any update on the matter.
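
For readers trying to interpret the error above, the following is a simplified Python illustration of what the message suggests: a matcher that expects every log source path to contain a configured logs_path and takes the next path segment as the container id. It is not the actual Beats code; the function name and the example paths are made up for the sketch, and only the logs_path value comes from the log line quoted above.

```python
# Simplified illustration only; this is NOT the actual Beats matcher code.
DOCKER_LOGS_PATH = "/var/lib/docker/containers/"  # value quoted in the error above


def extract_container_id(source, logs_path=DOCKER_LOGS_PATH):
    """Return the path segment that follows logs_path, or raise if it is absent."""
    if logs_path not in source:
        raise ValueError(
            "source value does not contain matcher's logs_path %r" % logs_path
        )
    remainder = source.split(logs_path, 1)[1]
    return remainder.split("/", 1)[0]


# A Docker-style container log path matches (hypothetical id "abc123").
container_id = extract_container_id(
    "/var/lib/docker/containers/abc123/abc123-json.log"
)

# A plain host log file such as syslog does not contain logs_path.
try:
    extract_container_id("/var/log/syslog")
    syslog_matched = True
except ValueError:
    syslog_matched = False
```

Under this reading, any log file outside the configured logs_path would trigger the error, which would be consistent with the reports above that it appears as soon as non-container log ingestion is enabled.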

@gmontoro

Same for agent 8.4.2 with a cluster running containerd. I think the use of containerd is the issue, since the libs from docker are therefore not available either:
"Docker is no longer supported as of September 2022. For more information about this deprecation, see the AKS release notes." https://learn.microsoft.com/en-us/azure/aks/cluster-configuration

Can you please provide an update on this issue?

@fludo

fludo commented Oct 24, 2022

This will be a no-go for using Elasticsearch for observability if containerd is not supported correctly. Any clue when this will be addressed? @cmacknz?

@cmacknz
Member

cmacknz commented Oct 24, 2022

Please open a separate enhancement request to add containerd support so it can be triaged appropriately. The original description of this issue is unrelated as far as I can tell.

@Happycoil

@cmacknz I opened a separate issue: #1614

@WoodyWoodsta

@cmacknz @Happycoil Sorry - why is this considered a separate issue?

@gmontoro's comment was the one about containerd but nowhere does it suggest that this issue is scoped to just containerd? I thought this was a configuration issue, as per the original description?

@Happycoil

@WoodyWoodsta I'm not sure, there's some confusion here among the myriad of different issues and threads. As far as I can tell this issue describes part of the solution to the problem being described by AKS users, but it doesn't appear to be triaged properly. This shouldn't have been a problem for over a year if there was a clear understanding of the consequences. Signing up to Elastic Cloud and connecting it to AKS just doesn't work properly, and Elastic is potentially losing sales because of what is apparently a small config problem.

@WoodyWoodsta

I'm not a cloud customer, so it's not just that market. My stack is bare-metal kubeadm + containerd, so I can see how it might appear to be containerd-related, but to me it sounds like a red herring.

@cmacknz
Member

cmacknz commented Oct 26, 2022

Thanks for the comments everyone, it seems there is a real bug here with the agent running on AKS which we are not tracking properly.

The original purpose of this issue was to add a feature to the agent to enable configuration overrides for processes that the agent starts, which I suppose could have been a workaround for the problem here, but the issue description does not read as a bug affecting many customers or any agent deployed to AKS.

#1614 clarifies this and the scope of impact.

@endorama
Member

endorama commented Feb 14, 2023

Hello, to add to Use case 1 mentioned in the description, it seems the add_cloud_metadata processor is also enabled by default (in my case in metricbeat, if my understanding of this piece of code is correct) even in cases where this leads to misleading data.

@cmacknz
Member

cmacknz commented Feb 15, 2023

Controlling global processors will be addressed with https://github.com/elastic/ingest-dev/issues/2442

@jlind23
Contributor

jlind23 commented Sep 17, 2024

Closing this as not aligned with our long term strategy, will reopen later if required.

@jlind23 jlind23 closed this as completed Sep 17, 2024
@jlind23 jlind23 reopened this Sep 17, 2024
@jlind23 jlind23 closed this as not planned Sep 17, 2024