[DESIGN][Agent] Minimizing Elastic-Agent privileges #147

andrewvc · 2021-09-14T13:12:02Z

Action plan after meeting today with @blakerouse @fntlnz and @justinkambic

There are three use cases for elastic-agent with different security requirements, where we can have three different behaviors.

For docker containers specifically, we need a clear path to running as non-root for two reasons:

Running as root will be flagged by many orgs as insecure.
Some software (synthetics) cannot run as root, so we need consistent guidance; today we have to advise people to run as different users for different use cases.

New Behavior by Use Case

Install command on local machine

Keep running as root
Individual beats can downgrade privileges / setuid as needed (see [Heartbeat] Setuid to regular user / lower capabilities when possible beats#27878 which does this in just heartbeat as an example)

Run in docker with `docker run`

No need to run as root because we don't run elastic endpoint security; we should recommend running as the elastic-agent user
We will need to use setcap to add privileges to the elastic-agent binary
Individual beats should downgrade privileges via setcap as needed
If you want to run endpoint then you'll need to run a separate container with

docker run --network agent elastic-agent
docker run --network agent --privileged elastic-endpoint

Run in kubernetes

Run a pod for agent that contains an unprivileged container for elastic-agent, and a privileged container for elastic-endpoint

Tasks:

Elastic-agent docs updated to recommend running as regular user
Use setcap in elastic-agent docker container to add all required capabilities as inheritable so subprocesses can use privs
Modify individual beats to setuid / setcap/ downgrade for the local machine use case
Use setcap in subprocesses in container to drop unneeded privileges

elasticmachine · 2021-09-14T13:34:32Z

Pinging @elastic/agent (Team:Agent)

andrewvc · 2021-09-14T20:59:07Z

I believe that in a k8s environment hostPath volumes still present a problem. See elastic/beats#19600 . @jsoriano can you add your thoughts here?

jsoriano · 2021-09-15T08:39:04Z

No need to run as root because we don't run elastic endpoint security

Is this issue focused on Uptime?

In any case I think this is a risky assumption. A user of Elastic Agent for any use case may decide to install a different integration in the future that needs further privileges. If they do, they will probably run into weird failures and end up having to replace their installation of Elastic Agent, or run several of them, which may undermine the user experience intended with Agent/Fleet.

The default experience should assume that Agent can run any integration. As a process supervisor, it should be understandable that its default is running fully privileged. There can be options to run with fewer privileges, and we should document them, but we have to think of this as a unified user experience, considering what happens if a user associates a policy with an agent that doesn't have the privileges to run it.

4. If you want to run endpoint then you'll need to run a separate container with
docker run --network agent elastic-agent
docker run --network agent --privileged elastic-endpoint

This would also undermine the experience intended with Agent/Fleet. What is the benefit of this new experience if you still need to run agents individually?

Individual beats should downgrade privileges via setcap as needed

Modify individual beats to setuid / setcap/ downgrade for the local machine use case

I consider this a good practice for any application, but I think it'd be better if we don't rely on this to ensure the minimum privileges principle. I would propose a security model where Elastic Agent controls the privileges of the processes it executes. The main reasons for that:

Elastic Agent may execute processes of different natures, which would require different implementations of capabilities management, which is error-prone. Consider that Agent already runs Beats and Endpoint, and may run other collectors in the future. Running all of them with full privileges, trusting that they will do the right thing afterwards, is a risk.
This is a common practice (docker and other container runtimes run containers by default with a reduced set of capabilities, execution can be tuned to increase privileges, systemd and other service supervisors have features to control the capabilities of the services they run...).
In the future this could allow deciding the capabilities required per enabled integration; for example, metricbeat with the system module enabled would be executed with more capabilities than metricbeat monitoring only a remote apache.
As well as controlling privileges, it could also run collectors as different users, solving the mentioned problem with synthetics.
When running with reduced privileges, Elastic Agent may inform Fleet of its capabilities so it can give feedback to the user about the available options to run more privileged integrations. Or it can reject the execution of a policy if it doesn't have enough privileges, providing meaningful guidance to the user at the moment of trying to associate the policy (instead of blindly running it till something fails, and then having to investigate through logs and so on).
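As a rough illustration of the last point, the sketch below decodes the `CapEff` mask from `/proc/<pid>/status` and compares it with what a policy needs. It is not how Elastic Agent or Fleet works: the function names and the idea of a per-policy `required` set are assumptions made for the example, and only a subset of the Linux capability numbers is listed.

```python
# Hypothetical sketch: function names and the "required" set are illustrative.
# Bit positions follow the Linux capability numbering (subset only).
CAP_BITS = {
    "cap_chown": 0,
    "cap_dac_override": 1,
    "cap_dac_read_search": 2,
    "cap_setgid": 6,
    "cap_setuid": 7,
    "cap_net_bind_service": 10,
    "cap_net_admin": 12,
    "cap_net_raw": 13,
    "cap_sys_ptrace": 19,
    "cap_sys_admin": 21,
}


def decode_cap_mask(hex_mask):
    """Return the known capability names set in a CapEff-style hex mask."""
    mask = int(hex_mask, 16)
    return {name for name, bit in CAP_BITS.items() if mask & (1 << bit)}


def effective_caps_from_status(status_text):
    """Extract and decode the CapEff line from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_cap_mask(line.split(":", 1)[1].strip())
    return set()


def missing_capabilities(required, granted):
    """Capabilities a policy needs that the agent does not hold."""
    return set(required) - set(granted)
```

An agent could report the decoded set upstream, and an empty result from `missing_capabilities` would mean the policy can be accepted; a non-empty one gives the names to show to the user.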

This model would be based on:

Elastic Agent runs any collector by default with a reduced set of capabilities.
Any collector (or integration in the future?) may override these defaults with configuration in their spec.
As a good practice, collectors may still further downgrade their privileges if wanted, but not required.
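A minimal sketch of this model, assuming a Docker-like add/drop scheme: the default set, the `cap_add` / `cap_drop` spec keys and the function name are all hypothetical, since the thread does not define a spec format for capabilities.

```python
# Hypothetical defaults; the thread does not say what the reduced set would be.
DEFAULT_CAPABILITIES = frozenset({"cap_net_bind_service"})


def resolve_capabilities(spec):
    """Return the capabilities a collector would be started with.

    The collector spec may extend the defaults with "cap_add" or shrink
    them with "cap_drop" (both keys are assumptions for this sketch).
    Without either key, the reduced default set applies.
    """
    added = frozenset(spec.get("cap_add", ()))
    dropped = frozenset(spec.get("cap_drop", ()))
    # Drops win over adds, so a spec can never keep a capability it drops.
    return (DEFAULT_CAPABILITIES | added) - dropped
```

With this shape, a collector that needs nothing special declares nothing, and one that needs raw sockets would list only `cap_net_raw` under `cap_add`.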

I believe that in a k8s environment hostPath volumes still present a problem. See elastic/beats#19600 . @jsoriano can you add your thoughts here?

In some restricted k8s environments hostPath cannot be used. This is a problem for use cases where you want to persist state between executions or after upgrades. This is especially important for filebeat, probably not so much for heartbeat. Solutions for this are not straightforward; they will depend on the available volume providers in the environment.

andrewvc · 2021-09-15T13:01:52Z

All good points @jsoriano, however, one concern @joshbressers has had is that users may be reluctant or unable to run the docker container as root, esp. in large environments with strict security policies. I'd argue that elastic-agent is less akin to systemd or another "process supervisor" in that context, it's simply the user app to be run.

WRT how the processes are invoked, I agree it'd be nice to have elastic-agent do it instead of the processes themselves. Another model could be just using the setcap command to set capabilities on the filesystem for the respective binaries, we could do that at build time if elastic/beats#27651 were implemented.

jsoriano · 2021-09-15T15:33:31Z

I'd argue that elastic-agent is less akin to systemd or another "process supervisor" in that context, it's simply the user app to be run.

Yes, you are right, Agent being a process supervisor is an implementation detail, nothing that a user can see as a reason to have more privileges.
Still, I think we have to account for users configuring integrations that require more privileges than the ones given to the Agents.

Another model could be just using the setcap command to set capabilities on the filesystem for the respective binaries, we could do that at build time if elastic/beats#27651 were implemented.

Yes, this could be a good idea in any case.

andrewvc · 2021-09-21T20:06:57Z

I think, given the valid concerns @jsoriano has raised, let's proceed with merging elastic/beats#27878 and postpone future work for now. That solves the use cases we need on our team, and we probably don't have the bandwidth for a larger scale fix at this point.

marclop · 2021-11-29T03:51:39Z

I'm taking a look at having the apm-server not run as root when the elastic-agent is run as root and what our options are. We seem to have decided to not manage the user/group for binaries that are run by the elastic-agent and have the beats themselves change their user/group and set capabilities.

I would like us to revisit that decision, ideally allowing beats to specify which user:group they would like to be run as, instead of requiring each individual beat to implement the logic that heartbeat currently has to change its user:group and optionally set specific capabilities.

Ideally, the elastic-agent should allow beats to specify the user:group that it should be run as, as well as any additional capabilities that the beat requires in order to run successfully:

name: APM-Server
cmd: apm-server
artifact: apm-server
...
user: elastic-agent
group: elastic-agent
# APM server doesn't require any additional capabilities, but they could be specified as:
# linux_capabilities: 'cap_net_raw+ep'
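To make the proposal concrete, here is a sketch of how a supervisor might read those fields. The `user`, `group` and `linux_capabilities` keys come from the example spec above; the function names are hypothetical, and the parser handles only a single `+` clause such as `cap_net_raw+ep`, which is much less than the full libcap text format.

```python
import re

# One clause only: comma-separated capability names, '+', then flags e/i/p.
_CLAUSE = re.compile(r"^(?P<caps>[a-z_,]+)\+(?P<flags>[eip]+)$")


def parse_cap_clause(text):
    """Parse a single 'cap_a,cap_b+ep' clause into (names, flags)."""
    match = _CLAUSE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported capability clause: {text!r}")
    names = [c for c in match.group("caps").split(",") if c]
    return names, set(match.group("flags"))


def launch_settings(spec):
    """Collect the user, group and capabilities a supervisor would apply."""
    names, flags = [], set()
    if spec.get("linux_capabilities"):
        names, flags = parse_cap_clause(spec["linux_capabilities"])
    return {
        "cmd": spec["cmd"],
        "user": spec.get("user"),    # None means: keep the supervisor's user
        "group": spec.get("group"),  # None means: keep the supervisor's group
        "capabilities": names,
        "cap_flags": flags,
    }
```

In Python, for instance, the resulting `user` and `group` values could be passed to `subprocess.Popen(..., user=..., group=...)` (available on POSIX since Python 3.9). Switching to another user this way requires the supervisor itself to be sufficiently privileged.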

Another option would be to recommend that the elastic-agent be run as an unprivileged user. I see the issue has a bullet point to update the documentation to recommend this; are there any blockers to updating the docs / references to recommend using a regular user?

elasticmachine · 2021-11-29T08:50:44Z

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

jsoriano · 2021-11-29T08:57:07Z

We seem to have decided to not manage the user/group for binaries that are run by the elastic-agent and have the beats themselves change their user/group and set capabilities.

I am not sure if there has been an active decision on this after this issue was opened. This is only the way it currently works.

I would like us to revisit that decision, ideally allowing beats to specify which user:group they would like to be run as, instead of requiring each individual beat to implement the logic that heartbeat currently has to change its user:group and optionally set specific capabilities.

+1 to this, it would be in line with my proposal in #147, where Elastic Agent controls the privileges based on info given in each collector spec. I don't think that an approach like this one has been discarded, only that it would need more work.

jlind23 · 2021-12-03T17:26:42Z

@ruflin seems to be a requirement to consider for the V2 design you are doing.

ruflin · 2021-12-06T07:56:11Z

@jlind23 I added a note to the design doc to dig into it.

eedugon · 2022-01-27T13:38:12Z

@jsoriano, please keep in mind that the current elastic-agent docker image (7.16.2) adds the elastic-agent user to the root group, and the main directory (elastic-agent) is owned by root:root with no permissions for other users.

In platforms like azure containers our image doesn't work at all because of security restrictions (elastic-agent user will NOT belong to root group hence it won't have permissions to see any of the content of the elastic-agent directory).

The following small change solves the problem:

FROM docker.elastic.co/beats/elastic-agent:7.16.2
USER root
RUN chown -R :elastic-agent /usr/share/elastic-agent
USER elastic-agent

The above just changes the group ownership of the elastic-agent directory and all its content to the elastic-agent group. Then, in the hypothetical case of the elastic-agent user not belonging to the root group, it will at least have access to the content of the directory to run the agent.

At the moment we are not running as root but adding the non-root user to root group, which looks weird.

jlind23 · 2022-01-27T14:52:04Z

@ph This is something we may consider to avoid having issues on cloud container solutions such as azure containers.

jsoriano · 2022-01-27T15:57:37Z

@eedugon these changes to add files and users to the root user group were done in the context of supporting OpenShift guidelines, you can read more about this in elastic/beats#12905 (reverted and reapplied in elastic/beats#18873).

If we change this to support Azure, we have to check that we keep supporting these OpenShift guidelines.

jlind23 · 2022-01-27T15:58:18Z

@blakerouse @ruflin what is your opinion here? Any particular path we should take?

ph · 2022-01-27T19:47:50Z

If I understand the guideline, making that change will be incompatible with openshift.

For an image to support running as an arbitrary user, directories and files that are written to by processes in the image must be owned by the root group and be read/writable by that group. Files to be executed must also have group execute permissions.

From: https://docs.openshift.com/container-platform/4.9/openshift_images/create-images.html
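The quoted rule can be checked mechanically. The sketch below is only an illustration of that rule, not an official OpenShift tool; the function name is hypothetical, and it takes the `st_gid` and `st_mode` values that `os.stat` returns for a path.

```python
import stat


def meets_arbitrary_uid_guideline(st_gid, st_mode, executable=False):
    """Check group ownership and group permission bits against the quoted rule.

    The path must be owned by the root group (gid 0) and be group
    readable and writable; executables also need the group execute bit.
    """
    if st_gid != 0:
        return False
    needed = stat.S_IRGRP | stat.S_IWGRP
    if executable:
        needed |= stat.S_IXGRP
    return (st_mode & needed) == needed
```

For a real path this would be called as `st = os.stat(path)` followed by `meets_arbitrary_uid_guideline(st.st_gid, st.st_mode)`. A directory whose group is changed from root to elastic-agent would fail the first check, which is the incompatibility described above.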

eedugon · 2022-01-27T19:54:44Z

Thanks Jaime! Root group membership isn’t that important; a directory ownership change alone would make the image work in Azure (although, depending on the OpenShift requirements, I don’t know if that would break OpenShift compatibility). What looks weird to me is trying to run our software with “non root” users but adding the user to the root group. Anyway, I’m totally OK with any decision you take here, but it would be great to list in the docs the container environments that we verify or support.

eedugon · 2022-01-27T20:04:35Z

You are right Pier-Hugues, sorry, I hadn’t read your post. So clearly my workaround goes against the OpenShift guideline, and the reason for the user to be in the “root” group is probably beyond my understanding. Sorry for the noise here! The OpenShift guideline also explains:

Because the container user is always a member of the root group, the container user can read and write these files.

I don’t know if that’s generic for Docker on Linux or just an OpenShift proposal; it just looked weird from a sysadmin and security point of view.

jsoriano · 2022-01-28T09:48:49Z

Because the container user is always a member of the root group, the
container user can read and write these files.

I don’t know if that’s generic on Linux dockers or it’s just an openshift proposal, just looked weird from sysadmin and security point of view.

Yes, this seems to be the case for containers started with Docker with arbitrary uids:

$ docker run -it --rm -u 1000 ubuntu:20.04 id
uid=1000 gid=0(root) groups=0(root)
$ docker run -it --rm -u 1000 alpine id
uid=1000 gid=0(root)

And yes, this effectively allows access to (mounted) host files with permissions for the root (0) group.

What I think OpenShift additionally does is use user namespacing, so that id 0 in the container maps to a random unprivileged user and group on the host. (Update, more info about this: https://cloud.redhat.com/blog/a-guide-to-openshift-and-uids, https://cookbook.openshift.org/users-and-role-based-access-control/why-do-my-applications-run-as-a-random-user-id.html)

ph · 2022-01-28T13:19:34Z

@jsoriano Is it correct to believe that we might need a different docker image for the Azure case?

jsoriano · 2022-01-28T14:37:28Z

@jsoriano Is it correct to believe that we might need a different docker image for the Azure case?

Yes, it may be that we need a specific image for Azure if their runtime is different enough. We would need to investigate a bit more.

jlind23 · 2022-02-04T13:14:01Z

@ph The first thing to do will be to get a single config running on both OpenShift and Azure containers. If that doesn't work, then we should consider shipping a specific Azure image, which I will definitely try to avoid.
This is something we should investigate in one of our coming releases.

nicpenning · 2023-05-22T03:18:44Z

Is this FR / issue still alive?

As a user, I would like to be able to set which user context each integration executes as.

For example, we can run Filebeat today as a service on Windows with a specific user to access files and folders that cannot be accessed by system. This is a slight blocker for us to migrate a few different integrations.

A workaround is deploying an agent locally to said systems, but we would prefer to use network mapped drives (even though discouraged, this works very well) to reduce overhead on the servers themselves and have fewer agents to manage.

Also, it's best to have reduced permissions anyways, especially when you are simply reading log files and forwarding them on to another resource.

Please do let me know if this concept is worth considering here or a new issue/FR makes sense.

Thanks!

jlind23 · 2024-05-27T13:30:02Z

Elastic Agent can now be run as non-root on Linux, Mac and Windows, hence closing this as done.
cc @ycombinator @nimarezainia

andrewvc added the enhancement New feature or request label Sep 14, 2021

andrewvc mentioned this issue Sep 14, 2021

[elastic-agent] Evaluate whether root is the correct user to run the elastic-agent docker image as elastic/beats#27648

Closed

blakerouse added the Team:Elastic-Agent Label for the Agent team label Sep 14, 2021

simitt mentioned this issue Sep 14, 2021

[Elastic Agent] Drop privileges when managed by Elastic Agent elastic/apm-server#4571

Closed

jsoriano added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Nov 29, 2021

jlind23 added the V2-Architecture label Dec 3, 2021

jlind23 added the 8.3-candidate label Feb 4, 2022

jlind23 mentioned this issue Mar 7, 2022

Update Elastic Agent docker image for azure containers runtime #82

Closed

jlind23 transferred this issue from elastic/beats Mar 7, 2022

jlind23 added the discuss label Mar 16, 2022

jlind23 changed the title ~~[Agent] Minimizing Elastic-Agent privileges~~ [Design][Agent] Minimizing Elastic-Agent privileges Mar 16, 2022

jlind23 changed the title ~~[Design][Agent] Minimizing Elastic-Agent privileges~~ [DESIGN][Agent] Minimizing Elastic-Agent privileges Mar 16, 2022

jlind23 added v8.3.0 and removed 8.3-candidate labels Mar 23, 2022

jlind23 closed this as completed May 27, 2024
