-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Heartbeat] Enable Heartbeat to run when elastic-agent container is executed as root #30869
Conversation
This pull request is now in conflicts. Could you fix it? 🙏
|
Pinging @elastic/uptime (Team:Uptime) |
888f3ba
to
c92ed60
Compare
@@ -85,7 +85,7 @@ func (paths *Path) InitPaths(cfg *Path) error { | |||
} | |||
|
|||
// make sure the data path exists | |||
err = os.MkdirAll(paths.Data, 0750) | |||
err = os.MkdirAll(paths.Data, 0770) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with the directory/paths/permissions code yet, but my only thought is if it is possible to be more conservative with the permissions changes since this only affects heartbeat (as far as I know).
Should we make this change (and the umask change in the Dockerfile above) apply only to heartbeat, instead of every beat as it does now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would also prefer a more localized change for heartbeat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the issue fixed here applies to heartbeat
running inside a elastic-agent
container.
Path initialization is peformed by the first beat that kicks off inside elastic-agent
. AFAIK, there's no way to specify a order. In that case, the alternative would be to move path initialization inside elastic-agent
, before beats are spawned.
As for the entrypoint, I've followed the guideline here: #29708. As it stands, all umask
operations are performed outside libbeat code. With this, we can restrict the change to docker containers without affecting other type of installs (systemd and others). These install methods still have the umask specified in the service configuration template (0027) and will continue to create files with the same permission level as it is today (0750).
There's also to note, some of the directories that are being generated from inside elastic-agent
today are attempted with a higher permission level than the default mask allows. Here are a few examples:
$ grep mkdirat /tmp/agent.* | grep "state/data" | grep default\"
/tmp/agent.379:mkdirat(AT_FDCWD, "/usr/share/elastic-agent/state/data/logs/default", 0775) = 0
/tmp/agent.379:mkdirat(AT_FDCWD, "/usr/share/elastic-agent/state/data/tmp/default", 0775) = 0
If this permission level can be considered a security concern, I'm guessing we shouldn't be relying on an environmental umask
to filer them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! I don't have major concerns about this change.
@rdner thoughts on this? You were the last person to touch anything to do with umask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm that this particular line should not affect other kinds of distribution since they all have their umask
set in their respective scripts which would overwrite the permission set for this directory.
However, I have some concerns that we now have a different umask
for all beats running under elastic-agent
because of the change in the Docker template. If that's okay for everyone I'm okay with that too.
The whole umask
story started with this security issue which was primarily about the access by all
, I guess giving more access to the group
is fine #14005
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
LGTM, someone from the control plane team (@blakerouse?) should approve this though as I think this affects them the most. |
@Mergifyio update |
✅ Branch has been successfully updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, tested locally. I think we have unrelated E2E test failures however
if localUserName := os.Getenv("BEAT_SETUID_AS"); localUserName != "" && syscall.Geteuid() == 0 { | ||
sysInfo, err := sysinfo.Host() | ||
isContainer := false | ||
if err == nil && sysInfo.Info().Containerized != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just thinking about the distinction between this running in a container and running in our container. I think this check is fine given that we also check BEAT_SETUID_AS
which wouldn't be in a custom container most likely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur, this check does not eliminate completely the risk of misusing this feature.
@Mergifyio update |
✅ Branch has been successfully updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the change I am ok with that change @blakerouse Any concerns on your side?
@emilioalvap looks like the new linter we installed is now complaining about some stuff:
|
This pull request is now in conflicts. Could you fix it? 🙏
|
30616e5
to
3939ad0
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
0ab1266
to
d51aaa1
Compare
/package |
1 similar comment
/package |
/test |
@Mergifyio update |
✅ Branch has been successfully updated |
This pull request is now in conflicts. Could you fix it? 🙏
|
… docker containers and restrict heartbeat setuid to containerized instances
572ad02
to
a46ae2d
Compare
/test |
@Mergifyio update |
☑️ Nothing to do
|
/package |
/test |
…xecuted as root (#30869) (#31406) * Include group write permission in runtime directories, adapt umask on docker containers and restrict heartbeat setuid to containerized instances * Add CHANGELOG entries * Fix linter issues (cherry picked from commit 8906734) Co-authored-by: Emilio Alvarez Piñeiro <[email protected]>
…xecuted as root (elastic#30869) * Include group write permission in runtime directories, adapt umask on docker containers and restrict heartbeat setuid to containerized instances * Add CHANGELOG entries * Fix linter issues
…xecuted as root (#30869) * Include group write permission in runtime directories, adapt umask on docker containers and restrict heartbeat setuid to containerized instances * Add CHANGELOG entries * Fix linter issues
Since it's a bugfix, I'm including
elastic-agent
changes to be backported to8.2
and I will create a separate PR onelastic-agent
repo for8.3
.What does this PR do?
This PR:
umask
on docker containers,setuid
to containerized instances.This PR works in tandem with elastic/elastic-agent#202.
Why is it important?
Heartbeat comes with
setuid
functionality to be able to runnpm
as a non-root user during synthetics checks inside anelastic-agent
container. Without it,npm
would complain when executed as root.When
elastic-agent
is executed as root, all beats (filebeat, metricbeat, ...) run asroot
andheartbeat
will run as user specified onBEAT_SETUID_AS
env variable,elastic-agent
by default. This user needs permissions to write to local directories and we enable that by making the user belong toroot
group. But some of the directories that the user need access to, that are created during runtime, do not allow for group write permission, meaningheartbeat
won't be able to start and it will eventually report as degraded:These are the directories:
Most of our own documented examples to run
heartbeat
andelastic-agent
in k8s are configured to run asroot
, check here and here for reference.Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.How to test this PR locally
root
:Related issues
Screenshots