-
Notifications
You must be signed in to change notification settings - Fork 241
Conversation
Docker mounts cgroup stats in the wrong place (i.e., inconsistently with the hierarchy in /proc/1/cgroup). This breaks the logic for traversing from /proc/self/cgroup to the stats for a process running inside a container. In core Elasticsearch, an undocumented hack was added to support this, for now. Thus, for cgroup stats to be available from Elasticsearch running inside a Docker container, the Elasticsearch process should be started with `-Des.cgroups.hierarchy.override=/`. This commit adds support for this hack.
Thank you @jasontedor for this PR! I added one comment. I'd also like to add a test once we have this in the image, what would be the right way of testing if the option works as expected? For the record, I did some reading to familiarize myself with the issue and found historical info in this docker issue as well as in: https://fabiokung.com/2014/03/13/memory-inside-linux-containers/). There also seems to be an ongoing effort for a library that standardizes how to retrieve info normally found under |
The best way to test this is to start a container against a build of Elasticsearch that contains the hack, and submit the request I tested this in the following way. I built a package of Elasticsearch containing the hack. I started a web server in the directory containing the package with $ sudo docker build --no-cache -t elasticsearch-cgroups --build-arg ES_DOWNLOAD_URL=http://192.168.1.232:8000 --build-arg ELASTIC_VERSION=6.0.0-alpha1-SNAPSHOT . where the web server started above is listening on 192.168.1.232. I started an image with: $ sudo docker run -p 9200:9200 -e "http.host=0.0.0.0" -e "transport.host=127.0.0.1" elasticsearch-cgroups and all was right in the world.
Additional relevant issues are opencontainers/runtime-spec#65, opencontainers/runtime-spec#66, opencontainers/runtime-spec#397, and opencontainers/runc#781 (note that as of now, the last one is still open). |
I don't see a comment on the PR? |
@jasontedor Ah sorry didn't mention your handle: 67fd123#r97553765 |
@@ -22,4 +22,6 @@ do | |||
fi | |||
done < <(env) | |||
|
|||
echo "-Des.cgroups.hierarchy.override=/" >> config/jvm.options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it is better to do this fix in this file (which is the CMD
) or in the Dockerfile
. There are + and - for both approaches:
As shown above, though the CMD (bin/es-docker) script
As this is always invoked when starting the container (unless the user overrides our CMD
) there is a corner case that if a user ships his own jvm.options
via a bind mount (or in the Dockerfile, while building his own docker image by forking ours) there could be a duplicate entry or generally the user get surprised where did that entry come from.
On the other hand having the echo
line in bin/es-docker
will always make sure that the parameter is present, but we could ensure it's done only if needed with a combo of grep + echo.
Via the Dockerfile
If we just add the above echo command as a RUN line in the Dockerfile
(after extracting the tarball) we ensure that things are correct in the image-shipped jvm.options
file and also no unexpected surprises if the user decides to bind mount jvm.options or create his own image.
Downside of this approach is that it doesn't "enforce" the option for the user, in case he opts to use his own config file.
I'd like to hear your opinion on this. FWIW I am usually leaning in favor of solutions that don't try too hard to be clever and invasive so I think I'd favor doing it in the Dockerfile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dliappis Given your thoughts here, which I agree with, I wonder what you think of:
diff --git a/build/elasticsearch/bin/es-docker b/build/elasticsearch/bin/es-docker
index b50046e..de544fd 100755
--- a/build/elasticsearch/bin/es-docker
+++ b/build/elasticsearch/bin/es-docker
@@ -22,6 +22,6 @@ do
fi
done < <(env)
-echo "-Des.cgroups.hierarchy.override=/" >> config/jvm.options
+export ES_JAVA_OPTS="-Des.cgroups.hierarchy.override=/ $ES_JAVA_OPTS"
exec bin/elasticsearch ${es_opts}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this much more. Mutating that file could be really nasty, since it might be owned by the user and living on some persistent storage that they control.
If we can apply this fix via an environment variable, is there any reason not to do so with an ENV
declaration in the Dockerfile? I think that's the most transparent and obvious place for future readers to understand what's happening.
Wherever we do it, let's also have a comment linking back to this thread, since it's a pretty subtle thing that's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong please, but I don't think we can do it in the Dockerfile since that bakes the value of ES_JAVA_OPTS into the image but users can override this externally when running the image.
I will push a comment explaining the situation later this morning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong please, but I don't think we can do it in the Dockerfile since that bakes the value of ES_JAVA_OPTS into the image but users can override this externally when running the image.
Yes that is correct and in the website docs docker-compose.yml example we are showing how to use ES_JAVA_OPTS
as one method of defining the jvm heap size.
@jasontedor Just so that I understand better this solution, basically es.cgroups.hierarchy.override
can be defined as a jvm option, as documented here hence will be honored when presented through ES_JAVA_OPTS
?
I am asking this as my understanding is that most parameters defined in jvm.options
can also be passed as -E
style cli args. If that is true, the user can run the image using docker run ... -e es.cgroups.hierarchy.override=/
and so it could be the value here instead of the existing es_opts=''
?
(Note that I tried the last point with 6.0.0-alpha1-SNAPSHOT and received:
uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalArgumentException: unknown setting [es.cgroups.hierarchy.override] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so that I understand better this solution, basically
es.cgroups.hierarchy.override
can be defined as a jvm option, as documented here hence will be honored when presented throughES_JAVA_OPTS
?
Yes.
I am asking this as my understanding is that most parameters defined in
jvm.options
can also be passed as-E
style cli args. If that is true, the user can run the image usingdocker run ... -e es.cgroups.hierarchy.override=/
and so it could be the value here instead of the existinges_opts=''
?
This part is not right, ES_JAVA_OPTS
and jvm.options
control JVM flags, so options in ES_JAVA_OPTS
can also be set in jvm.options
and vice-versa. The -E
CLI flags to Elasticsearch are Elasticsearch settings and can also be set in elasticsearch.yml
.
(Note that I tried the last point with 6.0.0-alpha1-SNAPSHOT and received:
uncaught exception in thread [main] org.elasticsearch.bootstrap.StartupException: java.lang.IllegalArgumentException: unknown >setting [es.cgroups.hierarchy.override] please check that any required plugins are installed, or >check the breaking changes documentation for removed settings
)
This is expected since es.cgroups.hierarchy.override
is not an Elasticsearch setting, we are reading it as JVM system property which is set via the JVM flag -Des.cgroups.hierarchy.override=/
; this can be done via jvm.options
or ES_JAVA_OPTS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification @jasontedor !
@jasontedor Thanks for the info about the endpoint |
@Jarpy I think I understand better the issue: From reading the initial comment I was under the impression that the issue is an inconsistency between Digging a bit more, I guess I didn't understand that the real issue is that the
|
Correct, sorry for not being clear. |
So that I could build a Docker image from an Elasticsearch distribution that I assembled locally. The Dockerfile does this:
so setting ES_DOWNLOAD_URL to point to the web server lets the Docker build just pull the package (and the sha1 file) from the package I just built. |
jenkins, test it |
I've tested this manually for now as we can't test against master as there are no published artifacts. (I will need to adjust the building framework for this to work.) |
Docker mounts cgroup stats inconsistently with the hierarchy in /proc/self/cgroup. This breaks the logic for traversing from /proc/self/cgroup to the stats for a process running inside a container. In core Elasticsearch, an undocumented hack was added to support this, for now. Thus, for cgroup stats to be available from Elasticsearch running inside a Docker container, the Elasticsearch process should be started with `-Des.cgroups.hierarchy.override=/`. Add hack for obtaining cgroup stats and comment it. Relates #25 Relates elastic/elasticsearch#22757
Docker mounts cgroup stats inconsistently with the hierarchy in /proc/self/cgroup. This breaks the logic for traversing from /proc/self/cgroup to the stats for a process running inside a container. In core Elasticsearch, an undocumented hack was added to support this, for now. Thus, for cgroup stats to be available from Elasticsearch running inside a Docker container, the Elasticsearch process should be started with `-Des.cgroups.hierarchy.override=/`. Add hack for obtaining cgroup stats and comment it. Relates #25 Relates elastic/elasticsearch#22757
Docker mounts cgroup stats in the wrong place (i.e., inconsistently with the hierarchy in /proc/1/cgroup). This breaks the logic for traversing from /proc/self/cgroup to the stats for a process running inside a container. In core Elasticsearch, an undocumented hack was added to support this, for now. Thus, for cgroup stats to be available from Elasticsearch running inside a Docker container, the Elasticsearch process should be started with
-Des.cgroups.hierarchy.override=/
. This commit adds support for this hack.Relates elastic/elasticsearch#22757