glob(["**"]) with cyclic symlinks can OOM without a useful error message #10783
Comments
I can reliably reproduce this with the following script, which doesn't require the entire ceph source tree.
Very brief comment: Bazel normally handles cases like this directly and elegantly (explained in my BazelCon 2019 Lightning Talk https://youtu.be/EoYdWmMcqDs)... except for "legacy globbing" (#10610 (comment)), which is where the issue here is happening.
Seems like this is a potential vulnerability for any production query environment? Giving to @haxorz for triage. Another team member might be interested, will ping them.
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team ( |
This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team ( |
Description of the problem / feature request:
Attempting to glob a directory that contains cyclic symbolic links normally produces a reasonably explanatory error (as in #133), which would let the developer trying to do so know that they should fix their input to not contain symbolic links.
I encountered a scenario where glob(["**"]) on a directory with cyclic symlinks would instead cause Bazel to appear to hang and eventually terminate with an uninformative OutOfMemoryError, like the following:
I wouldn't expect Bazel to support constructing globs over directories containing cyclic symlinks, but this error message is rather confusing. I was only able to work out what was happening (i.e. that the crash came from cyclic symlinks in the source tarball of the external dependency that I was trying to build) by using a memory analyzer on the crash dump and reading over the Bazel source manually.
I suggest that Bazel should explicitly check for cyclic symlinks, and exit with a clear error message if a loop is detected.
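To make the suggestion concrete, here is a minimal Python sketch of one way a recursive walk could detect a symlink cycle and fail with a clear message. It is not Bazel's code and does not reflect how Bazel's globbing is implemented; the names walk_checked and SymlinkCycleError are hypothetical, and the sketch assumes a POSIX filesystem.

```python
import os
import tempfile


class SymlinkCycleError(Exception):
    """Hypothetical error: the walk re-entered a directory already on its path."""


def walk_checked(root):
    """Yield every non-directory path under root, following symlinks.

    Raises SymlinkCycleError as soon as a directory resolves to one of its
    own ancestors in the current traversal, instead of recursing forever.
    """
    def visit(path, ancestors):
        real = os.path.realpath(path)
        if real in ancestors:
            raise SymlinkCycleError(
                f"symlink cycle: {path} resolves to ancestor {real}")
        for entry in sorted(os.listdir(path)):
            child = os.path.join(path, entry)
            if os.path.isdir(child):  # isdir follows symlinks
                # Only ancestors are tracked, so two distinct symlinks to the
                # same (non-ancestor) directory are still allowed.
                yield from visit(child, ancestors | {real})
            else:
                yield child

    yield from visit(root, frozenset())
```

For example, with a directory a containing a file and a symlink loop pointing at "..", list(walk_checked(root)) raises SymlinkCycleError naming a/loop rather than consuming memory until it is exhausted.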
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Put this in a WORKSPACE file:
Then run this:
After running for a long time, I eventually get the error shown previously.
EDIT: @cryslith's comment provides an easier way to reproduce this, which works on more machines (because it sets a smaller memory limit): #10783 (comment)
This can also be reproduced by unpacking the Ceph tarball under an empty WORKSPACE and defining a filegroup in a BUILD file. Demonstrating that this is specifically due to the cyclic symlinks can be done by deleting all .qa symlinks and trying again. (The Ceph tarball also has other issues that prevent its use in a filegroup, but Bazel appears to be able to identify those accurately and report useful errors for them.)
What operating system are you running Bazel on?
I'm using a debian buster chroot.
What's the output of bazel info release?
Have you found anything relevant by searching the web?
I found issues #133, #1293, #2927, and #6350, which are about other issues with globbing and symlinks, but none of them directly address this problem.
Any other information, logs, or outputs that you want to share?
Two relevant screenshots from running Eclipse Memory Analyzer on the .hprof dump:
These show that the problem leading up to the OOM crash is that a very large number of entries are populated into com.google.devtools.build.lib.vfs.UnixGlob$GlobVisitor.results, due to paths that cycle through the .qa symlinks contained in the Ceph source tarball.