Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable docker sandbox #4649

Merged
merged 18 commits into from
Feb 17, 2023
Merged

Disable docker sandbox #4649

merged 18 commits into from
Feb 17, 2023

Conversation

stgpetrovic
Copy link
Collaborator

@stgpetrovic stgpetrovic commented Feb 17, 2023

Revert to local builds on cloud vs sandboxed (shm size issues).

There are a couple of issues there that intermingle.

Bazel has a bug where the cppmap files don't get included as deps when building with clang when not in sandbox, from an external repository (local repository is OK). Using a cache might alleviate this.
The suggested solution is building in the sandbox.

For some reason, docker build --shm-size=16g does not prduce a 16g /dev/shm (it is actually 64M which is the default). There are differences in docker build/run but during build it should have 16g but it does not (as checked with RUN df -h showing 64M).

So the solution for this issue is to build sandboxed (without cache, unlike the other build that has cache and does not suffer from this; we could replicate that in tpu tests too), and in docker, but to run with RUN --mount=type=tmpfs,target=/dev/shm,rw cmd which then mounts tmfps as /dev/shm and builds properly. The build needs about 4G of temp files so 16G is an overkill there.

This makes the build work (avoid bazel bug by sandboxing), and makes it fast (avoid sandbox base being on HDD, and avoid docker shm-size issue by mounting the tmpfs).

We can rework this as needed in the future when we streamline all different builds into a single config.

@stgpetrovic stgpetrovic force-pushed the disable-docker-sandbox branch 4 times, most recently from 2db515b to fa0cd25 Compare February 17, 2023 13:57
Copy link
Collaborator

@will-cromar will-cromar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

Copy link
Contributor

@yeounoh yeounoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@stgpetrovic stgpetrovic merged commit 3f091bd into master Feb 17, 2023
@stgpetrovic stgpetrovic deleted the disable-docker-sandbox branch February 22, 2023 17:00
mateuszlewko pushed a commit that referenced this pull request Mar 15, 2023
* Revert "Disable the Sandbox build in Docker (#4515)"

This reverts commit 661ab6d.

* Remove submodule step from TPU CI

* Update cloudbuild.yaml

* Remove separate TF build step

* Add bazel WORKSPACE and rc

* Update Dockerfile

* throw more compute at the TPU CI build

* more bazel jobs

* Revert to old machine type and default to local builds.

* Revert to default nonsandbox

* Remove jobs limitation

* Try smaller shm

* Fix build of llvm in sandbox

* Mount 7g of ram properly

* TEMP remove pytorch build

* remove install of packages

* confirm shm creation

* Revert machine type

---------

Co-authored-by: Will Cromar <[email protected]>
Co-authored-by: Will Cromar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants