-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tools/docker/syzbot: follow best practices #5375
base: master
Are you sure you want to change the base?
Conversation
tarasmadan
commented
Oct 8, 2024
•
edited
Loading
edited
- Remove duplicates.
- Prepend apt-get install by apt-get update. (https://docs.docker.com/build/building/best-practices/#apt-get)
- Less RUN calls - less layers.
a6c1fb4
to
001f971
Compare
tools/docker/syzbot/Dockerfile
Outdated
RUN sudo update-alternatives --install /usr/bin/llvm-objcopy llvm-objcopy /usr/bin/llvm-objcopy-15 100 | ||
RUN sudo update-alternatives --install /usr/bin/llvm-objdump llvm-objdump /usr/bin/llvm-objdump-15 100 | ||
RUN sudo update-alternatives --install /usr/bin/llvm-addr2line llvm-addr2line /usr/bin/llvm-addr2line-15 100 | ||
RUN apt-get update --allow-releaseinfo-change && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my experience building containers, it's actually better to have each install on a separate RUN line b/c docker build caching.
If we have them all in one command, and if one package in the middle gets rotten, or we need to add a new one, all caching is invalidating.
I don't see any downsides of separate RUN lines (process start overhead is negligible in this case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.docker.com/build/building/best-practices/#apt-get :
"Always combine RUN apt-get update with apt-get install in the same RUN statement."
The motivation - every time we modify apt-get install lines, we want to run apt-get update.
Caching the apt-get update result we later use old versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading the best practice correctly it says: this allows you to install latest packages when you re-build containers.
But what they suggest both makes things worse and does not achieve what it claims to achieve.
If we are re-building the containers after a year, and change only the command line prompt at the end (or something else besides apt-get), we also want latest packages, but this trick won't work, caching will kick in.
The way it makes things significantly worse: if I have to build the container 10 times within few hours to figure out, which packages got broken and which I want to install, I do want caching, and, no, I don't want getting latest versions each time.
What will work much better, and will actually achieve installing latest packages is docker build --no-cache
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel some inconsistency here:
- "apt-get install -y -q --no-install-recommends llvm-15 clang-15 clang-format-15 clang-tidy-15 lld-15" installs 5 packages in one RUN. All seems to be connected (llvm+clang).
- Lines 37-46 are connected the same way (llvm+clang) but were splitted in 10 RUN calls.
I understand your concern. But in this case we're installing the strongly connected components. I don't think it makes sense to make a 10 levels deep cache here.
docker build --no-cache
. Yes. And its quite sad we're not using github actions to autobuild it...
I hope it'll be the syz-ci refactoring scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between installing several packages with one apt-get command and multiple commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From chatGPT:
# Copyright 2021 syzkaller project authors. All rights reserved.
# Use of this source code is governed by Apache 2 LICENSE that can be found in the LICENSE file.
# See /tools/docker/README.md for details.
FROM debian:bookworm
# Set non-interactive mode for apt-get
ARG DEBIAN_FRONTEND=noninteractive
# Update and install essential packages in one layer to reduce image size
RUN apt-get update && apt-get install -y -q --no-install-recommends \
# Build essentials:
gcc g++ binutils make ccache \
# Common utilities:
unzip curl sudo procps psmisc nano vim git bzip2 dh-autoreconf software-properties-common \
# Kernel build dependencies:
flex bison bc gawk dwarves cpio texinfo texi2html lzop lbzip2 \
zlib1g-dev libelf-dev libncurses-dev libmpc-dev libssl-dev \
# Cuttlefish and gVisor dependencies:
rsync libarchive-tools \
# Cross-compilation tools:
crossbuild-essential-amd64 crossbuild-essential-arm64 libbpf-dev \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Conditional installation for cross-compilation
RUN if [ "$(uname -m)" = "x86_64" ]; then \
apt-get update && apt-get install -y -q --no-install-recommends \
libc6-dev-i386 libc6-dev-i386-amd64-cross lib32gcc-12-dev lib32stdc++-12-dev \
g++-arm-linux-gnueabi g++-aarch64-linux-gnu g++-powerpc64le-linux-gnu \
g++-mips64el-linux-gnuabi64 g++-s390x-linux-gnu g++-riscv64-linux-gnu; \
fi && apt-get clean && rm -rf /var/lib/apt/lists/*
# Install Go
RUN curl -sSL "https://dl.google.com/go/go1.22.7.linux-$(uname -m | sed 's/aarch64/arm64/;s/x86_64/amd64/').tar.gz" | tar -C /usr/local -xz \
&& echo "export PATH=/usr/local/go/bin:\$PATH" >> /etc/profile
# Install LLVM and Clang
RUN curl -sSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - \
&& add-apt-repository "deb http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-15 main" \
&& apt-get update --allow-releaseinfo-change && apt-get install -y -q --no-install-recommends \
llvm-15 clang-15 clang-format-15 clang-tidy-15 lld-15 \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Set up alternatives for clang
RUN update-alternatives --install /usr/bin/clang clang /usr/bin/clang-15 100 \
&& update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-15 100 \
&& update-alternatives --install /usr/bin/clang-tidy clang-tidy /usr/bin/clang-tidy-15 100 \
&& update-alternatives --install /usr/bin/clang-format clang-format /usr/bin/clang-format-15 100 \
&& update-alternatives --install /usr/bin/ld.lld ld.lld /usr/bin/lld-15 100 \
&& update-alternatives --install /usr/bin/llvm-nm llvm-nm /usr/bin/llvm-nm-15 100 \
&& update-alternatives --install /usr/bin/llvm-ar llvm-ar /usr/bin/llvm-ar-15 100 \
&& update-alternatives --install /usr/bin/llvm-objcopy llvm-objcopy /usr/bin/llvm-objcopy-15 100 \
&& update-alternatives --install /usr/bin/llvm-objdump llvm-objdump /usr/bin/llvm-objdump-15 100 \
&& update-alternatives --install /usr/bin/llvm-addr2line llvm-addr2line /usr/bin/llvm-addr2line-15 100
# Create a symlink for Python
RUN mkdir -p /usr/grte/v5/bin && ln -s /usr/bin/python3 /usr/grte/v5/bin/python2.7
# Install Bazel
RUN curl -o /usr/local/bin/bazel -sSL "https://releases.bazel.build/7.1.2/release/bazel-7.1.2-linux-$(uname -m | sed 's/aarch64/arm64/')" \
&& chmod ugo+x /usr/local/bin/bazel
# Install QEMU from backports
RUN add-apt-repository "deb http://deb.debian.org/debian bookworm-backports main" \
&& apt-get update && apt-get install -t bookworm-backports -y -q --no-install-recommends \
qemu-user \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Conditional installation for QEMU system packages
RUN if [ "$(uname -m)" = "x86_64" ]; then \
apt-get install -t bookworm-backports -y -q --no-install-recommends \
qemu-utils qemu-system-misc qemu-system-x86 qemu-system-arm qemu-system-aarch64 \
qemu-system-s390x qemu-system-mips qemu-system-ppc; \
fi && apt-get clean && rm -rf /var/lib/apt/lists/*
# Create syzkaller user
RUN useradd --create-home syzkaller
# Set custom prompt for root
RUN echo "export PS1='\n\W🤖 '" >> /root/.bashrc
# Copy the command script
COPY run-syz-command.sh /run-syz-command.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to the docker layers but perhaps, we could've used stable LLVM versions that kernel doc suggests? I've found that it's the most stable way rather than installing from the distro's packages.
From https://docs.kernel.org/kbuild/llvm.html:
We provide prebuilt stable versions of LLVM on kernel.org. These have been optimized with profile data for building Linux kernels, which should improve kernel build times relative to other distributions of LLVM.
So we could've done something like this:
RUN wget -O- https://mirrors.edge.kernel.org/pub/tools/llvm/files/llvm-15.0.7-x86_64.tar.gz | tar -xzf - -C /usr/bin/
and the most of llvm, clang dpkg packages will not be required in Dockerfile, thus less layers/caching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to create issue and invite @ramosian-glider + @a-nogikh to discuss it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this archive contains the most of llvm utils, also llvm-readelf, llvm-strip
mentioned in this PR #5367:
user@work:~/opt/llvm-14.0.6-x86_64$ ls -l
total 12
drwxr-xr-x 2 user user 4096 Oct 10 21:34 bin
drwxr-xr-x 3 user user 4096 Apr 26 23:24 include
drwxr-xr-x 3 user user 4096 Apr 26 23:24 lib
user@work:~/opt/llvm-14.0.6-x86_64$ cd bin
user@work:~/opt/llvm-14.0.6-x86_64/bin$ ls -l
total 251664
lrwxrwxrwx 1 user user 8 Apr 26 23:24 clang -> clang-14
lrwxrwxrwx 1 user user 5 Apr 26 23:24 clang++ -> clang
-rwxr-xr-x 1 user user 100523848 Apr 26 23:20 clang-14
lrwxrwxrwx 1 user user 5 Apr 26 23:24 clang-cl -> clang
lrwxrwxrwx 1 user user 5 Apr 26 23:24 clang-cpp -> clang
lrwxrwxrwx 1 user user 3 Apr 26 23:24 ld64.lld -> lld
lrwxrwxrwx 1 user user 3 Apr 26 23:24 ld.lld -> lld
-rwxr-xr-x 1 user user 78376208 Apr 26 23:20 lld
lrwxrwxrwx 1 user user 3 Apr 26 23:24 lld-link -> lld
lrwxrwxrwx 1 user user 15 Apr 26 23:24 llvm-addr2line -> llvm-symbolizer
-rwxr-xr-x 1 user user 15696696 Apr 26 23:20 llvm-ar
-rwxr-xr-x 1 user user 13322072 Apr 26 23:20 llvm-dwarfdump
-rwxr-xr-x 1 user user 16128872 Apr 26 23:20 llvm-nm
-rwxr-xr-x 1 user user 4450536 Apr 26 23:20 llvm-objcopy
-rwxr-xr-x 1 user user 16973168 Apr 26 23:20 llvm-objdump
lrwxrwxrwx 1 user user 7 Apr 26 23:24 llvm-ranlib -> llvm-ar
lrwxrwxrwx 1 user user 12 Apr 26 23:24 llvm-readelf -> llvm-readobj
-rwxr-xr-x 1 user user 6598288 Apr 26 23:21 llvm-readobj
-rwxr-xr-x 1 user user 319464 Apr 26 23:20 llvm-strings
lrwxrwxrwx 1 user user 12 Apr 26 23:24 llvm-strip -> llvm-objcopy
-rwxr-xr-x 1 user user 5293248 Apr 26 23:20 llvm-symbolizer
lrwxrwxrwx 1 user user 3 Apr 26 23:24 wasm-ld -> lld
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You messages will be lost here. Please copy your proposal to the new issue.
It is better to make changes more focused to increase the landing probability.
Adding more ideas here we postpone this PR landing...
001f971
to
58e3080
Compare
1. Remove duplicates. 2. Prepend apt-get install by apt-get update. (https://docs.docker.com/build/building/best-practices/#apt-get) 3. Less RUN calls - less layers.
58e3080
to
b6673b1
Compare