-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thinner docker image #1598
Thinner docker image #1598
Conversation
This figure shows that the size of new docker image based on this PR is far less than the current image "0.10.0rc1-cpu". |
paddle/scripts/docker/buildall.sh
Outdated
|
||
function build_in_docker() { | ||
docker build . -t paddle-build-env -f paddle/scripts/docker/buildimage/Dockerfile.build | ||
BUILDER=$(docker run -d paddle-build-env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里用docker run -d --rm --name paddle-builder paddle-build-env
吧,也不需要$BUILDER
和后面的docker rm
了
赞👍,有办法能让这个方式(docker build两次)在dockerhub跑起来吗?如果不行的话我来把我们的CI服务器设置一下专门搞这个吧。感觉瘦身版的docker image非常有必要。 |
paddle/scripts/docker/buildall.sh
Outdated
BUILDER=$(docker run -d paddle-build-env) | ||
docker exec $BUILDER /bin/bash -c "export BUILD_AND_INSTALL=ON && /paddle/paddle/scripts/docker/build.sh" | ||
mkdir -p $BINARIES_DIR | ||
docker cp $BUILDER:/usr/local/opt/paddle/bin/paddle_pserver_main $BINARIES_DIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个PR看上去和我在 #1599 (comment) 描述的还是不一样。我那个方法不需要 docker cp,因为源码目录 paddle
以及其中的子目录paddle/build
都在host上,是通过 docker run -v $PWD:/paddle paddlepaddle/paddle:dev
中的 -v
参数分享给 development container 的,所以二进制文件自然就在 paddle/build
子目录里了。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
转到#1602 |
paddle/scripts/docker/buildall.sh
Outdated
docker cp $BUILDER:/usr/local/bin/paddle $BINARIES_DIR | ||
docker cp $BUILDER:/usr/local/opt/paddle/bin/paddle_usage $BINARIES_DIR | ||
|
||
docker cp $BUILDER:/usr/local/opt/paddle/share/wheels $BINARIES_DIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我记得Paddle的CMake是可以pack出来deb包的。如果想要编译出来一个可以被其他镜像用的二进制包,不如直接把编译的时候打包成deb,再把deb拷贝出来。这样做的好处是:
- 不用写死拷贝的路径了。
- 修改CMake文件来增加新二进制文件的时候,不用修改dockerfile了。
打包deb的代码好久没动了,具体的可以参考这里
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
ENV HOME /root | ||
|
||
RUN sed 's@http:\/\/archive.ubuntu.com\/ubuntu\/@mirror:\/\/mirrors.ubuntu.com\/mirrors.txt@' -i /etc/apt/sources.list && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
替换成 mirror://mirrors.ubuntu.com/mirrors.txt,会导致ubuntu的一些PPA报 hashsum mismatched 错误。主要原因是,有一些mirror的同步周期比较长,导致PPA里面依靠的软件包还没有出现在mirror里面。
感觉默认情况下用ubuntu的官方源是靠谱的。之前有一个编译参数是 UBUNTU_MIRROR
。如果需要使用mirror,就在编译Paddle的时候,添加编译参数即可(参考这个文件的第六行第七行)。
docker build --build-arg UBUNTU_MIRROR=mirror://mirrors.ubuntu.com/mirrors.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我测试的情况看,造成hashsum mismatched 错误的原因应该是GFW的问题。添加新的mirrors之后,这种情况基本不再出现。同时,update的时候应该会检测最近的源在哪里。所以还是建议把mirrors加到里边。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我的建议也是在代码中使用Ubuntu的官方源,可以在文档中说明替换源的方法。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my understand, if we are going to implement the design, we shouldn't add any more Dockefiles in this PR. Instead, we should have only one Dockerfile.dev, which builds the development environment.
paddle/scripts/docker/buildall.sh
Outdated
} | ||
|
||
function build_paddle_core() { | ||
docker build . -t paddle-core:$VERSION -f paddle/scripts/docker/paddle-core/Dockerfile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reyoung and I worked hard and reduced the number of Docker images. Why we'd need this many Docker images here? That is core
for? This differs from the two images as described in our design doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will rename "paddle-core" to paddle. It is the production image
.gitignore
Outdated
@@ -1,5 +1,7 @@ | |||
*.DS_Store | |||
build/ | |||
build/paddle/math |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git必须忽略整个build目录吧。
.dockerignore 也应该忽略整个build目录,因为实际上使用的是host上的build目录。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
改错了。现在使用deb包安装,所以可以忽略build目录了。
cd build | ||
cmake .. -DWITH_GPU=OFF -DWITH_SWIG_PY=ON -DWITH_AVX=ON | ||
cmake .. -DWITH_GPU=OFF -DWITH_SWIG_PY=ON -DWITH_AVX=ON -DWITH_SWIG_PY=ON -DWITH_STYLE_CHECK=OFF -DCMAKE_EXPORT_COMPILE_COMMANDS=ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里这些变量都写死了,那如果要支持 GPU怎么办呢?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个脚本是会用cpack
打包4个deb包,分别对应cpu, cpu-noavx, gpu, gpu-noavx,在制作dockerimage的时候决定使用哪个
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
另外,后文说到要去掉 paddle/scripts/deb 目录。那这个文件是不是就不需要了?
paddle/scripts/docker/README.md
Outdated
@@ -0,0 +1,9 @@ | |||
# Build docker image | |||
|
|||
We use a docker environment to build paddle binaries and put it into a runtime image `paddle-core` for uses of most cases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build的结果是不是就放在host的文件系统里就行了?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
目前已经改成这样了,是mount host的目录来build的
mkdir build | ||
|
||
# clean build dir and third_party dir cache | ||
rm -rf build third_party |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不删除build文件夹有坏处吗?好处应该可以加速build过程。好像make可以自动追踪什么文件被改变了,感觉不会有问题?
要是用户想要clean build可以执行build.sh之前rm -rf build
或者有个参数./build.sh clean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
之前编译遇到不删除build和thirdparty,导致protobuf的依赖没被cmake找到。或许可以只删除CmakeCache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
明白了,没事,这个不影响merge。
make -j `nproc` | ||
# FIXME(typhoonzero): add ARCH gpu noavx flag to CPACK_SYSTEM_NAME. Why -D not affect anything? | ||
cpack -D CPACK_GENERATOR='DEB' .. | ||
mv *.deb ~/dist/cpu | ||
|
||
rm -rf * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不是很了解cpack怎么工作的,这里和以后的几处rm -rf *
需要吗?。cmake好像不会受干扰,而且不删掉反而能加速编译过程?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
根据 #1625 ,考虑直接去掉paddle/scripts/deb目录了,使用docker目录下的build.sh完成
paddle/scripts/docker/buildall.sh
Outdated
# FIXME: need to wait a signal not sleeping | ||
BUILDER=$(docker run -d -v ${PWD}:/root/paddle -v ${DEB_DIST_DIR}:/root/dist paddle-build-env sleep 3600) | ||
# NOTICE: build deb files for real paddle image | ||
docker exec $BUILDER /bin/bash -c "/root/paddle/paddle/scripts/deb/build_scripts/build.sh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
以上两句是不是可以改成
docker run --rm -v ${PWD}:/root/paddle -v ${DEB_DIST_DIR}:/root/dist paddle-build-env bash /root/paddle/paddle/scripts/deb/build_scripts/build.sh
然后命令结束的时候对应的返回码应该也是正确的(docker run ... ; echo $?
),
接下来的docker stop $BUILDER && docker rm $BUILDER
也可以删掉了。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
根据 #1625 去掉了buildall.sh 自动build image的过程新开一个PR,使用travis ci完成
paddle/scripts/docker/build.sh
Outdated
MIRROR_UPDATE="\\" | ||
fi | ||
|
||
cat > /paddle/build/Dockerfile <<EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/paddle/build/Dockerfile 名字是不是也得模板化啊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
与多个tags对应
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是指加上.gpu
,. noavx
的后缀?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
对啊
from @wangkuiyi
我们应该会有一个build.sh,它通过执行一个Dockerfile template来生成对应不同tag的Dockerfiles。我理解最好这个模板本身就在build.sh里,而不是多个tags对应的Dockerfiles在git repo里。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -14,14 +16,12 @@ ARG WITH_STYLE_CHECK | |||
|
|||
ENV BUILD_WOBOQ=${BUILD_WOBOQ:-OFF} | |||
ENV BUILD_AND_INSTALL=${BUILD_AND_INSTALL:-OFF} | |||
ENV WITH_GPU=ON | |||
ENV WITH_GPU=OFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里应该加一个
ARG WITH_GPU
ENV WITH_GPU=${WITH_GPU:-OFF}
@@ -1,5 +1,6 @@ | |||
#!/bin/bash | |||
set -e | |||
ARCH=$(uname -i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
什么时候会调用这个脚本呢?我看Dockerfile里调用的是 paddle/scripts/docker/build.sh 。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
考虑在别的PR中删除"deb"这个目录了 @helinwang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 不影响merge:paddle/scripts/deb/build_scripts/build.sh这个可以删掉了吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
TODO:
other images like jupyter and woboq stuff?