
Paddle does not fit the PEP 513 manylinux1 standard #4050

Closed
reyoung opened this issue Sep 12, 2017 · 22 comments

@reyoung (Collaborator) commented Sep 12, 2017

Currently, we simply rename Paddle's wheel package to carry the manylinux1 tag. This is not good, because our binary does not actually satisfy the manylinux1 standard.

In the manylinux1 standard, the allowed versioned symbol dependencies are:

GLIBC <= 2.5
CXXABI <= 3.4.8
GLIBCXX <= 3.4.9
GCC <= 4.2.0

And we should build our wheel package on CentOS 5.

But we are using C++11, which depends on a newer GLIBCXX; the current dependency is GLIBCXX==3.4.21.
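
To see which versioned symbols a built binary actually requires, one can inspect the shared object directly. A minimal sketch, assuming the built extension lives at build/python/paddle/core.so (the path is hypothetical):

# List the GLIBCXX/CXXABI/GLIBC/GCC version tags referenced by the shared object;
# the highest tag printed must stay within the manylinux1 limits above.
objdump -T build/python/paddle/core.so | grep -oE '(GLIBCXX|GLIBC|CXXABI|GCC)_[0-9.]+' | sort -u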

PyPA provides a Docker image for building manylinux1-standard wheel packages. We should use that Docker image to build our wheel package.
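
For reference, the stock PyPA image is quay.io/pypa/manylinux1_x86_64, and a wheel build inside it typically looks like the sketch below (Paddle's real build needs CMake and more, so the pip invocation here is only illustrative):

# Build a wheel inside the PyPA manylinux1 container; /io is a conventional mount point.
docker pull quay.io/pypa/manylinux1_x86_64
docker run --rm -v "$PWD":/io quay.io/pypa/manylinux1_x86_64 \
    /opt/python/cp27-cp27mu/bin/pip wheel /io -w /io/wheelhouse/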

reyoung added the Bug label Sep 12, 2017
@typhoonzero (Contributor)

Indeed, supporting manylinux1 is really needed, but I suspect we will have to make a lot of changes to our build system. I'll give it a try.

typhoonzero self-assigned this Nov 2, 2017
@typhoonzero (Contributor) commented Nov 2, 2017

Found some prior work that ports manylinux1 to CentOS 6.

Using CentOS 5 may introduce too many problems, including with the CUDA installation and third-party libraries. Will try this on CentOS 6.

By the way, auditwheel is a good tool for testing whether a whl package's dependencies satisfy the standard.
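
A typical auditwheel session might look like this (auditwheel itself runs under Python 3; the wheel filename is illustrative):

# Report which platform tag the wheel actually qualifies for.
auditwheel show dist/paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl
# Bundle external shared libraries into the wheel and retag it, where possible.
auditwheel repair dist/paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl -w wheelhouse/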

Trying this Dockerfile:

FROM quay.io/numenta/manylinux1_x86_64_centos6:0.1.2

RUN NVIDIA_GPGKEY_SUM=d1be581509378368edeec8c1eb2958702feedf3bc3d17011adbf24efacce4ab5 && \
    curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/7fa2af80.pub | sed '/^Version/d' > /etc/pki/rpm-gpg/RPM-GPG-KEY-NVIDIA && \
    echo "$NVIDIA_GPGKEY_SUM  /etc/pki/rpm-gpg/RPM-GPG-KEY-NVIDIA" | sha256sum -c -

COPY cuda.repo /etc/yum.repos.d/cuda.repo

ENV CUDA_VERSION 7.5.18

ENV CUDA_PKG_VERSION 7-5-7.5-18
RUN yum install -y \
        cuda-nvrtc-$CUDA_PKG_VERSION \
        cuda-cusolver-$CUDA_PKG_VERSION \
        cuda-cublas-$CUDA_PKG_VERSION \
        cuda-cufft-$CUDA_PKG_VERSION \
        cuda-curand-$CUDA_PKG_VERSION \
        cuda-cusparse-$CUDA_PKG_VERSION \
        cuda-npp-$CUDA_PKG_VERSION \
        cuda-cudart-$CUDA_PKG_VERSION && \
    ln -s cuda-7.5 /usr/local/cuda && \
    rm -rf /var/cache/yum/*

RUN echo "/usr/local/cuda/lib64" >> /etc/ld.so.conf.d/cuda.conf && \
    ldconfig

# nvidia-docker 1.0
LABEL com.nvidia.volumes.needed="nvidia_driver"
LABEL com.nvidia.cuda.version="${CUDA_VERSION}"

RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf && \
    echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf

ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64

# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
ENV NVIDIA_REQUIRE_CUDA "cuda>=7.5"

# for devel
RUN yum install -y \
        cuda-core-$CUDA_PKG_VERSION \
        cuda-misc-headers-$CUDA_PKG_VERSION \
        cuda-command-line-tools-$CUDA_PKG_VERSION \
        cuda-license-$CUDA_PKG_VERSION \
        cuda-nvrtc-dev-$CUDA_PKG_VERSION \
        cuda-cusolver-dev-$CUDA_PKG_VERSION \
        cuda-cublas-dev-$CUDA_PKG_VERSION \
        cuda-cufft-dev-$CUDA_PKG_VERSION \
        cuda-curand-dev-$CUDA_PKG_VERSION \
        cuda-cusparse-dev-$CUDA_PKG_VERSION \
        cuda-npp-dev-$CUDA_PKG_VERSION \
        cuda-cudart-dev-$CUDA_PKG_VERSION \
        cuda-driver-dev-$CUDA_PKG_VERSION \
        gcc-c++ \
        yum-utils && \
    rm -rf /var/cache/yum/*

RUN mkdir /tmp/gpu-deployment-kit && cd /tmp/gpu-deployment-kit && \
    rpm2cpio $(repoquery --location  gpu-deployment-kit) | cpio -id && \
    mv usr/include/nvidia/gdk/* /usr/local/cuda/include && \
    mv usr/src/gdk/nvml/lib/* /usr/local/cuda/lib64/stubs && \
    rm -rf /tmp/gpu-deployment-kit* && \
    rm -rf /var/cache/yum/*

ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs:${LIBRARY_PATH}

# for paddle
# not using /opt/devtools2
ENV PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

RUN wget -q https://cmake.org/files/v3.5/cmake-3.5.2.tar.gz && tar xzf cmake-3.5.2.tar.gz && \
    cd cmake-3.5.2 && ./bootstrap && \
    make -j4 && make install && cd .. && rm cmake-3.5.2.tar.gz

RUN wget --no-check-certificate -qO- https://storage.googleapis.com/golang/go1.8.1.linux-amd64.tar.gz | \
    tar -xz -C /usr/local && \
    mkdir /root/gopath && \
    mkdir /root/gopath/bin && \
    mkdir /root/gopath/src


ENV GOROOT=/usr/local/go GOPATH=/root/gopath
# not using /opt/devtools2
ENV PATH=${GOROOT}/bin:${GOPATH}/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV LD_LIBRARY_PATH=/opt/_internal/cpython-2.7.11-ucs4/lib:${LD_LIBRARY_PATH}

# protobuf 3.1.0
RUN cd /opt && wget -q --no-check-certificate https://github.com/google/protobuf/releases/download/v3.1.0/protobuf-cpp-3.1.0.tar.gz && \
    tar xzf protobuf-cpp-3.1.0.tar.gz && \
    cd protobuf-3.1.0 && ./configure && make -j4 && make install && cd .. && rm -f protobuf-cpp-3.1.0.tar.gz

RUN /opt/python/cp27-cp27mu/bin/pip install protobuf==3.1.0

RUN yum install -y sqlite-devel zlib-devel openssl-devel boost boost-devel pcre-devel vim

RUN /opt/python/cp27-cp27mu/bin/pip install numpy && go get github.com/Masterminds/glide

RUN wget -O /opt/swig-2.0.12.tar.gz https://sourceforge.net/projects/swig/files/swig/swig-2.0.12/swig-2.0.12.tar.gz/download && \
    cd /opt && tar xzf swig-2.0.12.tar.gz && cd /opt/swig-2.0.12 && ./configure && make && make install && cd /opt && rm swig-2.0.12.tar.gz
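
Assuming the Dockerfile above is saved as Dockerfile.centos6 next to the cuda.repo file it copies, building and using the image could look roughly like this (the image tag and build commands are placeholders, not the final CI scripts):

# Build the development image, then run the Paddle build inside it.
docker build -t paddle:manylinux1-centos6 -f Dockerfile.centos6 .
docker run --rm -v "$PWD":/paddle paddle:manylinux1-centos6 \
    bash -c 'cd /paddle && mkdir -p build && cd build && cmake .. && make -j4'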

@typhoonzero (Contributor)

Update: using quay.io/numenta/manylinux1_x86_64_centos6:0.1.2, this image seems to include only the cp27-cp27mu Python. Adding cp27-cp27m support is also needed, but building the image from https://github.com/numenta/manylinux does not seem to pass.

@typhoonzero (Contributor)

Update: running auditwheel on the whl generated by the above Docker image says:

LD_LIBRARY_PATH=/opt/_internal/cpython-3.5.1/lib:$LD_LIBRARY_PATH auditwheel show python/dist/paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl
Traceback (most recent call last):
  File "/usr/local/bin/auditwheel", line 11, in <module>
    sys.exit(main())
  File "/opt/_internal/cpython-3.5.1/lib/python3.5/site-packages/auditwheel/main.py", line 49, in main
    rval = args.func(args, p)
  File "/opt/_internal/cpython-3.5.1/lib/python3.5/site-packages/auditwheel/main_show.py", line 28, in execute
    winfo = analyze_wheel_abi(args.WHEEL_FILE)
  File "/opt/_internal/cpython-3.5.1/lib/python3.5/site-packages/auditwheel/wheel_abi.py", line 73, in analyze_wheel_abi
    get_wheel_elfdata(wheel_fn)
  File "/opt/_internal/cpython-3.5.1/lib/python3.5/site-packages/auditwheel/wheel_abi.py", line 42, in get_wheel_elfdata
    so_path_split[-1])
RuntimeError: Invalid binary wheel, found shared library "core.so" in purelib folder.
The wheel has to be platlib compliant in order to be repaired by auditwheel.
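
This error means setuptools treats the package as pure Python, so core.so ends up in the purelib folder. One common way to force a platform (platlib) wheel, shown here only as a sketch of the general technique rather than the actual fix in #5396, is to tell setuptools that the distribution ships binaries:

# setup.py (sketch): mark the distribution as binary so bdist_wheel emits a platlib wheel.
from setuptools import setup
from setuptools.dist import Distribution

class BinaryDistribution(Distribution):
    # Claim we have extension modules even though core.so is prebuilt,
    # so the generated wheel is tagged as platform-specific.
    def has_ext_modules(self):
        return True

setup(
    name='paddlepaddle',
    version='0.10.0',
    packages=['paddle'],
    package_data={'paddle': ['core.so']},
    distclass=BinaryDistribution,
)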

@typhoonzero (Contributor)

After #5396 was merged, building under the above image generates a whl package with this result:

auditwheel show paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl

paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl is consistent with
the following platform tag: "linux_x86_64".

The wheel references external versioned symbols in these system-
provided shared libraries: libm.so.6 with versions {'GLIBC_2.2.5'},
libstdc++.so.6 with versions {'GLIBCXX_3.4.11', 'GLIBCXX_3.4.10',
'GLIBCXX_3.4.9', 'CXXABI_1.3.3', 'CXXABI_1.3', 'GLIBCXX_3.4.13',
'GLIBCXX_3.4'}, libdl.so.2 with versions {'GLIBC_2.2.5'},
libgcc_s.so.1 with versions {'GCC_3.3', 'GCC_3.0'}, libc.so.6 with
versions {'GLIBC_2.3.4', 'GLIBC_2.2.5', 'GLIBC_2.3.2', 'GLIBC_2.7',
'GLIBC_2.6'}, libpthread.so.0 with versions {'GLIBC_2.2.5',
'GLIBC_2.3.2'}

This constrains the platform tag to "linux_x86_64". In order to
achieve a more compatible tag, you would need to recompile a new wheel from
source on a system with earlier versions of these libraries, such as
CentOS 5.

Installing this whl on a clean CentOS 6 works fine. This toolchain can support most of our cases. Will refine the build images/scripts and put them on CI.

@typhoonzero (Contributor)

Update: I added a new repo, https://github.com/PaddlePaddle/buildtools, containing scripts to build development Docker images of different versions.

@wangkuiyi (Collaborator)

@typhoonzero How are you going to maintain the relationship between versions of Paddle and buildtools? We face a similar challenge keeping the versions of models/book up to date with Paddle. The current solution is a git submodule link in the models/book repo pointing to a certain (recently released) version of PaddlePaddle. We are not yet sure this solution is perfect.

@typhoonzero (Contributor)

@wangkuiyi No. buildtools contains only Dockerfiles that build a manylinux1-compliant build environment, and it should stay static, since the dependencies defined by manylinux1 are static. The buildtools repo will not need to be updated until a new PEP definition comes out.

@wangkuiyi (Collaborator)

Sounds reasonable. In my experience adding the default Dockerfile, it is static, and it seems that Dockerfile.android is also static. Let's go with your proposal for a while; if it all runs well, we can switch to it completely.

@Yancey1989 (Contributor) commented Nov 15, 2017

I am reopening this issue because we also need to add some projects on TeamCity:

  • Build whl packages of different versions to the manylinux standard:

    | feature gate | option switch |
    | --- | --- |
    | GPU ON + CUDA/CUDNN version | cuda7.5_cudnn5, cuda8.0_cudnn5, cuda8.0_cudnn7 |
    | GPU OFF + AVX | AVX ON/OFF (only use the default, ON) |
    | GPU OFF + cblas | MKL/OpenBlas (only use one as the default) |
    | android | build for Android |

  • All PRs should pass unit tests with the manylinux develop Docker image.
  • The production Docker image should be built from the manylinux Python package.
  • The C-API should be built on the manylinux develop Docker image.

Yancey1989 reopened this Nov 15, 2017
Yancey1989 self-assigned this Nov 15, 2017
@luotao1 (Contributor) commented Nov 15, 2017

How much TeamCity time will it take if we add all of the above projects? @Yancey1989
Also, should we add more TeamCity agents first? Currently, we only have 3 agents. @helinwang
https://www.jetbrains.com/teamcity/buy/#license-type=new-license

@Yancey1989 (Contributor)

Hi @luotao1,
If we don't have enough agents on TeamCity, maybe we can schedule every project except the PR test to build at midnight. But I think adding more TeamCity agents is very important; there are often more than 10 tasks waiting in the queue.

@luotao1 (Contributor) commented Nov 15, 2017

However, our midnight is daytime for our American colleagues. If we schedule all project builds at midnight, will they have to wait a long time?

@Yancey1989 (Contributor)

Yep, I missed that point. How about dispersing the build times, while making sure every project still runs once a day?

@luotao1 (Contributor) commented Nov 15, 2017

> How about dispersing the build times, while making sure every project still runs once a day?

I agree with you!

@Yancey1989 (Contributor)

Update:

  1. Use a schedule trigger instead of a VCS trigger in every project except the PR CI.
  2. Build the whl Python packages of different versions on TeamCity: https://paddleci.ngrok.io/project.html?projectId=Manylinux1&tab=projectOverview.

@helinwang (Contributor)

How many extra projects are we planning to build at midnight Beijing time? If there are no more than 6 projects and each one takes 30 minutes, that is only about an hour on three machines, which is probably fine.

@Yancey1989 (Contributor)

@helinwang There are 15 configurations in total, excluding the PR CI. I have already configured them with schedule triggers at different times.

@Yancey1989 (Contributor) commented Nov 21, 2017

Update: we are limiting the number of whl package variants, because the Community edition of TeamCity limits the number of configurations.

| TeamCity Configuration | WITH_AVX | WITH_GPU | WITH_MKL | Docker Image | cp27-cp27mu | cp27-cp27m | C-API |
| --- | --- | --- | --- | --- | --- | --- | --- |
| cpu_avx_mkl | ON | OFF | ON | paddle:latest | paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl | paddlepaddle-0.10.0-cp27-cp27m-linux_x86_64.whl | paddle.tgz |
| cpu_avx_openblas | ON | OFF | ON | paddle:latest-openblas | paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl | paddlepaddle-0.10.0-cp27-cp27m-linux_x86_64.whl | None |
| cuda7.5_cudnn5_avx_mkl | ON | ON | ON | None | paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl | paddlepaddle-0.10.0-cp27-cp27m-linux_x86_64.whl | paddle.tgz |
| cuda8.0_cudnn5_avx_mkl | ON | ON | ON | None | paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl | paddlepaddle-0.10.0-cp27-cp27m-linux_x86_64.whl | paddle.tgz |
| cuda8.0_cudnn7_avx_mkl | ON | ON | ON | paddle:latest-gpu | paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl | paddlepaddle-0.10.0-cp27-cp27m-linux_x86_64.whl | paddle.tgz |

@typhoonzero (Contributor)

Should we also add the corresponding C-API links?

@Yancey1989 (Contributor)

@typhoonzero will do that :)

@Yancey1989 (Contributor) commented Nov 22, 2017

@typhoonzero Added the C-API links.
