Switch to C++17 and modernize toolchain + CI (#17984)
As per #17968, require a C++17-compatible compiler. For CUDA code, use the C++14 mode introduced in CUDA 9; C++17 support for CUDA will become available with CUDA 11.
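
A hedged sketch (not part of this commit) of the host/CUDA standard split described above: nvcc defines __CUDACC__ for CUDA translation units, and __cplusplus reports the active standard, so the new minimums could be enforced at compile time like this:

    // Illustrative only: enforce the per-compiler language minimums.
    // CUDA translation units stay on C++14 until CUDA 11; all other
    // translation units must be compiled as C++17.
    #if defined(__CUDACC__)
      #if __cplusplus < 201402L
        #error "CUDA sources require at least C++14 (CUDA 9 or newer)"
      #endif
    #elif __cplusplus < 201703L
      #error "MXNet sources require a C++17 compiler"
    #endif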

Switching to C++17 requires modernizing the toolchain, which exposed a number of technical-debt issues in the codebase. All blocking issues are fixed as part of this PR; see the full list below.

This PR contains the following specific changes:

    Switch CI pipeline to use gcc7 on Ubuntu and CentOS
    Switch CD pipeline to CentOS 7 with https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/. This enables us to build with the gcc7 C++17 compiler while keeping a relatively old glibc requirement for distribution.
    Simplify ARM Edge builds
        Switch to standard Ubuntu / Debian cross-compilation toolchain for ARMv7, ARMv8
        Switch to https://toolchains.bootlin.com/ toolchain for ARMv6 (the Debian ARMv6 toolchain targets ARMv4 + ARMv5 + ARMv6, but we want to target only ARMv6 and make use of ARMv6 features)
        Remove reliance on dockcross for cross compilation.
    Simplify Jetson build
        Use standard Ubuntu / Debian cross-compilation toolchain for ARMv8
        Upgrade to Cuda 10 and Jetpack 4.3
        Simplify build setup
    Simplify QEMU ARM virtualization test setup on CI
        Remove the complex "Virtual Machine in Docker" logic and instead run a QEMU-based Docker container based on arm32v7/ubuntu
    Fix out-of-bounds vector accesses in
        SoftmaxGradOpType
        MKLDNNFCBackward
    Fix use of the non-standard rand_r function, which is no longer available on newer Android toolchains and shouldn't be used in any case (see the sketch after this list).
    Fix reproducibility of RNN with Dropout
    Fix reproducibility of DGL Graph Sampling Operators
    Update tests for Android Edge build to NDK19. The previously used standalone toolchain is obsolete.
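
A minimal sketch of the portable pattern behind the rand_r fix, assuming only the standard C++11 <random> header (names are illustrative; the actual change lives in the 3rdparty/mshadow/mshadow/random.h diff below):

    #include <random>

    // rand_r kept its state in a caller-owned unsigned seed; the standard
    // replacement keeps the state in a reseedable per-object engine instead.
    class PortableRng {
     public:
      explicit PortableRng(int seed) : engine_(seed) {}
      void Seed(int seed) { engine_.seed(seed); }
      // Uniform real in [a, b), covering the old rand_r-based use cases.
      double Uniform(double a = 0.0, double b = 1.0) {
        return std::uniform_real_distribution<double>(a, b)(engine_);
      }
     private:
      std::mt19937 engine_;  // guaranteed once a C++11+ toolchain is required
    };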

Those Dockerfiles that required refactoring as part of this effort were refactored based on the following considerations:

    Maximize the use of system dependencies provided by the distribution instead of manually installing dependencies from source or from third-party vendors. This reduces the complexity of the installation process and essentially pins the dependency versions, increasing CI stability; it also speeds up Dockerfile builds. To facilitate this, use recent distribution versions. We still ensure backwards compatibility via CentOS 7-based build and test stages.
    Minimize the number of layers in each Dockerfile. Instead of executing five different script files, each calling apt-get update and install, execute apt-get once. This speeds up the build and reduces image size. Keep each Dockerfile simple and tailored to one purpose, instead of running 20 scripts that install dependencies for every thinkable scenario, which is unmaintainable.

Some more small changes:

    Remove outdated references to CUDA 7 and CUDA 8 in various files.
    Remove C++03 support in mshadow
    Disable broken tests
        NumpyBooleanAssignForwardCPU #17990
        test_init.test_rsp_const_init #17988
        quantized_elemwise_mul #18034

List of squashed commits

* cpp standard

* Remove leftover files of Cuda 7 and Cuda 8 support

* thrust 1.9.8 for clang10

* compiler warnings

* Disable broken test_init.test_rsp_const_init

* Disable tests invoking NumpyBooleanAssignForwardCPU

* Fix out of bounds access in SoftmaxGradOpType

* Use CentOS 7 for staticbuilds

CentOS 7 fulfills the requirements for PEP 599 manylinux2014 and provides a
C++17 toolchain.

* Fix MKLDNNFCBackward

* Update edge toolchain

* Support platforms without rand_r

* Cleanup random.h

* Greatly simplify qemu setup

* Remove unused functions in Jenkins_steps.groovy

* Skip quantized_elemwise_mul due to QuantizedElemwiseMulOpShape bug

* Fix R package installation

#18042

* Fix centos ccache

* Fix GPU Makefile staticbuild on CentOS7

* CentOS7 NCCL

* CentOS7 staticbuild fix link with libculibos
leezu authored Apr 14, 2020
1 parent f3cfaf9 commit fb73a17
Showing 154 changed files with 1,252 additions and 3,351 deletions.
2 changes: 1 addition & 1 deletion 3rdparty/dmlc-core (submodule pointer update; no text diff)
2 changes: 1 addition & 1 deletion 3rdparty/mshadow/guide/Makefile
@@ -4,7 +4,7 @@ export CXX = g++
 export NVCC =nvcc
 include config.mk
 include ../make/mshadow.mk
-export CFLAGS = -Wall -O3 -std=c++11 -I../ $(MSHADOW_CFLAGS)
+export CFLAGS = -Wall -O3 -std=c++17 -I../ $(MSHADOW_CFLAGS)
 export LDFLAGS= -lm $(MSHADOW_LDFLAGS)
 export NVCCFLAGS = -O3 --use_fast_math -ccbin $(CXX) $(MSHADOW_NVCCFLAGS)

2 changes: 1 addition & 1 deletion 3rdparty/mshadow/guide/mshadow-ps/Makefile
@@ -4,7 +4,7 @@ export CXX = g++
 export NVCC =nvcc
 include config.mk
 include ../../make/mshadow.mk
-export CFLAGS = -Wall -O3 -std=c++11 -fopenmp -I../../ $(MSHADOW_CFLAGS)
+export CFLAGS = -Wall -O3 -std=c++17 -fopenmp -I../../ $(MSHADOW_CFLAGS)
 export LDFLAGS= -lm $(MSHADOW_LDFLAGS)
 export NVCCFLAGS = -O3 --use_fast_math -ccbin $(CXX) $(MSHADOW_NVCCFLAGS)

4 changes: 2 additions & 2 deletions 3rdparty/mshadow/make/mshadow.mk
@@ -149,13 +149,13 @@ else
 endif

 ifeq ($(USE_DIST_PS),1)
-MSHADOW_CFLAGS += -DMSHADOW_DIST_PS=1 -std=c++11 \
+MSHADOW_CFLAGS += -DMSHADOW_DIST_PS=1 -std=c++17 \
 	-I$(PS_PATH)/src -I$(PS_THIRD_PATH)/include
 PS_LIB = $(addprefix $(PS_PATH)/build/, libps.a libps_main.a) \
 	$(addprefix $(PS_THIRD_PATH)/lib/, libgflags.a libzmq.a libprotobuf.a \
 	libglog.a libz.a libsnappy.a)
 # -L$(PS_THIRD_PATH)/lib -lgflags -lzmq -lprotobuf -lglog -lz -lsnappy
-MSHADOW_NVCCFLAGS += --std=c++11
+MSHADOW_NVCCFLAGS += --std=c++14
 else
 MSHADOW_CFLAGS+= -DMSHADOW_DIST_PS=0
 endif
24 changes: 0 additions & 24 deletions 3rdparty/mshadow/mshadow/base.h
@@ -119,18 +119,6 @@ typedef unsigned __int64 uint64_t;
 #define MSHADOW_OLD_CUDA 0
 #endif

-/*!
- * \brief macro to decide existence of c++11 compiler
- */
-#ifndef MSHADOW_IN_CXX11
-#if (defined(__GXX_EXPERIMENTAL_CXX0X__) ||\
-    __cplusplus >= 201103L || defined(_MSC_VER))
-#define MSHADOW_IN_CXX11 1
-#else
-#define MSHADOW_IN_CXX11 0
-#endif
-#endif
-
 /*! \brief whether use SSE */
 #ifndef MSHADOW_USE_SSE
 #define MSHADOW_USE_SSE 1
@@ -207,13 +195,6 @@ extern "C" {
 /*! \brief cpu force inline */
 #define MSHADOW_CINLINE MSHADOW_FORCE_INLINE

-#if defined(__GXX_EXPERIMENTAL_CXX0X) ||\
-  defined(__GXX_EXPERIMENTAL_CXX0X__) || __cplusplus >= 201103L
-#define MSHADOW_CONSTEXPR constexpr
-#else
-#define MSHADOW_CONSTEXPR const
-#endif
-
 /*!
  * \brief default data type for tensor string
  * in code release, change it to default_real_t
@@ -231,13 +212,8 @@ extern "C" {
 #define MSHADOW_USE_GLOG DMLC_USE_GLOG
 #endif  // MSHADOW_USE_GLOG

-#if DMLC_USE_CXX11
 #define MSHADOW_THROW_EXCEPTION noexcept(false)
 #define MSHADOW_NO_EXCEPTION noexcept(true)
-#else
-#define MSHADOW_THROW_EXCEPTION
-#define MSHADOW_NO_EXCEPTION
-#endif

 #if defined(_MSC_VER)
 #define MSHADOW_ALIGNED(x) __declspec(align(x))
5 changes: 5 additions & 0 deletions 3rdparty/mshadow/mshadow/logging.h
@@ -204,7 +204,12 @@ class LogMessageFatal {
   ~LogMessageFatal() MSHADOW_THROW_EXCEPTION {
     // throwing out of destructor is evil
     // hopefully we can do it here
+#pragma GCC diagnostic push
+#if __GNUC__ >= 7
+#pragma GCC diagnostic ignored "-Wterminate"
+#endif
     throw Error(log_stream_.str());
+#pragma GCC diagnostic pop
   }

  private:
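
For context on the pragma above: since C++11, destructors are implicitly noexcept, so an escaping throw calls std::terminate, and GCC 7 added -Wterminate to warn about it. A minimal, hypothetical illustration of the pattern logging.h opts into (not code from this commit):

    #include <stdexcept>

    struct FatalMessage {
      // noexcept(false) opts out of the implicit noexcept on destructors,
      // making the throw below legal; GCC 7's -Wterminate diagnostic around
      // such code is what the pragma in logging.h suppresses.
      ~FatalMessage() noexcept(false) {
        throw std::runtime_error("fatal log message");
      }
    };

    int main() {
      try {
        FatalMessage m;  // throws when m is destroyed at end of the try block
      } catch (const std::runtime_error &) {
        // reachable only because the destructor is noexcept(false)
      }
      return 0;
    }
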
4 changes: 4 additions & 0 deletions 3rdparty/mshadow/mshadow/packet-inl.h
@@ -74,7 +74,11 @@ inline void* AlignedMallocPitch(size_t *out_pitch,
   if (res == NULL) {
     LOG(FATAL) << "AlignedMallocPitch failed";
   }
+#if __GNUC__ >= 6
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
   return res;
+#pragma GCC diagnostic pop
 }

 /*!
103 changes: 1 addition & 102 deletions 3rdparty/mshadow/mshadow/random.h
@@ -14,14 +14,7 @@
 #include "./base.h"
 #include "./tensor.h"
 #include "./tensor_container.h"
-
-#if MSHADOW_IN_CXX11
-#include <random>  // use cxx11 random by default
-#endif
-
-#if _MSC_VER
-#define rand_r(x) rand()
-#endif
+#include <random>


 namespace mshadow {
@@ -52,9 +45,7 @@ class Random<cpu, DType> {
    * \param seed seed of prng
    */
   inline void Seed(int seed) {
-#if MSHADOW_IN_CXX11
     rnd_engine_.seed(seed);
-#endif
     this->rseed_ = static_cast<uint64_t>(seed);
   }
   /*!
@@ -71,9 +62,6 @@ class Random<cpu, DType> {
   inline void set_stream(Stream<cpu> *stream) {
   }

-  // These samplers are only avail in C++11.
-#if MSHADOW_IN_CXX11
-
   /*!
    * \brief get some random integer
    * \return integer as unsigned
@@ -226,7 +214,6 @@ class Random<cpu, DType> {
         return static_cast<DType>(dist_poisson(rnd_engine_));});
     }
   }
-#endif

   /*!
    * \brief return a temporal expression storing standard gaussian random variables
@@ -270,98 +257,10 @@ class Random<cpu, DType> {
   }

  private:
-#if MSHADOW_IN_CXX11
   /*! \brief use c++11 random engine. */
   std::mt19937 rnd_engine_;
   /*! \brief random number seed used in random engine */
   uint64_t rseed_;
-
-#else
-
-  /*! \brief random number seed used by PRNG */
-  unsigned rseed_;
-  // functions
-  template<int dim>
-  inline void SampleUniform(Tensor<cpu, dim, DType> *dst,
-                            DType a = 0.0f, DType b = 1.0f) {
-    if (dst->CheckContiguous()) {
-      this->GenUniform(dst->dptr_, dst->shape_.Size(), a, b);
-    } else {
-      Tensor<cpu, 2, DType> mat = dst->FlatTo2D();
-      for (index_t i = 0; i < mat.size(0); ++i) {
-        this->GenUniform(mat[i].dptr_, mat.size(1), a, b);
-      }
-    }
-  }
-  template<int dim>
-  inline void SampleGaussian(Tensor<cpu, dim, DType> *dst,
-                             DType mu = 0.0f, DType sigma = 1.0f) {
-    if (sigma <= 0.0f) {
-      *dst = mu; return;
-    }
-    if (dst->CheckContiguous()) {
-      this->GenGaussian(dst->dptr_, dst->shape_.Size(), mu, sigma);
-    } else {
-      Tensor<cpu, 2, DType> mat = dst->FlatTo2D();
-      for (index_t i = 0; i < mat.size(0); ++i) {
-        this->GenGaussian(mat[i].dptr_, mat.size(1), mu, sigma);
-      }
-    }
-  }
-  inline void GenUniform(float *dptr, index_t size, float a, float b) {
-    for (index_t j = 0; j < size; ++j) {
-      dptr[j] = static_cast<float>(RandNext()) * (b - a) + a;
-    }
-  }
-  inline void GenUniform(double *dptr, index_t size, double a, double b) {
-    for (index_t j = 0; j < size; ++j) {
-      dptr[j] = static_cast<double>(RandNext()) * (b - a) + a;
-    }
-  }
-  inline void GenGaussian(float *dptr, index_t size, float mu, float sigma) {
-    this->GenGaussianX(dptr, size, mu, sigma);
-  }
-  inline void GenGaussian(double *dptr, index_t size, double mu, double sigma) {
-    this->GenGaussianX(dptr, size, mu, sigma);
-  }
-  inline void GenGaussianX(DType *dptr, index_t size, DType mu, DType sigma) {
-    DType g1 = 0.0f, g2 = 0.0f;
-    for (index_t j = 0; j < size; ++j) {
-      if ((j & 1) == 0) {
-        this->SampleNormal2D(&g1, &g2);
-        dptr[j] = mu + g1 * sigma;
-      } else {
-        dptr[j] = mu + g2 * sigma;
-      }
-    }
-  }
-  /*! \brief get next random number from rand */
-  inline DType RandNext(void) {
-    return static_cast<DType>(rand_r(&rseed_)) /
-        (static_cast<DType>(RAND_MAX) + 1.0f);
-  }
-  /*! \brief return a real numer uniform in (0,1) */
-  inline DType RandNext2(void) {
-    return (static_cast<DType>(rand_r(&rseed_)) + 1.0f) /
-        (static_cast<DType>(RAND_MAX) + 2.0f);
-  }
-  /*!
-   * \brief sample iid xx,yy ~N(0,1)
-   * \param xx first gaussian output
-   * \param yy second gaussian output
-   */
-  inline void SampleNormal2D(DType *xx_, DType *yy_) {
-    DType &xx = *xx_, &yy = *yy_;
-    DType x, y, s;
-    do {
-      x = 2.0f * RandNext2() - 1.0f;
-      y = 2.0f * RandNext2() - 1.0f;
-      s = x * x + y * y;
-    } while (s >= 1.0f || s == 0.0f);
-    DType t = std::sqrt(-2.0f * std::log(s) / s);
-    xx = x * t; yy = y * t;
-  }
-#endif
   /*! \brief temporal space used to store random numbers */
   TensorContainer<cpu, 1, DType> buffer_;
 };  // class Random<cpu, DType>
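
The deleted fallback above hand-rolled a polar Box-Muller transform on top of rand_r; once C++11 <random> is guaranteed, the retained code path can rely on standard distributions. A hedged sketch of that idiom (illustrative, not part of the diff):

    #include <iostream>
    #include <random>

    int main() {
      std::mt19937 engine(42);  // replaces the unsigned rseed_ + rand_r state
      // Replaces GenUniform: uniform reals in [0, 1).
      std::uniform_real_distribution<double> uniform(0.0, 1.0);
      // Replaces GenGaussianX/SampleNormal2D: draws N(mu, sigma) directly.
      std::normal_distribution<double> gaussian(0.0, 1.0);
      std::cout << uniform(engine) << " " << gaussian(engine) << "\n";
      return 0;
    }
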
2 changes: 1 addition & 1 deletion 3rdparty/mshadow/test/Makefile
@@ -20,7 +20,7 @@ test: test.cu
 test_tblob: test_tblob.cc

 $(BIN) :
-	$(CXX) $(CFLAGS) -std=c++0x -o $@ $(filter %.cpp %.o %.c %.cc, $^) $(LDFLAGS)
+	$(CXX) $(CFLAGS) -std=c++17 -o $@ $(filter %.cpp %.o %.c %.cc, $^) $(LDFLAGS)

 $(OBJ) :
 	$(CXX) -c $(CFLAGS) -o $@ $(firstword $(filter %.cpp %.c, $^) )
44 changes: 10 additions & 34 deletions CMakeLists.txt
@@ -7,6 +7,9 @@ if(CMAKE_CROSSCOMPILING)
 endif()

 project(mxnet C CXX)
+set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+set(CMAKE_CXX_EXTENSIONS ON) # GNU extensions used by src/operator/random/shuffle_op.cc

 if(CMAKE_PROJECT_NAME STREQUAL PROJECT_NAME AND EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/config.cmake)
   # Load config.cmake only if mxnet is not compiled as a dependency of another project
@@ -59,7 +62,6 @@ option(USE_PLUGIN_CAFFE "Use Caffe Plugin" OFF)
 option(USE_CPP_PACKAGE "Build C++ Package" OFF)
 option(USE_MXNET_LIB_NAMING "Use MXNet library naming conventions." ON)
 option(USE_GPROF "Compile with gprof (profiling) flag" OFF)
-option(USE_CXX14_IF_AVAILABLE "Build with C++14 if the compiler supports it" OFF)
 option(USE_VTUNE "Enable use of Intel Amplifier XE (VTune)" OFF) # one could set VTUNE_ROOT for search path
 option(USE_TVM_OP "Enable use of TVM operator build system." OFF)
 option(ENABLE_CUDA_RTC "Build with CUDA runtime compilation support" ON)
@@ -98,14 +100,7 @@ if(USE_CUDA)
       "Please fix your cuda installation: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#mandatory-post")
   endif()
   enable_language(CUDA)
-  set(CMAKE_CUDA_STANDARD 11)
-  include(CheckCXXCompilerFlag)
-  if(USE_CXX14_IF_AVAILABLE)
-    check_cxx_compiler_flag("-std=c++14" SUPPORT_CXX14)
-    if (SUPPORT_CXX14)
-      set(CMAKE_CUDA_STANDARD 14)
-    endif()
-  endif()
+  set(CMAKE_CUDA_STANDARD 14)
   set(CMAKE_CUDA_STANDARD_REQUIRED ON)
 endif()

@@ -153,24 +148,21 @@ add_definitions(-DDMLC_MODERN_THREAD_LOCAL=0)
 # disable stack trace in exception by default.
 add_definitions(-DDMLC_LOG_STACK_TRACE_SIZE=0)

+add_definitions(-DDMLC_USE_CXX11)
+add_definitions(-DDMLC_STRICT_CXX11)
+add_definitions(-DDMLC_USE_CXX14)
+add_definitions(-DMSHADOW_IN_CXX11)
 if(MSVC)
   add_definitions(-DWIN32_LEAN_AND_MEAN)
-  add_definitions(-DDMLC_USE_CXX11)
   add_definitions(-D_SCL_SECURE_NO_WARNINGS)
   add_definitions(-D_CRT_SECURE_NO_WARNINGS)
   add_definitions(-DMXNET_EXPORTS)
   add_definitions(-DNNVM_EXPORTS)
-  add_definitions(-DDMLC_STRICT_CXX11)
   add_definitions(-DNOMINMAX)
   set(CMAKE_C_FLAGS "/MP")
   set(CMAKE_CXX_FLAGS "${CMAKE_C_FLAGS} /bigobj")
 else()
   include(CheckCXXCompilerFlag)
-  if(USE_CXX14_IF_AVAILABLE)
-    check_cxx_compiler_flag("-std=c++14" SUPPORT_CXX14)
-  endif()
-  check_cxx_compiler_flag("-std=c++11" SUPPORT_CXX11)
-  check_cxx_compiler_flag("-std=c++0x" SUPPORT_CXX0X)
   set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wall -Wno-sign-compare")
   if(CMAKE_BUILD_TYPE STREQUAL "Debug")
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O0 -g")
@@ -184,25 +176,11 @@ else()
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3")
   endif()
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CMAKE_C_FLAGS}")
-  if(SUPPORT_CXX14)
-    add_definitions(-DDMLC_USE_CXX11=1)
-    add_definitions(-DDMLC_USE_CXX14=1)
-    add_definitions(-DMSHADOW_IN_CXX11)
-    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++14")
-  elseif(SUPPORT_CXX11)
-    add_definitions(-DDMLC_USE_CXX11=1)
-    add_definitions(-DMSHADOW_IN_CXX11)
-    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
-  elseif(SUPPORT_CXX0X)
-    add_definitions(-DDMLC_USE_CXX11=1)
-    add_definitions(-DMSHADOW_IN_CXX11)
-    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
-  endif()
-endif(MSVC)
+endif()

 if(NOT mxnet_LINKER_LIBS)
   set(mxnet_LINKER_LIBS "")
-endif(NOT mxnet_LINKER_LIBS)
+endif()

 if(USE_GPROF)
   message(STATUS "Using GPROF")
@@ -530,8 +508,6 @@ if(USE_PLUGIN_CAFFE)
   endif()
   if(NOT DEFINED CAFFE_PATH)
     if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/caffe)
-      # Need newer FindCUDA.cmake that correctly handles -std=c++11
-      cmake_minimum_required(VERSION 3.3)
       set(CAFFE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/caffe)
     else()
       set(CAFFE_PATH $ENV{CAFFE_PATH})
(Diff listing truncated; the remaining changed files are not shown.)