This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Windows GPU Build Conducted by build_windows.py Failed #20206

Closed
sjiagc opened this issue Apr 23, 2021 · 2 comments

Comments

@sjiagc
Contributor

sjiagc commented Apr 23, 2021

Description

The Windows GPU build run by build_windows.py fails. The root cause is that _CONSTEXPR_IF is defined as empty when the C++ standard header <random> is compiled with __CUDACC__ defined, so a branch that should be discarded at compile time is still instantiated.
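To illustrate the root cause, here is a minimal sketch of the pattern. It is a simplified illustration and not the MSVC source: SKETCH_CONSTEXPR_IF and to_float_sketch are hypothetical names, and the body is only assumed to resemble the library code that the error message points at.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the library macro. Defining it as empty instead
// turns the branch below into a plain runtime `if`.
#define SKETCH_CONSTEXPR_IF constexpr

// Simplified illustration: converts an unsigned integer to a floating-point
// type, masking off the lowest (ty_digits - flt_digits) bits first when the
// integer type has more digits than the floating-point type.
template <class Flt, class Ty>
Flt to_float_sketch(Ty val) {
    constexpr int ty_digits = std::numeric_limits<Ty>::digits;
    constexpr int flt_digits = std::numeric_limits<Flt>::digits;
    if SKETCH_CONSTEXPR_IF (ty_digits <= flt_digits) {
        // Every value of Ty is exactly representable in Flt.
        return static_cast<Flt>(val);
    } else {
        // Valid only when ty_digits > flt_digits, so the shift count is
        // positive. With a plain `if`, this line is also compiled for
        // Ty = unsigned int and Flt = double, where the count is 32 - 53 = -21.
        constexpr Ty mask = static_cast<Ty>(-1) << (ty_digits - flt_digits);
        return static_cast<Flt>(val & mask);
    }
}
```

With `if constexpr`, calling to_float_sketch<double, unsigned int>(7u) never instantiates the else branch, so the negative shift count is never evaluated. If the macro expands to nothing, both branches are compiled for every instantiation, which matches the "shift count is negative" diagnostic in the log.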

Error Message

>nvcc.exe -forward-unknown-to-host-compiler -DDMLC_CORE_USE_CMAKE -DDMLC_LOG_STACK_TRACE_SIZE=0 -DDMLC_MODERN_THREAD_LOCAL=0 -DDMLC_STRICT_CXX11 -DDMLC_USE_CXX11 -DDMLC_USE_CXX11=1 -DDMLC_USE_CXX14 -DMSHADOW_FORCE_STREAM -DMSHADOW_INT64_TENSOR_SIZE=1 -DMSHADOW_IN_CXX11 -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_CUDA=1 -DMSHADOW_USE_CUDNN -DMSHADOW_USE_F16C=0 -DMSHADOW_USE_MKL=0 -DMSHADOW_USE_SSE=0 -DMXNET_EXPORTS -DMXNET_USE_BLAS_OPEN=1 -DMXNET_USE_CUDA=1 -DMXNET_USE_LAPACK=1 -DMXNET_USE_LIBJPEG_TURBO=0 -DMXNET_USE_OPENCV=1 -DMXNET_USE_OPENMP=1 -DMXNET_USE_SIGNAL_HANDLER=1 -DNNVM_EXPORTS -DNOMINMAX -DUSE_CUDNN -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_WARNINGS -D_SCL_SECURE_NO_WARNINGS -D__USE_XOPEN2K8 -Dmxnet_61_EXPORTS -I..\..\include -I..\..\src -I..\..\3rdparty\tvm\nnvm\include -I..\..\3rdparty\tvm\include -I..\..\3rdparty\dmlc-core\include -I..\..\3rdparty\dlpack\include -I..\..\3rdparty\mshadow -I..\..\3rdparty\miniz -I3rdparty\dmlc-core\include -isystem=D:\develop\3rd-party\OpenBLAS\0.3.13\include -isystem=D:\develop\3rd-party\opencv\4.5.2\include -isystem=C:\tshen\tools\programming\nv\cuda\v11.3\include -D_WINDOWS -Xcompiler="/W3 /GR /EHsc" --fatbin-options --compress-all -Xcompiler="-MD -O2 -Ob2" -DNDEBUG --gpu-architecture=compute_61 --gpu-code=sm_61,compute_61 "-Xcompiler=-MD -Gy /bigobj" -std=c++14 -MD -MT CMakeFiles\mxnet_61.dir\src\ndarray\ndarray_function.cu.obj -MF CMakeFiles\mxnet_61.dir\src\ndarray\ndarray_function.cu.obj.d -x cu -c ..\..\src\ndarray\ndarray_function.cu -o CMakeFiles\mxnet_61.dir\src\ndarray\ndarray_function.cu.obj -Xcompiler=-FdCMakeFiles\mxnet_61.dir\,-FS
ndarray_function.cu

C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include\random(2044): error: shift count is negative
          detected during:
            instantiation of "_Flt std::_Float_upper_bound<_Flt,_Ty>(_Ty) [with _Flt=double, _Ty=std::make_unsigned_t<int>]"
(2337): here
            instantiation of "std::poisson_distribution<_Ty>::result_type std::poisson_distribution<_Ty>::_Eval(_Engine &, const std::poisson_distribution<_Ty>::param_type &) const [with _Ty=int, _Engine=std::mt19937]"
(2305): here
            instantiation of "std::poisson_distribution<_Ty>::result_type std::poisson_distribution<_Ty>::operator()(_Engine &) const [with _Ty=int, _Engine=std::mt19937]"
d:\develop\oss\mxnet\3rdparty\mshadow\mshadow\./random.h(196): here
            instantiation of "void mshadow::Random<mshadow::cpu, DType>::SamplePoisson(mshadow::Tensor<mshadow::cpu, dim, DType> *, PType) [with DType=float, dim=2, PType=float]"
d:\develop\oss\mxnet\src\ndarray\./ndarray_function-inl.h(306): here

1 error detected in the compilation of "../../src/ndarray/ndarray_function.cu".

To Reproduce

In the \path\to\mxnet\ci directory, run python build_windows.py -f WIN_GPU.

Steps to reproduce


  1. cd \path\to\mxnet\ci
  2. set OpenBLAS_HOME=\OpenBLAS\0.3.13
  3. set OpenCV_DIR=\opencv\4.5.2
  4. set CUDA_PATH=\cuda\v11.3
  5. python build_windows.py -f WIN_GPU

What have you tried to solve it?

  1. Created a small piece of sample code to narrow down the problem.
// sample.cu
// It has to be a .cu file compiled by nvcc

#include <iostream>
#include <random>

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());
    std::poisson_distribution<int> d(4);
    std::cout << d(gen) << std::endl;
    return 0;
}

Actually, this is more of an MSVC limitation.

  2. I will create a PR to propose a fix.
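One possible workaround, sketched below, is to keep the std::poisson_distribution instantiation out of any translation unit that nvcc compiles. This is not necessarily the approach taken in the PR. It assumes the failure only occurs when nvcc instantiates the template, and sample_poisson_host is a hypothetical helper name.

```cpp
#include <random>

// Hypothetical helper. In a real project the declaration would live in a
// header included by the .cu file, and this definition in a .cpp file that
// is compiled by the host compiler only, so nvcc never instantiates
// std::poisson_distribution<int>.
int sample_poisson_host(double mean, unsigned int seed) {
    std::mt19937 gen(seed);                     // seeded engine, same type as in the log
    std::poisson_distribution<int> dist(mean);  // instantiated by the host compiler only
    return dist(gen);
}
```

The .cu file would then call sample_poisson_host(4.0, seed) instead of constructing the distribution itself. The trade-off is an extra function call per sample, and the approach does not help where the sampling code must stay in a header-only template.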

Environment

Environment Information
----------Python Info----------
Version      : 3.9.4
Compiler     : MSC v.1916 64 bit (AMD64)
Build        : ('default', 'Apr  9 2021 11:43:21')
Arch         : ('64bit', 'WindowsPE')
------------Pip Info-----------
Version      : 21.0.1
Directory    : D:\develop\py-envs\mxnet\lib\site-packages\pip
----------MXNet Info-----------
An error occured trying to import mxnet.
This is very likely due to missing missing or incompatible library files.
Traceback (most recent call last):
  File "D:\develop\oss\diagnose.py", line 96, in check_mxnet
    print('Version      :', mxnet.__version__)
AttributeError: module 'mxnet' has no attribute '__version__'

----------System Info----------
Platform     : Windows-10-10.0.19041-SP0
system       : Windows
node         : sjiagc-laptop
release      : 10
version      : 10.0.19041
----------Hardware Info----------
machine      : AMD64
processor    : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
Name
Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz

----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.1067 sec, LOAD: 1.8341 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.3620 sec, LOAD: 0.4518 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>, DNS finished in 0.6243276596069336 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.2793 sec, LOAD: 1.0372 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.1217 sec, LOAD: 0.9116 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.0009951591491699219 sec.
----------Environment----------
@github-actions

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

sjiagc pushed a commit to sjiagc/incubator-mxnet that referenced this issue Apr 23, 2021
sjiagc pushed a commit to sjiagc/incubator-mxnet that referenced this issue Apr 24, 2021
sjiagc pushed a commit to sjiagc/incubator-mxnet that referenced this issue Apr 25, 2021
szha pushed a commit that referenced this issue Apr 30, 2021
pull bot pushed a commit to vishalbelsare/incubator-mxnet that referenced this issue Apr 30, 2021
chinakook pushed a commit to chinakook/mxnet that referenced this issue May 2, 2021
@sjiagc
Contributor Author

sjiagc commented May 3, 2021

Closing because PR apache#20207 was merged.

@sjiagc sjiagc closed this as completed May 3, 2021