Sentiment analysis failures: invalid device function #34

elvinpoon · 2016-09-03T06:12:27Z

Here is the error output:

./train.sh
I0903 14:10:57.917793 18690 Util.cpp:144] commandline: /data2/package/pypaddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=1 --trainer_count=4 --num_passes=10 --log_period=10 --dot_period=20 --show_parameter_stats_period=100 --test_all_data_in_one_period=1
I0903 14:11:01.704715 18690 Util.cpp:113] Calling runInitFunctions
I0903 14:11:01.705032 18690 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-03 14:11:02,367 networks.py:1122] The input order is [word, label]
[INFO 2016-09-03 14:11:02,368 networks.py:1129] The output order is [__cost_0__]
I0903 14:11:02.395427 18690 Trainer.cpp:169] trainer mode: Normal
I0903 14:11:02.395754 18690 MultiGradientMachine.cpp:108] numLogicalDevices=1 numThreads=4 numDevices=4
F0903 14:11:02.400593 18690 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
    @     0x7fc9c37175cd  google::LogMessage::Fail()
    @     0x7fc9c3719433  google::LogMessage::SendToLog()
    @     0x7fc9c371715b  google::LogMessage::Flush()
    @     0x7fc9c3719e1e  google::LogMessageFatal::~LogMessageFatal()
    @           0x7d65b2  hl_gpu_apply_unary_op<>()
    @           0x79d156  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x79ccf0  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x780733  paddle::BaseMatrixT<>::zero()
    @           0x561960  paddle::Parameter::enableType()
    @           0x564531  paddle::parameterInitNN()
    @           0x567fe9  paddle::NeuralNetwork::init()
    @           0x55ee4b  paddle::TrainerThread::TrainerThread()
    @           0x55fab7  paddle::MultiGradientMachine::MultiGradientMachine()
    @           0x58788e  paddle::GradientMachine::create()
    @           0x6e296d  paddle::TrainerInternal::init()
    @           0x6dc144  paddle::Trainer::init()
    @           0x54622d  main
    @     0x7fc9c2699830  __libc_start_main
    @           0x54db19  _start
    @              (nil)  (unknown)
/data2/package/pypaddle/bin/paddle: line 46: 18690 Aborted                 ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

gangliao · 2016-09-03T07:29:06Z

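See #18. It looks like you are using CUDA 8.0 with a recent GPU.

"invalid device function" indicates a CUDA/GPU incompatibility: the binary most likely contains no kernel code built for your GPU's architecture.
You may be able to fix it by modifying the CMake configuration.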

Open cmake/flags.cmake and add the following code:

if (CUDA_VERSION VERSION_GREATER "8.0")
  list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60")
endif()

Then rebuild the project.
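
For reference, the flag in the snippet above targets compute capability 6.0. Here is a minimal Python sketch of how such a flag is formed from a GPU's compute capability. `gencode_flag` is a hypothetical helper for illustration only, not part of Paddle's build scripts, and the compute capability of the reporter's GPU is not stated in this thread.

```python
def gencode_flag(compute_capability: str) -> str:
    """Build an nvcc -gencode flag for a compute capability such as "6.0".

    Hypothetical helper: it only formats the string and does not query
    the GPU or validate that the installed CUDA toolkit supports the target.
    """
    major, minor = compute_capability.strip().split(".")
    arch = f"{int(major)}{int(minor)}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


# Compute capability 6.0 yields the flag used in the CMake snippet.
print(gencode_flag("6.0"))
```

If the GPU reports a different compute capability, the same pattern would give a different `compute_XY` / `sm_XY` pair.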

elvinpoon · 2016-09-06T01:40:51Z

I tried this, but it doesn't work. I get the same error.

gangliao · 2016-10-08T02:02:09Z

Fix CUDA_VERSION Comparsion #165
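
One thing worth noting about version checks of this kind: CMake's `VERSION_GREATER` is a strict comparison, so a condition written as `CUDA_VERSION VERSION_GREATER "8.0"` is false when `CUDA_VERSION` is exactly 8.0. Whether that is what #165 changed is not stated in this thread. The Python sketch below only mimics the comparison with tuple ordering, and `parse_version`, `version_greater` and `version_greater_equal` are hypothetical names.

```python
def parse_version(text, width=4):
    """Split "8.0" into a zero-padded integer tuple, e.g. (8, 0, 0, 0)."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def version_greater(a, b):
    # Strict comparison, analogous to CMake's VERSION_GREATER.
    return parse_version(a) > parse_version(b)


def version_greater_equal(a, b):
    # Non-strict comparison: equal versions also pass.
    return parse_version(a) >= parse_version(b)


print(version_greater("8.0", "8.0"))        # strict check with equal versions
print(version_greater_equal("8.0", "8.0"))  # non-strict check with equal versions
print(version_greater("8.0", "7.5"))        # strict check against a lower bound
```

Under this sketch, a guard meant to cover CUDA 8.0 needs either a non-strict comparison or a strict comparison against a lower version.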

gangliao added feature request labels Sep 3, 2016

gangliao closed this as completed Sep 5, 2016

hedaoyuan mentioned this issue Jul 18, 2017

Error when running the GPU version on a cloud machine #2931

Closed

typhoonzero mentioned this issue Aug 4, 2017

Paddle inference fails when running on a P4 machine #3206

Closed

typhoonzero mentioned this issue Mar 21, 2018

RuntimeError: function_attributes(): after cudaFuncGetAttributes: invalid device function #9290

Closed

xiuechen mentioned this issue Oct 28, 2019

Inference core dumps; could you help look into the cause? Both the Paddle training and inference versions are v1.3.0 #20859

Closed

DemoMoon mentioned this issue Mar 24, 2021

How can oneDNN improve DeepSpeech speech-processing performance? #31838

Closed

marsbzp mentioned this issue Jan 11, 2023

Crash in the RNN operator when calling the C++ inference library from multiple threads!!!! #49737

Open

chlyzzo mentioned this issue Mar 29, 2023

paddle/fluid/core_avx.so paddle::memory::allocation::MemoryMapFdSet::Clear() #52269

Closed
