Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatiblity issue with CUDA 8.0 #18

Closed
stoneyang opened this issue Sep 1, 2016 · 16 comments
Closed

Compatiblity issue with CUDA 8.0 #18

stoneyang opened this issue Sep 1, 2016 · 16 comments
Assignees

Comments

@stoneyang
Copy link
Contributor

Hi, there,

I can successfully build Paddle on my machine installed Linux 14.04 LTS and CUDA 8.0 as the official guide. And for sure, the CPU version runs well except the speed....

When I ran the image classification demo with the script train.sh in GPU mode (see train.sh for more details), it unfortunately failed and threw out the following info:

I0901 19:18:02.916951 31272 Util.cpp:144] commandline: ../../build/paddle/trainer/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --gpu_id=0 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0901 19:18:09.213749 31272 Util.cpp:113] Calling runInitFunctions
I0901 19:18:09.428228 31272 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-01 19:18:09,580 layers.py:1430] channels=3 size=3072
[INFO 2016-09-01 19:18:09,580 layers.py:1430] output size for __conv_0__ is 32
[INFO 2016-09-01 19:18:09,583 layers.py:1430] channels=64 size=65536
[INFO 2016-09-01 19:18:09,583 layers.py:1430] output size for __conv_1__ is 32
[INFO 2016-09-01 19:18:09,586 layers.py:1490] output size for __pool_0__ is 16*16
[INFO 2016-09-01 19:18:09,587 layers.py:1430] channels=64 size=16384
[INFO 2016-09-01 19:18:09,587 layers.py:1430] output size for __conv_2__ is 16
[INFO 2016-09-01 19:18:09,590 layers.py:1430] channels=128 size=32768
[INFO 2016-09-01 19:18:09,590 layers.py:1430] output size for __conv_3__ is 16
[INFO 2016-09-01 19:18:09,592 layers.py:1490] output size for __pool_1__ is 8*8
[INFO 2016-09-01 19:18:09,593 layers.py:1430] channels=128 size=8192
[INFO 2016-09-01 19:18:09,594 layers.py:1430] output size for __conv_4__ is 8
[INFO 2016-09-01 19:18:09,596 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,597 layers.py:1430] output size for __conv_5__ is 8
[INFO 2016-09-01 19:18:09,599 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,599 layers.py:1430] output size for __conv_6__ is 8
[INFO 2016-09-01 19:18:09,601 layers.py:1490] output size for __pool_2__ is 4*4
[INFO 2016-09-01 19:18:09,602 layers.py:1430] channels=256 size=4096
[INFO 2016-09-01 19:18:09,603 layers.py:1430] output size for __conv_7__ is 4
[INFO 2016-09-01 19:18:09,605 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,605 layers.py:1430] output size for __conv_8__ is 4
[INFO 2016-09-01 19:18:09,608 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,608 layers.py:1430] output size for __conv_9__ is 4
[INFO 2016-09-01 19:18:09,610 layers.py:1490] output size for __pool_3__ is 2*2
[INFO 2016-09-01 19:18:09,611 layers.py:1490] output size for __pool_4__ is 1*1
[INFO 2016-09-01 19:18:09,615 networks.py:960] The input order is [image, label]
[INFO 2016-09-01 19:18:09,615 networks.py:963] The output order is [__cost_0__]
I0901 19:18:09.653937 31272 Trainer.cpp:169] trainer mode: Normal
F0901 19:18:09.658243 31272 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
    @     0x7efd1b172daa  (unknown)
    @     0x7efd1b172ce4  (unknown)
    @     0x7efd1b1726e6  (unknown)
    @     0x7efd1b175687  (unknown)
    @           0x78b159  hl_gpu_apply_unary_op<>()
    @           0x753edf  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x753ac9  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x73e04f  paddle::BaseMatrixT<>::zero()
    @           0x62af8e  paddle::Parameter::enableType()
    @           0x6272ec  paddle::parameterInitNN()
    @           0x62975b  paddle::NeuralNetwork::init()
    @           0x62eda3  paddle::GradientMachine::create()
    @           0x6a84e5  paddle::TrainerInternal::init()
    @           0x6a4907  paddle::Trainer::init()
    @           0x543935  main
    @     0x7efd1a37ef45  (unknown)
    @           0x54efd5  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)
No data to plot. Exiting!

It seems that Paddle still does not support the latest version of CUDA....

Appended my train.sh as a clue:

(omitted the original copyright info)
#!/bin/bash

set -e
config=vgg_16_cifar.py
output=./cifar_vgg_model
log=train.log

../../build/paddle/trainer/paddle_trainer \
--config=$config \
--dot_period=10 \
--log_period=100 \
--test_all_data_in_one_period=1 \
--use_gpu=1 \
--gpu_id=0 \
--trainer_count=1 \
--num_passes=200 \
--save_dir=$output \
2>&1 | tee $log

python -m paddle.utils.plotcurve -i $log > plot.png
@emailweixu emailweixu assigned hedaoyuan and unassigned hedaoyuan Sep 1, 2016
@reyoung
Copy link
Collaborator

reyoung commented Sep 2, 2016

duplicated with #3 .

CUDA 8.0 support issue is under resolving.

@reyoung
Copy link
Collaborator

reyoung commented Sep 2, 2016

Please track #3 for recent progresses.

@reyoung reyoung closed this as completed Sep 2, 2016
@stoneyang stoneyang mentioned this issue Sep 2, 2016
@stoneyang
Copy link
Contributor Author

stoneyang commented Sep 2, 2016

Hi, @reyoung

The problem I posted here is an issue on using PaddlePaddle AFTER SUCCESSFULLY BUILDING THE CODE, which is different from #3 you try to referenced. As far as I know, guy who reported an issue of the building process with CUDA 8.0! That issue can be easily shot down by my merged PR #15 .

So I don't think the two issue can be treated like the same problem.

Appended is my nvcc version for your reference:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26

Hope it helps :)

@gangliao
Copy link
Contributor

gangliao commented Sep 3, 2016

Hi, invalid device function indicates that you have a CUDA / GPU incompatibility.
Since you use GTX 1080, you can modify CMake to fix it.

open cmake/flags.cmake and add following code:
if (CUDA_VERSION VERSION_GREATER "8.0")

list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60")

endif()

then, rebuild the project

@stoneyang
Copy link
Contributor Author

Thanks for your reply @gangliao !
I'll try it later.

@stoneyang
Copy link
Contributor Author

stoneyang commented Sep 5, 2016

@gangliao same errors when running train.sh as my original post ....

I appended the lines at the end of flags.cmake under cmake directory as:

if (CUDA_VERSION VERSION_GREATER "7.0")
    list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
endif()

if(CUDA_VERSION VERSION_GREATER "8.0")
    list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_52")
endif()

set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS})

and run the same building steps and it did success.

P.S.: Same error when directly replaced if() ... endif() for 7.0 check and configure with 8.0:

# if (CUDA_VERSION VERSION_GREATER "7.0")
#     list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
# endif()

if(CUDA_VERSION VERSION_GREATER "8.0")
    list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_52")
endif()

set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS})

@gangliao
Copy link
Contributor

gangliao commented Sep 5, 2016

@stoneyang Why do you set sm_52? how about -gencode arch=compute_60,code=sm_60 ??

if (CUDA_VERSION VERSION_GREATER "7.0")
    list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
endif()

if(CUDA_VERSION VERSION_GREATER "8.0")
    list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60")
endif()

set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS})

@stoneyang
Copy link
Contributor Author

stoneyang commented Sep 5, 2016

@gangliao sorry for the typo ....

I changed those lines (see PR #40). And the new CUDA code generation configuration works fine, which is different from your suggestion. :)

foreach(capability 30 35 50 52 60)
    list(APPEND __arch_flags " -gencode arch=compute_${capability},code=sm_${capability}")
endforeach()

Please note that the former if() ... endif()-s for CUDA 7.0 or later are commented out since it is invalid (they keep producing errors in my original post). This might be the cmake-thing, which I do not have enough time to verify it for now.

But new error occurs when running:

$ sh train.sh

The log:

I0905 13:50:01.279917 20299 Util.cpp:144] commandline: /path/to/Paddle/build/paddle/trainer/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0905 13:50:02.728484 20299 Util.cpp:113] Calling runInitFunctions
I0905 13:50:02.728711 20299 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-05 13:50:02,782 layers.py:1438] channels=3 size=3072
[INFO 2016-09-05 13:50:02,782 layers.py:1438] output size for __conv_0__ is 32
[INFO 2016-09-05 13:50:02,784 layers.py:1438] channels=64 size=65536
[INFO 2016-09-05 13:50:02,784 layers.py:1438] output size for __conv_1__ is 32
[INFO 2016-09-05 13:50:02,785 layers.py:1499] output size for __pool_0__ is 16*16
[INFO 2016-09-05 13:50:02,786 layers.py:1438] channels=64 size=16384
[INFO 2016-09-05 13:50:02,786 layers.py:1438] output size for __conv_2__ is 16
[INFO 2016-09-05 13:50:02,787 layers.py:1438] channels=128 size=32768
[INFO 2016-09-05 13:50:02,787 layers.py:1438] output size for __conv_3__ is 16
[INFO 2016-09-05 13:50:02,788 layers.py:1499] output size for __pool_1__ is 8*8
[INFO 2016-09-05 13:50:02,789 layers.py:1438] channels=128 size=8192
[INFO 2016-09-05 13:50:02,789 layers.py:1438] output size for __conv_4__ is 8
[INFO 2016-09-05 13:50:02,790 layers.py:1438] channels=256 size=16384
[INFO 2016-09-05 13:50:02,790 layers.py:1438] output size for __conv_5__ is 8
[INFO 2016-09-05 13:50:02,791 layers.py:1438] channels=256 size=16384
[INFO 2016-09-05 13:50:02,792 layers.py:1438] output size for __conv_6__ is 8
[INFO 2016-09-05 13:50:02,793 layers.py:1499] output size for __pool_2__ is 4*4
[INFO 2016-09-05 13:50:02,793 layers.py:1438] channels=256 size=4096
[INFO 2016-09-05 13:50:02,793 layers.py:1438] output size for __conv_7__ is 4
[INFO 2016-09-05 13:50:02,794 layers.py:1438] channels=512 size=8192
[INFO 2016-09-05 13:50:02,795 layers.py:1438] output size for __conv_8__ is 4
[INFO 2016-09-05 13:50:02,796 layers.py:1438] channels=512 size=8192
[INFO 2016-09-05 13:50:02,796 layers.py:1438] output size for __conv_9__ is 4
[INFO 2016-09-05 13:50:02,797 layers.py:1499] output size for __pool_3__ is 2*2
[INFO 2016-09-05 13:50:02,797 layers.py:1499] output size for __pool_4__ is 1*1
[INFO 2016-09-05 13:50:02,799 networks.py:1122] The input order is [image, label]
[INFO 2016-09-05 13:50:02,799 networks.py:1129] The output order is [__cost_0__]
I0905 13:50:02.806649 20299 Trainer.cpp:169] trainer mode: Normal
I0905 13:50:02.840154 20299 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-09-05 13:50:03,068 image_provider.py:52] Image size: 32
[INFO 2016-09-05 13:50:03,069 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-09-05 13:50:03,069 image_provider.py:58] DataProvider Initialization finished
I0905 13:50:03.069367 20299 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-09-05 13:50:03,069 image_provider.py:52] Image size: 32
[INFO 2016-09-05 13:50:03,069 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-09-05 13:50:03,070 image_provider.py:58] DataProvider Initialization finished
I0905 13:50:03.070163 20299 GradientMachine.cpp:134] Initing parameters..
I0905 13:50:03.706565 20299 GradientMachine.cpp:141] Init parameters done.
.........
I0905 13:50:44.860791 20299 TrainerInternal.cpp:162]  Batch=100 samples=12800 AvgCost=2.34658 CurrentCost=2.34658 Eval: classification_error_evaluator=0.825234  CurrentEval: classification_error_evaluator=0.825234
.........
I0905 13:50:52.310623 20299 TrainerInternal.cpp:162]  Batch=200 samples=25600 AvgCost=2.16351 CurrentCost=1.98044 Eval: classification_error_evaluator=0.782852  CurrentEval: classification_error_evaluator=0.740469
.........
I0905 13:50:59.720386 20299 TrainerInternal.cpp:162]  Batch=300 samples=38400 AvgCost=2.01217 CurrentCost=1.70949 Eval: classification_error_evaluator=0.743021  CurrentEval: classification_error_evaluator=0.663359
.........I0905 13:51:06.456202 20299 TrainerInternal.cpp:179]  Pass=0 Batch=391 samples=50048 AvgCost=1.89668 Eval: classification_error_evaluator=0.705
F0905 13:51:08.651224 20299 hl_cuda_cudnn.cc:779] Check failed: CUDNN_STATUS_SUCCESS == cudnnStat (0 vs. 5) Cudnn Error: CUDNN_STATUS_INVALID_VALUE
*** Check failure stack trace: ***
    @     0x7fc6006fddaa  (unknown)
    @     0x7fc6006fdce4  (unknown)
    @     0x7fc6006fd6e6  (unknown)
    @     0x7fc600700687  (unknown)
    @           0x8b4594  hl_convolution_forward()
    @           0x5b3fac  paddle::CudnnConvLayer::forward()
    @           0x62a01c  paddle::NeuralNetwork::forward()
    @           0x6bdaff  paddle::Tester::testOneBatch()
    @           0x6be412  paddle::Tester::testOnePeriod()
    @           0x6a28d4  paddle::Trainer::trainOnePass()
    @           0x6a5cc7  paddle::Trainer::train()
    @           0x5439f3  main
    @     0x7fc5ff909f45  (unknown)
    @           0x54efd5  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

Looks like an cuDNN issue about the function cudnnConvolutionForward() defined in Nvidia's cuDNN library. Might it is a wrapper issue?

@gangliao
Copy link
Contributor

gangliao commented Sep 5, 2016

GPU K20/40 works fine on this demo. Currently, this bug only appeared on GPU GTX, one of my colleagues already reproduced it and will solve it soon. Thanks.

@gangliao gangliao added the Bug label Sep 5, 2016
@gangliao gangliao reopened this Sep 5, 2016
@stoneyang
Copy link
Contributor Author

stoneyang commented Sep 5, 2016

Thanks for reopening this issue! @gangliao

Appended is the version of cuDNN: 5.0.5.

Hope it is helpful.

@hedaoyuan
Copy link
Contributor

Fixed #107, and closed.

@stoneyang
Copy link
Contributor Author

stoneyang commented Sep 30, 2016

Same issue as the original report still persists after new commits were fetched and re-make-ed. Any further suggestions? @gangliao @hedaoyuan

@hedaoyuan
Copy link
Contributor

@stoneyang Can you try adding GLOG_v=3 option on the command line(like this GLOG_v=3 paddle ...). This will print out more debug information.

@stoneyang
Copy link
Contributor Author

@hedaoyuan , Hi, thanks for your reply! Do you mean add GLOG_v=3 in train.sh under demo\image_classification? I tried this and Paddle gives me the following error log (seems a little different from the one in the original post):

I1004 10:36:08.486361  8402 Util.cpp:151] commandline: ../../build/paddle/trainer/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I1004 10:36:14.866709  8402 Util.cpp:126] Calling runInitFunctions
I1004 10:36:14.866930  8402 Util.cpp:139] Call runInitFunctions done.
I1004 10:36:14.875442  8402 TrainerConfigHelper.cpp:55] Parsing trainer config vgg_16_cifar.py
[INFO 2016-10-04 10:36:15,093 layers.py:1620] channels=3 size=3072
[INFO 2016-10-04 10:36:15,093 layers.py:1620] output size for __conv_0__ is 32
[INFO 2016-10-04 10:36:15,096 layers.py:1620] channels=64 size=65536
[INFO 2016-10-04 10:36:15,097 layers.py:1620] output size for __conv_1__ is 32
[INFO 2016-10-04 10:36:15,099 layers.py:1681] output size for __pool_0__ is 16*16
[INFO 2016-10-04 10:36:15,100 layers.py:1620] channels=64 size=16384
[INFO 2016-10-04 10:36:15,100 layers.py:1620] output size for __conv_2__ is 16
[INFO 2016-10-04 10:36:15,101 layers.py:1620] channels=128 size=32768
[INFO 2016-10-04 10:36:15,102 layers.py:1620] output size for __conv_3__ is 16
[INFO 2016-10-04 10:36:15,103 layers.py:1681] output size for __pool_1__ is 8*8
[INFO 2016-10-04 10:36:15,103 layers.py:1620] channels=128 size=8192
[INFO 2016-10-04 10:36:15,104 layers.py:1620] output size for __conv_4__ is 8
[INFO 2016-10-04 10:36:15,105 layers.py:1620] channels=256 size=16384
[INFO 2016-10-04 10:36:15,105 layers.py:1620] output size for __conv_5__ is 8
[INFO 2016-10-04 10:36:15,106 layers.py:1620] channels=256 size=16384
[INFO 2016-10-04 10:36:15,106 layers.py:1620] output size for __conv_6__ is 8
[INFO 2016-10-04 10:36:15,108 layers.py:1681] output size for __pool_2__ is 4*4
[INFO 2016-10-04 10:36:15,108 layers.py:1620] channels=256 size=4096
[INFO 2016-10-04 10:36:15,108 layers.py:1620] output size for __conv_7__ is 4
[INFO 2016-10-04 10:36:15,110 layers.py:1620] channels=512 size=8192
[INFO 2016-10-04 10:36:15,110 layers.py:1620] output size for __conv_8__ is 4
[INFO 2016-10-04 10:36:15,111 layers.py:1620] channels=512 size=8192
[INFO 2016-10-04 10:36:15,111 layers.py:1620] output size for __conv_9__ is 4
[INFO 2016-10-04 10:36:15,112 layers.py:1681] output size for __pool_3__ is 2*2
[INFO 2016-10-04 10:36:15,113 layers.py:1681] output size for __pool_4__ is 1*1
[INFO 2016-10-04 10:36:15,115 networks.py:1125] The input order is [image, label]
[INFO 2016-10-04 10:36:15,115 networks.py:1132] The output order is [__cost_0__]
I1004 10:36:15.138290  8402 Trainer.cpp:170] trainer mode: Normal
F1004 10:36:15.139696  8402 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
    @     0x7fdd6ccaedaa  (unknown)
    @     0x7fdd6ccaece4  (unknown)
    @     0x7fdd6ccae6e6  (unknown)
    @     0x7fdd6ccb1687  (unknown)
    @           0x78ffa9  hl_gpu_apply_unary_op<>()
    @           0x758d2f  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x758919  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x742e9f  paddle::BaseMatrixT<>::zero()
    @           0x62c94e  paddle::Parameter::enableType()
    @           0x628e0c  paddle::parameterInitNN()
    @           0x62b27b  paddle::NeuralNetwork::init()
    @           0x630e93  paddle::GradientMachine::create()
    @           0x6ad345  paddle::TrainerInternal::init()
    @           0x6a9727  paddle::Trainer::init()
    @           0x542a65  main
    @     0x7fdd6bebaf45  (unknown)
    @           0x54e355  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)
No data to plot. Exiting!

@gangliao
Copy link
Contributor

gangliao commented Oct 8, 2016

@stoneyang fixed Fix CUDA_VERSION Comparsion #165

@stoneyang
Copy link
Contributor Author

@gangliao , seems perfect now!

junjun315 added a commit that referenced this issue Apr 3, 2019
zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this issue Sep 25, 2019
qingqing01 pushed a commit to qingqing01/Paddle that referenced this issue Apr 30, 2020
joey12300 pushed a commit to joey12300/Paddle that referenced this issue Nov 2, 2020
…backward

Merge pull request PaddlePaddle#18 from joey12300/add-lstm-cudnn_cpu_backward
paddle-bot-old bot pushed a commit that referenced this issue Jul 11, 2021
speed up graph neighbors sampling
thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Aug 19, 2021
* compiler test pr

* Init compiler dir.

* init version for piano_op_search_and_replace_pass

* save status, compile ok

* add some test for piano_search_and_replace_pass

* basic test case is ok

* pass unittest

* piano_search_and_replace_pass to piano_op_search_and_replace_pass

* add some comments for piano_op_search_and_replace_pass

* add links between cluster_inputs/cluster_outputs and compiled_cluster

* use const& cluster_inputs instead of * p_cluster_inputs

* add some comments and encapsulate some sub functions

* rename piano_op_search_and_replace_pass to piano_compile_pass

* a small modify in comments

Co-authored-by: Zhen Wang <[email protected]>
thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Oct 28, 2021
wangxicoding pushed a commit to wangxicoding/Paddle that referenced this issue Dec 9, 2021
zhoutianzi666 pushed a commit to zhoutianzi666/Paddle that referenced this issue May 23, 2022
Thunderbrook pushed a commit to Thunderbrook/Paddle that referenced this issue Jun 7, 2022
* Optimize memory overhead for gpugraph.

* Optimize memory overhead for gpugraph.
jack603047588 referenced this issue in jack603047588/Paddle Nov 9, 2022
jack603047588 referenced this issue in jack603047588/Paddle Nov 9, 2022
qizhaoaoe pushed a commit to qizhaoaoe/Paddle that referenced this issue Mar 3, 2023
delete update notes from README.md
zyfncg pushed a commit to zyfncg/Paddle that referenced this issue Aug 22, 2023
[DRR] Refactor ther drr pattern class
zmxdream pushed a commit to zmxdream/Paddle that referenced this issue Nov 4, 2023
lizexu123 pushed a commit to lizexu123/Paddle that referenced this issue Feb 23, 2024
hanhaowen-mt pushed a commit to hanhaowen-mt/Paddle that referenced this issue Feb 29, 2024
[MTAI-484] fix(build): change commit id in eigen.cmake
NKNaN pushed a commit to NKNaN/Paddle that referenced this issue Mar 3, 2024
wwbitejotunn pushed a commit to wwbitejotunn/Paddle that referenced this issue Jun 21, 2024
Slightly improve installation process
wwbitejotunn pushed a commit to wwbitejotunn/Paddle that referenced this issue Jun 21, 2024
WAYKEN-TSE pushed a commit to WAYKEN-TSE/Paddle that referenced this issue Dec 6, 2024
add gatherwithgrad && delete extra comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants