speed up LSTM with DNI #191
Comments
We received an email about this feature request. The speedup is not even the most compelling part: DNI also makes it possible to train a very, very large model across multiple devices. Related to #141. How to implement it in the current Paddle framework is still under consideration, though. In the paper, the datasets are quite small (CIFAR, MNIST). Does this algorithm still work for a larger, more complex dataset with a lot of noise, where gradient/error prediction could be very difficult? @pengli09 on our team is also very interested in this method.
This was a long time ago; feel free to reopen it.
I find the method described in DeepMind's DNI paper promising for speeding up LSTMs. It shows that unrolling the LSTM and decoupling the segments with DNI can make training 2x faster. Can we implement it in Paddle?
The paper is Decoupled Neural Interfaces using Synthetic Gradients.
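To make the mechanics concrete, here is a minimal runnable sketch of the idea in PyTorch-style Python, purely for illustration (nothing below is Paddle API; the `lstm`, `head`, and `dni` modules and all sizes are hypothetical). The unrolled LSTM is split into truncated segments; each segment backpropagates its own loss plus a synthetic gradient `M(h)` injected at its last hidden state, so its update does not have to wait for the rest of the sequence, and `M` is regressed onto the bootstrapped gradient that actually arrives at each segment boundary. For simplicity, only the hidden state (not the cell state) receives a synthetic gradient here.

```python
import torch
import torch.nn as nn

T, SEG, IN, H = 20, 5, 8, 16          # toy sizes (hypothetical)

lstm = nn.LSTMCell(IN, H)
head = nn.Linear(H, 1)                 # task head: toy per-step regression
dni = nn.Linear(H, H)                  # M: hidden state -> predicted dL/dh

opt = torch.optim.SGD(
    list(lstm.parameters()) + list(head.parameters()) + list(dni.parameters()),
    lr=0.01)

x = torch.randn(T, 1, IN)              # toy data
y = torch.randn(T, 1, 1)

h, c = torch.zeros(1, H), torch.zeros(1, H)

for s in range(0, T, SEG):
    # Cut the graph at the segment boundary; h0 is a leaf, so the true
    # gradient arriving at this boundary lands in h0.grad.
    h = h.detach().requires_grad_(True)
    c = c.detach()
    h0, loss = h, torch.zeros(())
    for t in range(s, s + SEG):
        h, c = lstm(x[t], (h, c))
        loss = loss + ((head(h) - y[t]) ** 2).mean()

    sg = dni(h.detach()).detach()      # synthetic stand-in for future gradient
    opt.zero_grad()
    # Backprop the segment loss plus the injected synthetic gradient at the
    # last hidden state: the segment updates without waiting for the future.
    torch.autograd.backward([loss, h], [torch.ones_like(loss), sg])
    if s > 0:
        # Train M: its prediction at this boundary should match the
        # (bootstrapped) gradient that just arrived in h0.grad.
        ((dni(h0.detach()) - h0.grad.detach()) ** 2).mean().backward()
    opt.step()
```

Because each segment's dependence on the future is captured entirely by `M`, the segments could in principle run on different devices, which is the multi-device decoupling mentioned in the comment above; the open question raised there is how well `M`'s gradient prediction holds up beyond the paper's small benchmarks (CIFAR, MNIST) on larger, noisier data.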