
Fix compatibility checks for 18.04 container #4

Merged: 1 commit merged into master from amp_compat_fix on May 23, 2018

Conversation
@cbcase (Contributor) commented May 23, 2018

Our 18.04 container is weird -- it's in kind of an intermediate state of the tensor/variable merge. This change handles that state better.

@mcarilli merged commit 1737ce1 into master May 23, 2018
@cbcase deleted the amp_compat_fix branch July 11, 2018 00:15
@jinserk mentioned this pull request Aug 30, 2018
rohithkrn pushed a commit to rohithkrn/apex that referenced this pull request May 8, 2020
* fix dropout scaling from p to 1/(1-p) (NVIDIA#816)

Co-authored-by: Sukru Eryilmaz <[email protected]>
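For context on the fix above: with inverted dropout, the kept activations are scaled by 1/(1-p), where p is the drop probability, not by p, so the output keeps the input's expected value. A minimal CUDA sketch of the corrected scaling; names are illustrative and this is not apex's actual kernel:

```cuda
// Minimal sketch of inverted-dropout scaling (illustrative, not apex's kernel).
// Kept elements are scaled by 1/(1-p) so the expected value is preserved.
#include <cuda_runtime.h>
#include <curand_kernel.h>

__global__ void dropout_fwd(const float* __restrict__ in,
                            float* __restrict__ out,
                            int n, float p, unsigned long long seed)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    curandStatePhilox4_32_10_t state;
    curand_init(seed, i, 0, &state);

    const float keep_scale = 1.0f / (1.0f - p);  // the corrected scale
    bool keep = curand_uniform(&state) >= p;     // drop with probability p
    out[i] = keep ? in[i] * keep_scale : 0.0f;
}
```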

* Improvements to apex.mlp (NVIDIA#804)

* update fused bias relu backward kernel

* add support for not requiring first-layer dgrad

* fix bug: wrong layer used for requires_grad

* add infrastructure for optional bias and activation; currently only supports no bias and no ReLU

* make bias and ReLU separately optional

* add sigmoid activation option
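The optional-bias/optional-activation infrastructure can be pictured as compile-time kernel variants selected by template flags. A hypothetical CUDA sketch of the idea, not apex.mlp's actual implementation:

```cuda
// Hypothetical sketch: bias and activation compiled in or out via template
// flags, so each variant pays no runtime branch cost. Not apex.mlp's kernel.
#include <cuda_runtime.h>

template <bool HasBias, bool UseRelu, bool UseSigmoid>
__global__ void bias_act_fwd(const float* __restrict__ in,
                             const float* __restrict__ bias,
                             float* __restrict__ out,
                             int rows, int cols)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= rows * cols) return;

    float v = in[idx];
    if (HasBias)    v += bias[idx % cols];         // optional per-feature bias
    if (UseRelu)    v = fmaxf(v, 0.0f);            // optional ReLU
    if (UseSigmoid) v = 1.0f / (1.0f + expf(-v));  // optional sigmoid
    out[idx] = v;
}

// The host picks the instantiation, e.g. bias + sigmoid and no ReLU:
//   bias_act_fwd<true, false, true><<<blocks, threads>>>(in, b, out, rows, cols);
```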

* enable wider load/store for multi_tensor_apply kernels (NVIDIA#763)

* modify MTA axpby for wider load/store

* Make the scale/axpby/l2/adam/lamb multi_tensor kernels use wider loads
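The "wider load/store" change boils down to moving 16 bytes (one float4) per thread per memory transaction instead of 4 bytes, cutting the number of transactions. A simplified CUDA sketch of the idea for an axpby-style kernel; the real multi_tensor_apply kernels also handle tensor lists and unaligned tails:

```cuda
// Simplified sketch of wide loads for axpby: each thread reads and writes one
// float4 (16 bytes) instead of one float. Illustrative only.
#include <cuda_runtime.h>

__global__ void axpby_vec4(const float4* __restrict__ x,
                           const float4* __restrict__ y,
                           float4* __restrict__ out,
                           int n4, float a, float b)   // n4 = n / 4
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n4) return;

    float4 xv = x[i];              // one 16-byte load
    float4 yv = y[i];
    float4 o;
    o.x = a * xv.x + b * yv.x;
    o.y = a * xv.y + b * yv.y;
    o.z = a * xv.z + b * yv.z;
    o.w = a * xv.w + b * yv.w;
    out[i] = o;                    // one 16-byte store
}
```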

* Changes to make xentropysoftmax load/store vectorized when possible: (NVIDIA#725)

* Changes to make xentropysoftmax load/store vectorized when possible:
Increase the default ILP so that each thread handles 16 bytes of data in one step
Make each thread load/store the longest vector possible
Make the unroll case handle adjacent data instead of strided data, so the access order matches the vector case

* Add a shift for the unaligned case; remove accesses aligned to less than 16 bytes

Co-authored-by: Burc Eryilmaz <[email protected]>
Co-authored-by: Sukru Eryilmaz <[email protected]>
Co-authored-by: Deyu Fu <[email protected]>
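The alignment handling described in the xentropysoftmax commits can be sketched as: process a short scalar prefix (the "shift") until the pointer reaches a 16-byte boundary, run the vectorized path over the aligned body, then finish any tail scalar-wise. A hypothetical CUDA sketch, not the actual xentropy kernel:

```cuda
// Hypothetical sketch of shift-then-vectorize alignment handling.
#include <cstdint>
#include <cuda_runtime.h>

// Number of leading floats to handle scalar-wise so the rest is 16B-aligned.
__device__ __forceinline__ int align_shift_elems(const float* p)
{
    uintptr_t addr = reinterpret_cast<uintptr_t>(p);
    return static_cast<int>(((16 - (addr & 15)) & 15) / sizeof(float));
}

__global__ void scale_with_shift(float* data, int n, float s)
{
    int shift = align_shift_elems(data);
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;

    // Scalar prefix: at most 3 elements before the aligned boundary.
    if (tid < shift && tid < n) data[tid] *= s;

    // Vectorized body over the aligned region, float4 at a time.
    float4* body = reinterpret_cast<float4*>(data + shift);
    int n4 = (n - shift) / 4;
    for (int i = tid; i < n4; i += stride) {
        float4 v = body[i];
        v.x *= s; v.y *= s; v.z *= s; v.w *= s;
        body[i] = v;
    }

    // Scalar tail for the leftover elements after the vector body.
    for (int i = shift + n4 * 4 + tid; i < n; i += stride)
        data[i] *= s;
}
```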
thorjohnsen pushed a commit that referenced this pull request Sep 15, 2020
Accept custom (layer type:param name) to include in sparse_parameter …
ptrblck pushed a commit that referenced this pull request Jul 17, 2021
* Added support for fused ReLU and dropout into transducer joint

* Reorganized code selection path in transducer joint fwd
* Added support for fused ReLU+dropout into transducer joint

* Vectorize transducer loss backward with fused softmax (#3)

* Nanz/transducer loss (#4)

* Vectorize transducer loss backward with fused softmax

* Added a predicate to avoid a potential IMA (illegal memory access)

* Nanz/transducer loss (#5)

* Vectorize transducer loss backward with fused softmax

* Added a predicate to avoid a potential IMA

* Added more predicates to avoid IMAs

* Updated documentation for the newly added features.

* Fixed an error in transducer.py
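Two recurring ideas in this commit, fusing ReLU and dropout into one pass over the joint output, and predicating every memory access on a bounds check so out-of-range threads never touch memory (the IMA fixes), can be sketched together. A hypothetical CUDA kernel; the real transducer joint kernels are considerably more involved:

```cuda
// Hypothetical sketch: fused ReLU + inverted dropout epilogue, with an
// explicit bounds predicate guarding all memory accesses.
#include <cuda_runtime.h>
#include <curand_kernel.h>

__global__ void joint_relu_dropout(const float* __restrict__ joint,
                                   float* __restrict__ out,
                                   int n, float p, unsigned long long seed)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // predicate: prevents the illegal memory access

    curandStatePhilox4_32_10_t state;
    curand_init(seed, i, 0, &state);

    float v = fmaxf(joint[i], 0.0f);          // fused ReLU
    bool keep = curand_uniform(&state) >= p;  // fused dropout
    out[i] = keep ? v / (1.0f - p) : 0.0f;    // inverted-dropout scaling
}
```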