Commit
* zero++ tutorial PR (#3783)
* [Fix] _conv_flops_compute when padding is a str and stride=1 (#3169)
* fix conv_flops_compute when padding is a str when stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
---------
Co-authored-by: Cheng Li <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
* fix interpolate flops compute (#3782)
* use `Flops Profiler` to test `model.generate()` (#2515)
* Update profiler.py
* pre-commit run --all-files
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
---------
Co-authored-by: Jeff Rasley <[email protected]>
Co-authored-by: Cheng Li <[email protected]>
* revert PR #3611 (#3786)
* bump to 0.9.6
* ZeRO++ chinese blog (#3793)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* remove staging trigger (#3792)
* DeepSpeed-Triton for Inference (#3748)
Co-authored-by: Stephen Youn <[email protected]>
Co-authored-by: Arash Bakhtiari <[email protected]>
Co-authored-by: Cheng Li <[email protected]>
Co-authored-by: Ethan Doe <[email protected]>
Co-authored-by: yidoe <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
* ZeRO++ (#3784)
Co-authored-by: HeyangQin <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
Co-authored-by: cmikeh2 <[email protected]>
Co-authored-by: Ammar Ahmad Awan <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
Co-authored-by: Michael Wyatt <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Reza Yazdani <[email protected]>
* adding zero++ to navigation panel of deepspeed.ai (#3796)
* Add ZeRO++ Japanese blog (#3797)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <[email protected]>
Co-authored-by: Conglong Li <[email protected]>
* Bug Fixes for autotuner and flops profiler (#1880)
* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <[email protected]>
* Missing strided copy for gated MLP (#3788)
Co-authored-by: Ammar Ahmad Awan <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
* Requires grad checking. (#3789)
Co-authored-by: Jeff Rasley <[email protected]>
* bump to 0.10.0
* Fix Bug in transform.cu (#3534)
* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <[email protected]>
* bug fix: triton importing error (#3799)
Co-authored-by: Stephen Youn <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
* init commit for mixed precision lora
* fix format
* patch _allgather_params & minor fixes
* make sure initial quantization are finished
* make sure dequantization is finished
* skip quantization for small parameters
* fix format
* remove unused async_op
* lazy load of quantizer kernels
* add mixed precision lora tutorial
* cleanup mics
* cleanup mics
* replace get_accelerator().current_device()
* add kwargs to mics
* fix format
* seperate code and tutorial
* fix _all_gather in zero3
---------
Co-authored-by: Bill Luo <[email protected]>
Co-authored-by: Cheng Li <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Guorun <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
Co-authored-by: stephen youn <[email protected]>
Co-authored-by: Stephen Youn <[email protected]>
Co-authored-by: Arash Bakhtiari <[email protected]>
Co-authored-by: Ethan Doe <[email protected]>
Co-authored-by: yidoe <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
Co-authored-by: cmikeh2 <[email protected]>
Co-authored-by: Ammar Ahmad Awan <[email protected]>
Co-authored-by: Michael Wyatt <[email protected]>
Co-authored-by: Reza Yazdani <[email protected]>
Co-authored-by: Masahiro Tanaka <[email protected]>
Co-authored-by: Conglong Li <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Joe Mayer <[email protected]>
Co-authored-by: Ramya Ramineni <[email protected]>