
Fixing file permissions #1

Merged 2 commits into master on Feb 3, 2020

Conversation

ShadenSmith (Contributor)

No description provided.

@ShadenSmith ShadenSmith changed the title Adding executable perms to install.sh Fixing file permissions Feb 3, 2020
@ShadenSmith ShadenSmith merged commit b18eae2 into master Feb 3, 2020
@ShadenSmith ShadenSmith deleted the install_perms branch February 3, 2020 18:55
kouml pushed a commit to kouml/DeepSpeed that referenced this pull request Apr 3, 2020
Fixing file permissions.
arashashari added a commit that referenced this pull request Sep 4, 2020
ShadenSmith referenced this pull request in ShadenSmith/DeepSpeed Sep 10, 2020
* Tied module indexing bugfix.

* Train and inference pipeline schedules.

* Move code quality tests to Azure-hosted agents. (microsoft#368)
cli99 added a commit that referenced this pull request Jan 13, 2021
Co-authored-by: Cheng Li <[email protected]>

Co-authored-by: Jeff Rasley <[email protected]>
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request Apr 28, 2021
…cript

Added ds_train_bert_bsz32k_seq512_pipeclean.sh
liamcli referenced this pull request in determined-ai/DeepSpeed Sep 27, 2021
Fix all Pipeline Module Parameters being sent to cuda:0
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request Oct 14, 2022
* threaded tf_dl+presplit sentences+shuffled dataset with resume

* elaborate in readme
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request Oct 14, 2022
Megatron + DeepSpeed + Pipeline Parallelism
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request Oct 14, 2022
* Enable Megatron-LM workload on ROCm (microsoft#1)

* Enable Megatron workload on ROCm

* Added ds_pretrain_gpt_350M_dense_pipeclean.sh

* removed a file

* Removed an extra line

* Fix to resolve the below rsqrtf() error on ROCm

/root/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_hip_kernel.hip:298:10: error: no matching function for call to 'rsqrtf'
  return rsqrtf(v);
         ^~~~~~
/opt/rocm-5.2.0/llvm/lib/clang/14.0.0/include/__clang_hip_math.h:521:7: note: candidate function not viable: call to __device__ function from __host__ function
float rsqrtf(float __x) { return __ocml_rsqrt_f32(__x); }
      ^

* Simplified code

* Simplified the code

* Removed extra spaces
guoyejun pushed a commit to guoyejun/DeepSpeed that referenced this pull request Nov 10, 2022
don't gather partitioned activations for mp size 1 (microsoft#2454)
loadams pushed a commit that referenced this pull request Mar 6, 2024
* Add workspace capability to DSKernel

* Add to injection pipeline

* Validated
loadams pushed a commit that referenced this pull request Mar 6, 2024
* Initialize the fp6-quant-kernel integration.
* Add necessary parameters of kernel interfaces and the linear layer selection logic.
* upload kernel code
* The simple script for debugging.
* fix typo
* update
* fix split k
* Fix some errors and add test case.
* Workspace for Inference Kernels (#1)
* Add transform_param functions and update format.
* kernel debug
* fix include
* Update core_ops.cpp
* Add split k support
* fix
* Fix kernel error
* update
* update
* Fix rebase errors.
* Add missed include.
* Fix the bug that the attribute uses the weight information for mem alloc.
* Avoid GPU preallocation during weight loading.
* Add support of larger shapes for gated activation kernel.
* update
* model update
* fix all weight preprocessing
* Add split-k heuristic.
* Avoid reading scale attribute on non-quantized tensors.
* Change the scales from attributes to new tensors. Provide the end-to-end script given HuggingFace model id.
* Hard-coded (commented out) the scales in the kernel to work around the bug.
* Support the user config for quantization. Fix kernel bug.
* Per operator test functions.
* Multiply scales by 1e12 according to the kernel design.
* Revert "Workspace for Inference Kernels (#1)". This reverts commit 1528732.
* Remove the format-only changes.
* Put the quantization into the transform_param function.

---------

Co-authored-by: Shiyang Chen <[email protected]>
Co-authored-by: Haojun Xia <[email protected]>