Move workspace memory-allocation to PyTorch #661

RezaYazdaniAminabadi · 2021-01-12T05:10:21Z

No description provided.

eltonzheng · 2021-01-12T18:03:37Z

csrc/includes/context.h

@@ -64,17 +64,18 @@ class Context {
        return _ctx;
    }

-    void GenWorkSpace(size_t size)
+    void GenWorkSpace(void* workspace)  // (size_t size)


Rename it to SetWorkSpace()?

Thanks Elton, will do! :)

eltonzheng · 2021-01-12T18:04:04Z

csrc/includes/context.h

            assert(_workspace == nullptr);
            cudaMalloc(&_workspace, size);
        } else if (_workSpaceSize < size) {
            cudaFree(_workspace);
            cudaMalloc(&_workspace, size);
        }
-
-        _workSpaceSize = size;
+        _workSpaceSize = size;*/


Can _workSpaceSize be deleted from the Context class?

Yes, agreed!

eltonzheng · 2021-01-12T18:04:43Z

csrc/includes/context.h

-        if (!_workspace) {
+        if (!workspace) { throw std::runtime_error("Workspace is null."); }
+        _workspace = workspace;
+        /*if (!_workspace) {


Remove the commented-out code if it is no longer needed.

I only commented it out to verify that the new approach works properly. After some tests I will remove the commented-out code.
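A minimal sketch of where this thread and the two above it are heading, assuming the rename to SetWorkSpace() is applied, _workSpaceSize is dropped, and the commented-out cudaMalloc/cudaFree branch is removed. The class below is a trimmed-down, hypothetical stand-in and not the actual contents of csrc/includes/context.h; the GetWorkSpace() accessor and the function-local static singleton are assumptions made so the sketch is self-contained.

```cpp
#include <stdexcept>

// Hypothetical, trimmed-down stand-in for the Context class discussed above.
class Context {
public:
    static Context& Instance()
    {
        // Assumption: a function-local static is used here only for brevity.
        static Context ctx;
        return ctx;
    }

    // The caller owns the buffer (for example, a tensor allocated by PyTorch).
    // Context only stores the pointer; it no longer allocates or frees memory,
    // so no size bookkeeping is needed.
    void SetWorkSpace(void* workspace)
    {
        if (!workspace) { throw std::runtime_error("Workspace is null."); }
        _workspace = workspace;
    }

    // Assumed accessor so kernels can retrieve the stored pointer.
    void* GetWorkSpace() const { return _workspace; }

private:
    Context() : _workspace(nullptr) {}

    void* _workspace;
};
```

With this shape, a null pointer is rejected before it replaces the stored one, so a failed call leaves the previously registered workspace untouched.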

eltonzheng · 2021-01-12T18:05:01Z

csrc/transformer/ds_transformer_cuda.cpp

-                                                           _training,
-                                                           _gelu_checkpoint));
-
+    // Context::Instance().GenWorkSpace(get_workspace_size<T>(_batch_size,


eltonzheng · 2021-01-12T18:09:24Z

csrc/transformer/ds_transformer_cuda.cpp

+                                                         layer->GetNumHeads(),
+                                                         layer->IsTrainingMode(),
+                                                         layer->GeluCheckpoint())},
+                                  options);


Can we use g_output.options() here instead of creating a new option?

good point 👍

conglongli

Tested BERT with sequence length 512 and it worked well. I think it is good to go after applying Elton's comments.

move workspace memory-allocation to PyTorch

f1ad30b

RezaYazdaniAminabadi requested review from arashashari, awan-10, cli99, conglongli, eltonzheng, jeffra, minjiaz, niumanar, samyam, ShadenSmith and tjruwase as code owners January 12, 2021 05:10

conglongli marked this pull request as draft January 12, 2021 05:26

eltonzheng reviewed Jan 12, 2021

conglongli marked this pull request as ready for review January 12, 2021 21:48

conglongli approved these changes Jan 12, 2021

Reza Yazdani added 3 commits January 13, 2021 01:42

refine the code based on the comments

6223ce5

remove unnecessary options

d0036c3

remove bsz from set_seq_len function

2378f43

eltonzheng approved these changes Jan 13, 2021

Merge branch 'master' into reyazda/pytorch-workspace-allocate

e0112df

RezaYazdaniAminabadi merged commit 981bc7d into master Jan 13, 2021

bobisapotato mentioned this pull request Jan 24, 2021

Another thing to merge. (MY EYES HURT) bobisai/DeepSpeed#1

Merged

conglongli mentioned this pull request Feb 18, 2021

Fix transformer kernel CUDA illegal memory access error #765

Merged
