
Some questions about DataParallel #33

Open
WFLiu0327 opened this issue May 6, 2023 · 2 comments
@WFLiu0327

When I try to train this model on two GPUs, an error occurs at line 117 of models/maniqa.py, `x = torch.cat((x6, x7, x8, x9), dim=2)`: some of the tensors are on cuda:0 and some on cuda:1. How should this be solved? I haven't found a suitable solution online.

@TianheWu
Collaborator

Hi, I know about this issue.
It is caused by the implementation: the model currently only supports training on a single GPU.
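
As a stopgap, a standard PyTorch workaround (an assumption on my part, not something stated in this thread) is to make only one GPU visible to the process before CUDA is initialized, so `nn.DataParallel` never splits the batch:

```python
# Workaround: expose a single GPU to the process.
# Must run before `import torch` (or at least before any CUDA call).
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```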

@TianheWu
Collaborator

One solution is to return the output of each layer directly from the forward pass, instead of collecting it with the SaveOut module, as in the sketch below.
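
A minimal sketch of that idea, assuming the timm `VisionTransformer` interface the repo builds on (`patch_embed`, `cls_token`, `pos_embed`, `pos_drop`, `blocks`); the wrapper name `ViTIntermediate` and the hard-coded block indices are illustrative, not part of the repo's API:

```python
import torch
import timm
from torch import nn


class ViTIntermediate(nn.Module):
    """Run the ViT blocks manually and return the intermediate features
    from forward(), so that under nn.DataParallel each replica keeps its
    tensors on its own device (no shared hook list across GPUs)."""

    def __init__(self, block_ids=(6, 7, 8, 9)):
        super().__init__()
        self.vit = timm.create_model('vit_base_patch8_224', pretrained=True)
        self.block_ids = set(block_ids)

    def forward(self, x):
        # Reproduce the ViT stem: patch embedding, CLS token, position embedding.
        x = self.vit.patch_embed(x)
        cls_token = self.vit.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat((cls_token, x), dim=1)
        x = self.vit.pos_drop(x + self.vit.pos_embed)

        feats = []
        for i, blk in enumerate(self.vit.blocks):
            x = blk(x)
            if i in self.block_ids:
                feats.append(x[:, 1:])  # drop the CLS token, as MANIQA does

        # Same concatenation as models/maniqa.py line 117, but every tensor
        # here was produced on this replica's own device.
        return torch.cat(feats, dim=2)
```

With the features returned from `forward()`, `nn.DataParallel` can scatter the input and gather the single return value per replica as usual; the hook-based approach breaks because all replicas append into one shared Python list, mixing tensors from different devices.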
