Fix GPU OOM for mistral.py::Mask4DTestHard #31212
Conversation
if self.__class__._model is None:
    self.__class__._model = MistralForCausalLM.from_pretrained(
        self.model_name, torch_dtype=self.model_dtype
    ).to(torch_device)
return self.__class__._model
We need to modify the class attribute instead of self._model. Without this, using just @cached_property still hits GPU OOM.
This is not ideal, but since these are slow tests meant to be run in a single process, we can accept this somewhat hacky solution to avoid GPU OOM.
Do we even need @cached_property here?
No :-), you have a very good 👁️! I will remove it. (Actually, after this PR, I started to doubt whether cached_property works well with tests!)
-        self.model_dtype = torch.float32
+        self.model_dtype = torch.float16
         self.tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
         self.model = MistralForCausalLM.from_pretrained(model_name, torch_dtype=self.model_dtype).to(torch_device)
This change is also necessary; otherwise we still hit GPU OOM.
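A back-of-the-envelope calculation shows why the dtype change matters: float16 uses 2 bytes per parameter versus 4 for float32, so the weights alone take half the memory. The ~7B parameter count below is an assumption for illustration (Mistral-7B), and the estimate ignores activations and the KV cache.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory taken by model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumed parameter count for a Mistral-7B-sized model.
n_params = 7_000_000_000

fp32 = weight_memory_gib(n_params, 4)  # torch.float32: 4 bytes/param
fp16 = weight_memory_gib(n_params, 2)  # torch.float16: 2 bytes/param

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB")
```

Under these assumptions that is roughly 26 GiB in fp32 versus 13 GiB in fp16, which is easily the difference between fitting on a single test GPU and OOMing.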
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for fixing!
* build
* build
* build
* build

Co-authored-by: ydshieh <[email protected]>
What does this PR do?

Fix GPU OOM for mistral.py::Mask4DTestHard.

Job run page: https://github.com/huggingface/transformers/actions/runs/9343352211/job/25712939733