
Error handling for accelerator="mps" and ddp strategy pairing #16153

Merged · 33 commits · Jan 12, 2023
Changes shown from the first 7 commits.

Commits (33)
d5f4a56  Added MisconfigurationException for incorrect accelerator/strategy pa…  (Dec 21, 2022)
6332a42  Improved check and added test  (Dec 21, 2022)
daa3599  updated test  (Dec 21, 2022)
2aec459  modified test to correct MPS  (Dec 21, 2022)
d8dc01b  [pre-commit.ci] auto fixes from pre-commit.com hooks  (pre-commit-ci[bot], Dec 21, 2022)
b4c5087  decorated with RunIf(mps=True)  (Dec 21, 2022)
9f8e4f8  Apply suggestions from code review  (Borda, Dec 21, 2022)
1e6b914  Apply suggestions from code review  (Borda, Dec 21, 2022)
511557a  [pre-commit.ci] auto fixes from pre-commit.com hooks  (pre-commit-ci[bot], Dec 21, 2022)
a0370de  Apply suggestions from code review  (Borda, Dec 21, 2022)
d10250c  Merge branch 'master' into shenoy/mps-distributed-check  (justusschock, Dec 21, 2022)
35a37e5  Added edge cases, improved error message and improved tests  (Dec 21, 2022)
3db458c  Added hpu, tpu, ipu false backend condition  (Dec 21, 2022)
eca8ad8  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 21, 2022)
38a2b17  Merge branch 'Lightning-AI:master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 21, 2022)
10e6d75  commented out gpu mps setting  (Dec 21, 2022)
c3296e4  checking if removing not tpu not hpu checks work  (Dec 21, 2022)
a3d3028  fixed comments  (Dec 21, 2022)
603c8f4  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 22, 2022)
90827dc  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 22, 2022)
91e2746  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 24, 2022)
8ccfbdd  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Dec 26, 2022)
424b07c  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Jan 1, 2023)
58289db  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Jan 3, 2023)
4151e74  Merge branch 'master' into shenoy/mps-distributed-check  (shenoynikhil, Jan 4, 2023)
f5fdcd3  handle all edge cases and adapt connector tests  (awaelchli, Jan 11, 2023)
9124343  parametrize test  (awaelchli, Jan 11, 2023)
891017d  update tests  (awaelchli, Jan 11, 2023)
728929c  [pre-commit.ci] auto fixes from pre-commit.com hooks  (pre-commit-ci[bot], Jan 11, 2023)
0877766  Merge branch 'master' into shenoy/mps-distributed-check  (awaelchli, Jan 11, 2023)
c47c6be  simplification  (awaelchli, Jan 11, 2023)
7de3e63  drop special case of custom strategy  (awaelchli, Jan 12, 2023)
fdcb87f  add edge case for deepspeed  (awaelchli, Jan 12, 2023)
@@ -283,6 +283,25 @@ def _check_config_and_set_final_flags(
f" Available names are: {', '.join(self._accelerator_types)}."
)

# mps accelerator incompatible with ddp-family
ddp_classes = (
    DDPStrategy,
    DDPSpawnShardedStrategy,
    DDPShardedStrategy,
    DDPFullyShardedNativeStrategy,
    DDPFullyShardedStrategy,
    DDPSpawnStrategy,
)
is_ddp_str = isinstance(strategy, str) and "ddp" in strategy
if (
    accelerator is not None
    and strategy is not None
    and (isinstance(accelerator, str) and accelerator == "mps")
    and (is_ddp_str or isinstance(strategy, ddp_classes))
):
    raise MisconfigurationException("With `accelerator=mps`, strategies from DDP Family are not compatible")

self._accelerator_flag = accelerator

if precision is not None:
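For context, a minimal usage sketch (not part of the diff) of how the new check surfaces to a user; it assumes the environment accepts the `mps` accelerator flag up to this validation step:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.utilities.exceptions import MisconfigurationException

try:
    # Any DDP-family strategy, passed as a string or an instance, now fails
    # fast during Trainer construction instead of erroring later at runtime.
    Trainer(accelerator="mps", strategy="ddp")
except MisconfigurationException as err:
    print(err)  # With `accelerator=mps`, strategies from DDP Family are not compatible
```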
@@ -369,6 +369,33 @@ def test_exception_invalid_strategy():
Trainer(strategy="tpu_spawn")


@pytest.mark.parametrize(
    ["strategy", "strategy_class"],
    (
        ("ddp_spawn", DDPSpawnStrategy),
        ("ddp_spawn_find_unused_parameters_false", DDPSpawnStrategy),
        ("ddp", DDPStrategy),
        ("ddp_find_unused_parameters_false", DDPStrategy),
        ("dp", DataParallelStrategy),
        ("ddp_sharded", DDPShardedStrategy),
        ("ddp_sharded_spawn", DDPSpawnShardedStrategy),
    ),
)
@RunIf(mps=True)
def test_exception_invalid_ddp_strategy_with_mps(strategy, strategy_class):
    # check with str
    with pytest.raises(
        MisconfigurationException, match=r"With `accelerator=mps`, strategies from DDP Family are not compatible"
    ):
        Trainer(accelerator="mps", strategy=strategy)

    # check with instance
    with pytest.raises(
        MisconfigurationException, match=r"With `accelerator=mps`, strategies from DDP Family are not compatible"
    ):
        Trainer(accelerator="mps", strategy=strategy_class())


@pytest.mark.parametrize(
    ["strategy", "strategy_class"],
    [
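As a closing illustration, a hypothetical user-side pattern (not part of this PR) for choosing a compatible strategy, assuming `torch.backends.mps.is_available()` as the capability check:

```python
import torch
from pytorch_lightning import Trainer

if torch.backends.mps.is_available():
    # MPS runs on a single Apple-silicon device; DDP-family strategies are rejected.
    trainer = Trainer(accelerator="mps", devices=1)
else:
    # On other hardware, DDP remains a valid pairing.
    trainer = Trainer(accelerator="cpu", strategy="ddp", devices=2)
```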