📚 Documentation
Link
https://pytorch.org/torchx/latest/components/distributed.html
What does it currently say?
It is not clear whether the --cpu and --gpu arguments are overridden by the -j argument, although in my testing (launch, then run top, etc.) it seems they are.
What should it say?
Both the docs and the --help output for dist.ddp could be clearer on this front. More generally, I am wondering whether there is a torchx equivalent of
torchrun --standalone --nnodes=1 --nproc_per_node=auto ...
Why?
Clearly I wouldn't want --gpu=0 with -j 1x2, right? As such, the defaults listed in the docs and --help output are a little confusing.
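To make the ambiguity concrete: the -j flag takes a NNODESxNPROC spec, and the question is whether the per-node worker count from that spec takes precedence over an explicit --gpu/--cpu value. Here is a minimal sketch (my own illustration, not torchx's actual parser) of how such a spec decomposes:

```python
def parse_j(spec: str):
    """Split a dist.ddp -j spec like "1x2" into (nnodes, nproc_per_node).

    Illustrative only -- not torchx code. The open question in this issue
    is whether nproc_per_node from this spec overrides an explicit
    --gpu/--cpu value, or the other way around.
    """
    nnodes, _, nproc = spec.partition("x")
    return int(nnodes), int(nproc)

# e.g. "-j 1x2" -> one node, two workers per node
print(parse_j("1x2"))  # -> (1, 2)
```

If -j wins, then "-j 1x2 --gpu 0" silently contradicts itself, which is exactly why spelling out the precedence in the docs would help.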