From 9783c650ac06f40f68b919386031f37b9ab9f01d Mon Sep 17 00:00:00 2001
From: raresgaia123 <137071040+raresgaia123@users.noreply.github.com>
Date: Fri, 26 Jul 2024 19:44:12 +0300
Subject: [PATCH] doc: added doc for reduce example (#56)

created readme file with steps to run the reduce example

Co-authored-by: Rares Gaia
---
 examples/README.md        |  1 +
 examples/reduce/README.md | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)
 create mode 100644 examples/reduce/README.md

diff --git a/examples/README.md b/examples/README.md
index 61e4d22..1efebca 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -13,3 +13,4 @@ The list of available examples can be found here:
 * [`all_reduce`](all_reduce) This script demonstrates a case where all_reduce on tensors are executed for different worlds, without any interference across different worlds.
 * [`all_gather`](all_gather) This script demonstrates a case where all_gather on tensors are executed for different worlds, without any interference across different worlds.
 * [`broadcast`](broadcast) This script demonstrates a case where broadcast is executed for different worlds, without any interference across different worlds, with a different src on each step.
+* [`reduce`](reduce) This script demonstrates a case where reduce is executed for different worlds, on a different destination rank, without any interference across different worlds.
diff --git a/examples/reduce/README.md b/examples/reduce/README.md
new file mode 100644
index 0000000..7a847a8
--- /dev/null
+++ b/examples/reduce/README.md
@@ -0,0 +1,37 @@
+# Reduce
+
+This file provides an example of collective communication using reduce across single and multiple worlds. This example performs reduce 100 times on each rank in each world, using a destination rank in the range 0 to 2.
+
+The `--worldinfo` argument consists of the world index (1 or 2) and the rank in that world (0, 1, or 2).
+
+## Running the Script in a Single World
+
+The single world example can be executed by opening 3 separate terminal windows, one per process, and running one of the following commands in each terminal window:
+
+```bash
+# on terminal window 1 - will initialize 2 worlds (world1 and world2) with rank 0
+python m8d.py --backend nccl --worldinfo 1,0 --worldinfo 2,0
+# on terminal window 2 - will initialize world1 with rank 1
+python m8d.py --backend nccl --worldinfo 1,1
+# on terminal window 3 - will initialize world1 with rank 2
+python m8d.py --backend nccl --worldinfo 1,2
+```
+
+## Running the Script in Multiple Worlds
+
+The multiple world example can be executed by opening 5 separate terminal windows, one per process, and running one of the following commands in each terminal window:
+
+```bash
+# on terminal window 1 - will initialize 2 worlds (world1 and world2) with rank 0
+python m8d.py --backend nccl --worldinfo 1,0 --worldinfo 2,0
+# on terminal window 2 - will initialize world1 with rank 1
+python m8d.py --backend nccl --worldinfo 1,1
+# on terminal window 3 - will initialize world1 with rank 2
+python m8d.py --backend nccl --worldinfo 1,2
+# on terminal window 4 - will initialize world2 with rank 1
+python m8d.py --backend nccl --worldinfo 2,1
+# on terminal window 5 - will initialize world2 with rank 2
+python m8d.py --backend nccl --worldinfo 2,2
+```
+
+To run processes on different hosts, the `--addr` argument can be used with the host's IP address (e.g., `python m8d.py --backend nccl --worldinfo 1,0 --worldinfo 2,0 --addr 10.20.1.50`).
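
Note on the `--worldinfo` flag described in the README above: the patch does not show how `m8d.py` parses it, so the following is only a minimal sketch of one way a repeatable `WORLD,RANK` flag could be handled with `argparse`. The helper names `parse_worldinfo` and `build_parser` and the default values for `--backend` and `--addr` are assumptions made for illustration, not taken from the script.

```python
import argparse
from typing import Tuple


def parse_worldinfo(value: str) -> Tuple[int, int]:
    """Parse a 'WORLD,RANK' string such as '1,0' into (world_index, rank).

    Hypothetical helper; the real m8d.py may parse this differently.
    """
    world_str, rank_str = value.split(",")
    return int(world_str), int(rank_str)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Default values below are assumptions for this sketch only.
    parser.add_argument("--backend", default="gloo")
    parser.add_argument("--addr", default="127.0.0.1")
    # action="append" lets the flag be given once per world a process joins.
    parser.add_argument(
        "--worldinfo", type=parse_worldinfo, action="append", default=[]
    )
    return parser


# Mirrors the command shown for terminal window 1.
args = build_parser().parse_args(
    ["--backend", "nccl", "--worldinfo", "1,0", "--worldinfo", "2,0"]
)
worlds = {f"world{index}": rank for index, rank in args.worldinfo}
print(worlds)  # {'world1': 0, 'world2': 0}
```

With this shape, a process that passes `--worldinfo` twice ends up with one `(world, rank)` pair per world, which matches the README's description of terminal window 1 joining both `world1` and `world2` with rank 0.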