How is overall throughput calculated for all2all? #267

Open
ltm920716 opened this issue Nov 19, 2024 · 2 comments

Comments

@ltm920716

Hi,
I have six H100 nodes, each with eight 400 Gb/s CX7 NICs, and I use RoCE for RDMA. I want to measure the overall throughput.

For allreduce, the parameters seem to have little effect. Is the busbw the overall throughput?
Image

For all2all, the parameters have a large effect, as follows:
Image

Image

For all2all, is the busbw for a single node or for something else? How can I calculate the overall throughput? I do not fully understand what busbw means for all2all. Which parameters are best for testing alltoall? Also, with the same configuration, performance drops when I add more nodes.

Thanks!

@sjeaugey
Member

All numbers look good. Alltoall cannot aggregate the bandwidth of multiple NICs, so the performance you should see is the performance of a single NIC.
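As a rough sanity check on the single-NIC ceiling, the NIC line rate can be converted to the GB/s units that busbw is reported in. This is only a sketch: the 400 Gb/s figure comes from the question above, while the measured value below is a hypothetical placeholder (not taken from the screenshots), and protocol overhead is ignored.

```python
def gbit_to_gbyte_per_s(gbit_per_s: float) -> float:
    """Convert a line rate in Gb/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8.0


def fraction_of_line_rate(busbw_gbyte_per_s: float, nic_gbit_per_s: float = 400.0) -> float:
    """Ratio of a measured busbw (GB/s) to the raw rate of one NIC."""
    return busbw_gbyte_per_s / gbit_to_gbyte_per_s(nic_gbit_per_s)


nic_peak = gbit_to_gbyte_per_s(400.0)   # raw rate of one 400 Gb/s NIC, in GB/s
measured_busbw = 45.0                   # hypothetical value, for illustration only
print(f"single NIC peak: {nic_peak:.1f} GB/s")
print(f"fraction of line rate: {fraction_of_line_rate(measured_busbw):.2f}")
```

An alltoall busbw close to this single-NIC figure would be consistent with the explanation above; real results will normally sit somewhat below the raw line rate.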

@ltm920716
Author

So how should I calculate the single-NIC performance? Or could you recommend a link that explains the calculation formula?
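For reference, and assuming these numbers come from nccl-tests, its performance notes describe busbw as algbw scaled by a per-collective factor: 2(n-1)/n for allreduce and (n-1)/n for alltoall, where n is the number of ranks. The sketch below applies those factors. The rank count of 48 assumes one rank per GPU on six 8-GPU nodes, and the message size and time are made-up inputs, not measurements from this thread.

```python
def algbw_gbyte_per_s(size_bytes: float, time_s: float) -> float:
    """Algorithm bandwidth: message size divided by elapsed time, in GB/s."""
    return size_bytes / time_s / 1e9


def allreduce_busbw(algbw: float, nranks: int) -> float:
    """Allreduce bus bandwidth: algbw * 2(n-1)/n."""
    return algbw * 2.0 * (nranks - 1) / nranks


def alltoall_busbw(algbw: float, nranks: int) -> float:
    """Alltoall bus bandwidth: algbw * (n-1)/n."""
    return algbw * (nranks - 1) / nranks


nranks = 48  # assumption: 6 nodes x 8 GPUs, one rank per GPU
algbw = algbw_gbyte_per_s(size_bytes=1e9, time_s=0.025)  # hypothetical inputs
print(f"algbw          : {algbw:.2f} GB/s")
print(f"alltoall busbw : {alltoall_busbw(algbw, nranks):.2f} GB/s")
print(f"allreduce busbw: {allreduce_busbw(algbw, nranks):.2f} GB/s")
```

With many ranks the alltoall factor approaches 1, so busbw and algbw are nearly equal, while the allreduce factor approaches 2.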
