[RFC] Add collective_broadcast to the StableHLO specification #1809
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Also, could you follow this link and sign the CLA when you get a chance?
I work at Nvidia, so I need to go through a separate channel. I'll take care of it, don't worry.
Force-pushed from 58152f9 to 1fd801a.
Force-pushed from 1fd801a to 059d932.
Made updates based on the new semantics.
Thank you for the RFC. The current set of collective operations is used in the SPMD context, where each device runs the same program. With the proposed collective_broadcast, do you have an MPMD use case for this, or were you expecting the devices that are not broadcasting to produce an empty buffer as a result of the broadcast?
No, the idea is for this to still target SPMD exclusively. On the CUDA side, the goal is to have this op lower to the exact same …
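To make the intended SPMD result concrete: every participating device ends up holding the root's buffer, not an empty one. The snippet below is a minimal sketch of that behavior using only existing JAX primitives; the helper name pcollective_broadcast (reused in the SUMMA example further down) and the mask-plus-psum emulation are illustrative assumptions, not the proposed op or its lowering.

import jax
import jax.numpy as jnp

def pcollective_broadcast(x, axis_name, root):
    # SPMD emulation of the intended result: zero out every shard except the
    # root's, then psum along the axis so all participants end up holding
    # the root's data.
    idx = jax.lax.axis_index(axis_name)
    masked = jnp.where(idx == root, x, jnp.zeros_like(x))
    return jax.lax.psum(masked, axis_name)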
To give a very concrete example of how this operation could be used in a realistic SPMD setting, let me show you some code. Here is how we could implement SUMMA, a 2D parallel matrix-multiplication (pgemm) algorithm, in JAX once this spec is in place.

from functools import partial
from jax.experimental.shard_map import shard_map
from jax.sharding import Mesh, PartitionSpec as P

@partial(shard_map, mesh=Mesh(...),
         in_specs=(P('x', 'y'), P('x', 'y')), out_specs=P('x', 'y'))
def summa_matrix_multiply(a, b):
    # N: number of blocks along the contracted dimension (one per mesh index).
    for i in range(N):
        # At step i, every device receives the A shard held at index i of the
        # 'y' axis and the B shard held at index i of the 'x' axis.
        abcast = pcollective_broadcast(a, 'y', root=i)
        bbcast = pcollective_broadcast(b, 'x', root=i)
        if i == 0:
            c = abcast @ bbcast
        else:
            c += abcast @ bbcast
    return c
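For completeness, a hypothetical driver for the sketch above might look like the following, assuming a 2x2 device mesh (so N == 2) and illustrative array shapes; the mesh construction and shapes are my own assumptions and not part of the RFC.

import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh

# Hypothetical 2x2 mesh over four devices; this is what would fill the
# Mesh(...) placeholder in the decorator above, with N == 2.
mesh = Mesh(np.array(jax.devices()[:4]).reshape(2, 2), axis_names=('x', 'y'))

a = jnp.ones((256, 256))
b = jnp.ones((256, 256))
c = summa_matrix_multiply(a, b)  # output sharded as P('x', 'y') over the mesh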
RFC Approved. I'll send a follow-up with markdownlint fixes.
🥳 🎉 Thanks again, Kevin, for getting this over the finish line! I'll get to work on implementing this in JAX and the CUDA XLA backend.
This RFC proposes adding collective_broadcast as one of the collective communication primitives. Please provide any feedback you feel is valuable.