[Bugfix] Fix CustomAllreduce nvlink topology detection (vllm-project#3974)

[Bugfix] Fix CustomAllreduce pcie nvlink topology detection (vllm-project#3974) (vllm-project#4159)
agt authored and jimpang committed Apr 19, 2024
1 parent 991ab98 commit 57329c9
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions vllm/distributed/device_communicators/custom_all_reduce.py
@@ -145,8 +145,10 @@ def _is_full_nvlink(rank, world_size):
     for i in range(world_size):
         if i != rank:
             try:
-                link_state = pynvml.nvmlDeviceGetNvLinkState(handle, i)
-                if not link_state:
+                peer_handle = pynvml.nvmlDeviceGetHandleByIndex(i)
+                p2p_status = pynvml.nvmlDeviceGetP2PStatus(
+                    handle, peer_handle, pynvml.NVML_P2P_CAPS_INDEX_NVLINK)
+                if p2p_status != pynvml.NVML_P2P_STATUS_OK:
                     return False
             except pynvml.NVMLError as error:
                 logger.info(
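The change above replaces a per-link query with a per-peer query. In NVML, the second argument of `nvmlDeviceGetNvLinkState` is a link index on the given device, so passing a peer GPU index there does not test connectivity to that peer. `nvmlDeviceGetP2PStatus` instead takes two device handles plus a capability index. A minimal standalone sketch of the corrected check follows. It is not the code from this commit: the `nvml` parameter and the `FakeNvml` stand-in are hypothetical, added only so the logic can be exercised without a GPU, and the constant values inside `FakeNvml` are placeholders.

```python
class FakeNvml:
    """Hypothetical stand-in for the pynvml module (illustration only)."""

    # Placeholder values; real code should read these from pynvml.
    NVML_P2P_CAPS_INDEX_NVLINK = 2
    NVML_P2P_STATUS_OK = 0

    class NVMLError(Exception):
        pass

    def __init__(self, nvlink_pairs):
        # Unordered pairs of device indices assumed to be NVLink-connected.
        self.nvlink_pairs = {frozenset(pair) for pair in nvlink_pairs}

    def nvmlDeviceGetHandleByIndex(self, index):
        # The fake uses the index itself as the device handle.
        return index

    def nvmlDeviceGetP2PStatus(self, handle, peer_handle, caps_index):
        connected = frozenset((handle, peer_handle)) in self.nvlink_pairs
        return self.NVML_P2P_STATUS_OK if connected else 1


def is_full_nvlink(rank, world_size, nvml):
    """Return True if device `rank` reports NVLink P2P to every other device."""
    handle = nvml.nvmlDeviceGetHandleByIndex(rank)
    for i in range(world_size):
        if i == rank:
            continue
        try:
            # Query by peer handle, not by link index.
            peer_handle = nvml.nvmlDeviceGetHandleByIndex(i)
            status = nvml.nvmlDeviceGetP2PStatus(
                handle, peer_handle, nvml.NVML_P2P_CAPS_INDEX_NVLINK)
        except nvml.NVMLError:
            # Treat any NVML failure as "no NVLink".
            return False
        if status != nvml.NVML_P2P_STATUS_OK:
            return False
    return True


if __name__ == "__main__":
    # Three devices, but only the pair (0, 1) is NVLink-connected.
    fake = FakeNvml(nvlink_pairs=[(0, 1)])
    print(is_full_nvlink(0, 2, fake))
    print(is_full_nvlink(0, 3, fake))
```

On real hardware, the same function should accept the `pynvml` module itself as the `nvml` argument, provided `pynvml.nvmlInit()` has been called first.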
