Support NCCL APIs #319

Merged: 5 commits, Jun 27, 2024
6 changes: 6 additions & 0 deletions CMakeLists.txt
@@ -17,6 +17,7 @@ list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
option(ENABLE_TRACE "Enable tracing" OFF)
option(BUILD_TESTS "Build tests" ON)
option(BUILD_PYTHON_BINDINGS "Build Python bindings" ON)
option(BUILD_APPS_NCCL "Build NCCL interfaces" ON)
option(USE_CUDA "Use NVIDIA/CUDA." OFF)
option(USE_ROCM "Use AMD/ROCm." OFF)
option(BYPASS_GPU_CHECK "Bypass GPU check." OFF)
@@ -154,3 +155,8 @@ endif()
if(BUILD_PYTHON_BINDINGS)
    add_subdirectory(python)
endif()

# NCCL interfaces
if(BUILD_APPS_NCCL)
    add_subdirectory(apps/nccl)
endif()
39 changes: 39 additions & 0 deletions apps/nccl/CMakeLists.txt
@@ -0,0 +1,39 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

file(GLOB_RECURSE SOURCES CONFIGURE_DEPENDS src/*)
file(GLOB_RECURSE HEADERS CONFIGURE_DEPENDS include/nccl.h)

if(USE_ROCM)
    set_source_files_properties(${SOURCES} PROPERTIES LANGUAGE CXX)
endif()

add_library(mscclpp_nccl_obj OBJECT)
target_sources(mscclpp_nccl_obj PRIVATE ${SOURCES})
target_sources(mscclpp_nccl_obj PUBLIC FILE_SET HEADERS FILES ${HEADERS})
target_include_directories(mscclpp_nccl_obj PRIVATE include SYSTEM PRIVATE ${GPU_INCLUDE_DIRS})
target_link_libraries(mscclpp_nccl_obj PRIVATE ${GPU_LIBRARIES} PUBLIC mscclpp_obj)
set_target_properties(mscclpp_nccl_obj PROPERTIES LINKER_LANGUAGE CXX POSITION_INDEPENDENT_CODE 1 VERSION ${MSCCLPP_VERSION} SOVERSION ${MSCCLPP_SOVERSION})
if(USE_CUDA)
    target_compile_definitions(mscclpp_nccl_obj PRIVATE USE_CUDA)
elseif(USE_ROCM)
    target_compile_definitions(mscclpp_nccl_obj PRIVATE USE_ROCM)
endif()

add_library(mscclpp_nccl SHARED)
target_link_libraries(mscclpp_nccl PUBLIC mscclpp_obj mscclpp_nccl_obj)
set_target_properties(mscclpp_nccl PROPERTIES VERSION ${MSCCLPP_VERSION} SOVERSION ${MSCCLPP_SOVERSION})
add_library(mscclpp_nccl_static STATIC)
target_link_libraries(mscclpp_nccl_static PUBLIC mscclpp_obj mscclpp_nccl_obj)
set_target_properties(mscclpp_nccl_static PROPERTIES VERSION ${MSCCLPP_VERSION} SOVERSION ${MSCCLPP_SOVERSION})

install(TARGETS mscclpp_nccl_obj
    FILE_SET HEADERS DESTINATION ${INSTALL_PREFIX}/include)
install(TARGETS mscclpp_nccl
    LIBRARY DESTINATION ${INSTALL_PREFIX}/lib)
install(TARGETS mscclpp_nccl_static
    ARCHIVE DESTINATION ${INSTALL_PREFIX}/lib)

if(BUILD_TESTS)
    add_subdirectory(test)
endif()
46 changes: 46 additions & 0 deletions apps/nccl/README.md
@@ -0,0 +1,46 @@
## NCCL Over MSCCL++

### Limitations

The current NCCL over MSCCL++ implementation has a few limitations.

* We do not cover all APIs yet. See the [API Support Table](#api-support-table) for details.
* Multi-node communication is not supported yet.
* Currently, collective communication functions may not work correctly if the buffer address differs from that of previous function calls while sharing the same base address (returned by [cuMemGetAddressRange](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM_1g64fee5711274a2a0573a789c94d8299b)) with the previous address. This is because the current implementation performs zero-copy communication over user buffers, and it is difficult to efficiently inform all ranks when the buffer address changes dynamically.
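
To make the last limitation concrete, here is a minimal host-side sketch. It assumes a hypothetical `BufferTracker` helper that is not part of MSCCL++ or NCCL. The helper remembers the first buffer address seen for each allocation base and flags any later call that reuses the base with a different address. In real code the base would come from `cuMemGetAddressRange`; in this sketch it is passed in explicitly so the example compiles without CUDA.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical helper (not part of MSCCL++ or NCCL): remembers the first
// buffer address used with each allocation base address.
class BufferTracker {
 public:
  // Returns true when a call matches the pattern described in the limitation:
  // the same base address as an earlier call, but a different buffer address.
  // `base` stands in for the value cuMemGetAddressRange would report for `ptr`.
  bool isRiskyCall(std::uintptr_t base, std::uintptr_t ptr) {
    auto it = firstPtr_.find(base);
    if (it == firstPtr_.end()) {
      firstPtr_.emplace(base, ptr);  // first call on this allocation
      return false;
    }
    return it->second != ptr;  // same base, different address
  }

 private:
  std::unordered_map<std::uintptr_t, std::uintptr_t> firstPtr_;
};
```

An application could run such a check in debug builds before each collective call and fall back to a staging buffer at a fixed address when it fires; that fallback is a suggestion, not something this change implements.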

### API Support Table

The table below lists all NCCL APIs (v2.21); `O` marks a supported API and `X` an unsupported one. We may cover more APIs in the future.

| API Name | Supported |
| :----------------------- | :-------: |
| ncclGetLastError | X |
| ncclGetErrorString | O |
| ncclGetVersion | O |
| ncclGetUniqueId | O |
| ncclCommInitRank | O |
| ncclCommInitAll | X |
| ncclCommInitRankConfig | X |
| ncclCommSplit | X |
| ncclCommFinalize | O |
| ncclCommDestroy | O |
| ncclCommAbort | X |
| ncclCommGetAsyncError | O |
| ncclCommCount | O |
| ncclCommCuDevice | O |
| ncclCommUserRank | O |
| ncclCommRegister | X |
| ncclCommDeregister | X |
| ncclMemAlloc | X |
| ncclMemFree | X |
| ncclAllReduce | O |
| ncclBroadcast | X |
| ncclReduce | X |
| ncclAllGather | O |
| ncclReduceScatter | X |
| ncclGroupStart | O |
| ncclGroupEnd | O |
| ncclSend | X |
| ncclRecv | X |
| ncclRedOpCreatePreMulSum | X |
| ncclRedOpDestroy | X |
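
As an illustration of how an application might consult this table before relying on an entry point, the sketch below copies the rows marked `O` into a constant array and adds a lookup. Both `kSupportedApis` and `isSupported` are hypothetical names introduced here; MSCCL++ does not export such a list, and the array has to be kept in sync with the table by hand.

```cpp
#include <string_view>

// Rows marked "O" in the table above, copied by hand for illustration.
// This list is an assumption of the sketch, not an MSCCL++ API.
constexpr std::string_view kSupportedApis[] = {
    "ncclGetErrorString", "ncclGetVersion",        "ncclGetUniqueId",
    "ncclCommInitRank",   "ncclCommFinalize",      "ncclCommDestroy",
    "ncclCommGetAsyncError", "ncclCommCount",      "ncclCommCuDevice",
    "ncclCommUserRank",   "ncclAllReduce",         "ncclAllGather",
    "ncclGroupStart",     "ncclGroupEnd",
};

// Linear scan over the list; fine for a table this small.
constexpr bool isSupported(std::string_view name) {
  for (std::string_view api : kSupportedApis) {
    if (api == name) return true;
  }
  return false;
}

// Compile-time spot checks against the table (requires C++17).
static_assert(isSupported("ncclAllReduce"));
static_assert(!isSupported("ncclBroadcast"));
```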