forked from openucx/ucx
-
Notifications
You must be signed in to change notification settings - Fork 14
/
README
248 lines (188 loc) · 8.41 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
<div align="center">
<a href="http://www.openucx.org/"><img src="./docs/doxygen/UCX_Logo_930x933.png" width="200"></a>
<br>
<a href="https://twitter.com/intent/follow?screen_name=openucx"> <img src="https://img.shields.io/twitter/follow/openucx?style=social&logo=twitter" alt="follow on Twitter"></a>
<a href="https://openucx.github.io/ucx/api/latest/html/"><img src="docs/doxygen/api.svg"></a>
<a href='https://openucx.readthedocs.io/en/master/?badge=master'><img src='https://readthedocs.org/projects/openucx/badge/?version=master' alt='Documentation Status' />
<a href="https://github.com/openucx/ucx/releases/latest"><img src="docs/doxygen/release.svg"></a>
</div>
<!-- TOC generated by https://github.com/ekalinin/github-markdown-toc -->
<hr>
* [Unified Communication X](#unified-communication-x)
* [Using UCX](#using-ucx)
* [Building and Running Internal Unit Tests](#building-and-running-internal-unit-tests)
* [UCX Performance Test](#ucx-performance-test)
* [Our Community](#our-community)
* [Licenses](#licenses)
* [Contributor Agreement and Guidelines](#contributor-agreement-and-guidelines)
* [UCX Publications](#ucx-publications)
* [UCX Architecture](#ucx-architecture)
* [Supported Transports](#supported-transports)
* [Supported CPU Architectures](#supported-cpu-architectures)
<hr>
# Unified Communication X
Unified Communication X (UCX) provides an optimized communication
layer for Message Passing ([MPI](https://www.mpi-forum.org/)),
[PGAS](http://www.pgas.org/)/[OpenSHMEM](http://www.openshmem.org/)
libraries and RPC/data-centric applications.
UCX utilizes high-speed networks for inter-node communication, and
shared memory mechanisms for efficient intra-node communication.
## Using UCX
### Release Builds
Building UCX is typically a combination of running "configure" and "make".
Execute the following commands to install the UCX system from within the
directory at the top of the tree:
```sh
$ ./autogen.sh
$ ./contrib/configure-release --prefix=/where/to/install
$ make -j8
$ make install
```
NOTE: Compiling support for various networks or other specific hardware may
require additional command line flags when running configure.
### Developer Builds
```bash
$ ./autogen.sh
$ ./contrib/configure-devel --prefix=$PWD/install-debug
```
*** NOTE: Developer builds of UCX typically include a large performance
penalty at run-time because of extra debugging code.
### Running internal unit tests
```sh
$ make -C test/gtest test
```
### Build RPM package
```bash
$ contrib/buildrpm.sh -s -b
```
### Build DEB package
```bash
$ dpkg-buildpackage -us -uc
```
### Build Doxygen documentation
```bash
$ make docs
```
### OpenMPI and OpenSHMEM installation with UCX
[Wiki page](http://github.com/openucx/ucx/wiki/OpenMPI-and-OpenSHMEM-installation-with-UCX)
### MPICH installation with UCX
[Wiki page](http://github.com/openucx/ucx/wiki/MPICH-installation-with-UCX)
### UCX Performance Test
Start server:
```sh
$ ./src/tools/perf/ucx_perftest -c 0
```
Connect client:
```sh
$ ./src/tools/perf/ucx_perftest <server-hostname> -t tag_lat -c 1
```
Note: the `-c` flag sets CPU affinity. If running both commands on same host, make sure you set the affinity to different CPU cores.
## Our Community
* [Project Website](http://www.openucx.org/)
* [ReadTheDocs](https://openucx.readthedocs.io/en/master/)
* [Github](http://www.github.com/openucx/ucx/)
* [Software Releases](http://www.github.com/openucx/ucx/releases)
* [Mailing List](https://elist.ornl.gov/mailman/listinfo/ucx-group)
* [Twitter](https://twitter.com/openucx)
## Licenses
UCX is licensed as:
* [BSD3](LICENSE)
## Contributor Agreement and Guidelines
In order to contribute to UCX, please sign up with an appropriate
[Contributor Agreement](http://www.openucx.org/license/).
Follow these
[instructions](https://github.com/openucx/ucx/wiki/Guidance-for-contributors)
when submitting contributions and changes.
## UCX Publications
To reference UCX in a publication, please use the following entry:
```bibtex
@inproceedings{shamis2015ucx,
title={UCX: an open source framework for HPC network APIs and beyond},
author={Shamis, Pavel and Venkata, Manjunath Gorentla and Lopez, M Graham and Baker, Matthew B and Hernandez, Oscar and Itigin, Yossi and Dubman, Mike and Shainer, Gilad and Graham, Richard L and Liss, Liran and others},
booktitle={2015 IEEE 23rd Annual Symposium on High-Performance Interconnects},
pages={40--43},
year={2015},
organization={IEEE}
}
```
To reference the UCX website:
```bibtex
@misc{openucx-website,
title = {{The Unified Communication X Library}},
key = {{{The Unified Communication X Library}},
howpublished = {{\url{http://www.openucx.org}}}
}
```
## UCX Architecture
![](docs/doxygen/Architecture.png)
| Component | Role | Description |
| :---: | :---: | --- |
| UCP | Protocol | Implements high-level abstractions such as tag-matching, streams, connection negotiation and establishment, multi-rail, and handling different memory types |
| UCT | Transport | Implements low-level communication primitives such as active messages, remote memory access, and atomic operations |
| UCS | Services | A collection of data structures, algorithms, and system utilities for common use |
| UCM | Memory | Intercepts memory allocation and release events, used by the memory registration cache |
## Supported Transports
* [Infiniband](https://www.infinibandta.org/)
* [Omni-Path](https://www.intel.com/content/www/us/en/high-performance-computing-fabrics/omni-path-driving-exascale-computing.html)
* [RoCE](http://www.roceinitiative.org/)
* [Cray Gemini and Aries](https://www.cray.com/)
* [CUDA](https://developer.nvidia.com/cuda-zone)
* [ROCm](https://rocm.github.io/)
* Shared Memory
* posix, sysv, [cma](https://dl.acm.org/citation.cfm?id=2616532), [knem](http://knem.gforge.inria.fr/), and [xpmem](https://github.com/hjelmn/xpmem)
* TCP/IP
## Supported CPU Architectures
* [x86_64](https://en.wikipedia.org/wiki/X86-64)
* [Power8/9](https://www.ibm.com/support/knowledgecenter/en/POWER9/p9hdx/POWER9welcome.htm)
* [Arm v8](https://www.arm.com/products/silicon-ip-cpu)
## Huawei Optimization Introduction
Based on performance consideration, UCX **DO NOT** provide the functionalities related to transmission security.
There are three optimized collective operations:
- MPI_Allreduce
- MPI_Bcast
- MPI_Barrier
- MPI_Alltoallv
Select specific algorithm with parameters which is showed in the table below.
Bcast:
| UCX_BUILTIN_BCAST_ALGORITHM | Algorithm |
| ---- | ---- |
| 1 | Binomial tree |
| 2 | Topo-aware tree (Binomial tree + Binomial tree) |
| 3 | Topo-aware tree (K-nomial tree + Binomial tree) |
| 4 | Topo-aware tree (K-nomial tree + K-nomial tree) |
| 5 | Node-aware In Network Computing (INC) |
Allreduce:
| UCX_BUILTIN_ALLREDUCE_ALGORITHM | Algorithm |
| ---- | ---- |
| 1 | Recursive |
| 2 | Topo-aware Recursive (ppn inside node) |
| 3 | Topo-aware Recursive (ppn inside socket) |
| 4 | Ring |
| 5 | Topo-aware Recursive (with K-nomial tree for intra node) |
| 6 | Topo-aware Recursive (with K-nomial tree for intra node, ppn inside socket) |
| 7 | Topo-aware FANIN-FANOUT (with K-nomial tree for intra node, ppn inside node) |
| 8 | Topo-aware FANIN-FANOUT (with K-nomial tree for intra node, ppn inside socket) |
| 9 | Node-aware In Network Computing (INC) |
| 10 | Socket-aware In Network Computing (INC) |
| 11 | Node-Aware Parallel algorithm (NAP) |
| 12 | Rabenseifner's algorithm (binary block) |
| 13 | Rabenseifner's algorithm (node aware binary block) |
| 14 | Rabenseifner's algorithm (socket aware binary block) |
Barrier:
| UCX_BUILTIN_BARRIER_ALGORITHM | Algorithm |
| ---- | ---- |
| 1 | Recursive |
| 2 | Topo-aware Recursive (ppn inside node) |
| 3 | Topo-aware Recursive (ppn inside socket) |
| 4 | Topo-aware Recursive (with K-nomial tree for intra node) |
| 5 | Topo-aware Recursive (with K-nomial tree for intra node, ppn inside socket) |
| 6 | Topo-aware FANIN-FANOUT (with K-nomial tree for intra node, ppn inside node) |
| 7 | Topo-aware FANIN-FANOUT (with K-nomial tree for intra node, ppn inside socket) |
| 8 | Node-aware In Network Computing (INC) |
| 9 | Socket-aware In Network Computing (INC) |
| 10 | Node-Aware Parallel algorithm (NAP) |
Alltoallv:
| UCX_BUILTIN_ALLTOALLV_ALGORITHM | Algorithm |
| ---- | ---- |
| 1 | Throttled scattered destination |
| 2 | gatherv+alltoallv+scatterv |