This repository has been archived by the owner on Dec 20, 2022. It is now read-only.
Yuval Degani edited this page Sep 27, 2017 · 5 revisions

SparkRDMA is a high-performance, scalable and efficient ShuffleManager plugin for Apache Spark. It uses RDMA (Remote Direct Memory Access) technology to reduce the CPU cycles needed for shuffle data transfers. It also reduces memory usage by reusing memory for transfers instead of copying data multiple times through the traditional TCP stack.
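Because SparkRDMA is a ShuffleManager plugin, it is selected through Spark's standard `spark.shuffle.manager` property, with the plugin jar placed on the driver and executor classpaths. The following is a minimal sketch of assembling those settings for a `spark-submit` call. The class name `org.apache.spark.shuffle.rdma.RdmaShuffleManager` is taken from memory of the project's README and should be verified against the release in use; the jar path, the helper functions and the job name are hypothetical.

```python
import shlex

# Assumption: ShuffleManager class name as given in the SparkRDMA README.
RDMA_SHUFFLE_MANAGER = "org.apache.spark.shuffle.rdma.RdmaShuffleManager"
# Hypothetical placeholder: replace with the real location of the SparkRDMA jar.
SPARK_RDMA_JAR = "/path/to/spark-rdma.jar"


def rdma_conf(jar_path=SPARK_RDMA_JAR):
    """Return the Spark properties that switch the shuffle to SparkRDMA."""
    return {
        # Standard Spark property that selects the ShuffleManager implementation.
        "spark.shuffle.manager": RDMA_SHUFFLE_MANAGER,
        # The plugin jar must be visible to both the driver and the executors.
        "spark.driver.extraClassPath": jar_path,
        "spark.executor.extraClassPath": jar_path,
    }


def spark_submit_args(conf, app, app_args=()):
    """Build a spark-submit argument list with one --conf flag per property."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    args.extend(app_args)
    return args


if __name__ == "__main__":
    # Print the command line instead of running it, so the sketch has no side effects.
    print(shlex.join(spark_submit_args(rdma_conf(), "my_job.py")))
```

Passing the properties as `--conf` flags keeps the choice of shuffle manager scoped to a single job; the same keys could instead be placed in `spark-defaults.conf` to apply them cluster-wide.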

RDMA is supported on various types of networks, such as traditional Ethernet with RoCE (RDMA over Converged Ethernet), InfiniBand and more.

SparkRDMA is built to provide the best performance out of the box. However, for users who wish to squeeze the most out of their installation, we provide multiple configuration properties to precisely tune SparkRDMA on a per-job basis. For more information on how to tune a system, please refer to the guides offered in this wiki: