UCSet library provides std::set
-like class templates for C++, where every operation is noexcept
, and no update can leave the container in a partial state.
There are 3 containers to choose from:
consistent_set
: serializable consistency, fully sorted, based onstd::set
.consistent_avl
: serializable consistency, fully sorted, based on AVL trees.versioning_avl
: snapshot isolation via MVCC, fully sorted, based on AVL trees.
All of them:
- are
noexcept
top to bottom! - are templated, to be used with any
noexcept
-movable anddefault
-constructible types. - can be wrapped into
locked_gt
, to make them thread-safe. - can be wrapped into
partitioned_gt
, to make them concurrent.
If you want your exceptions and classical interfaces back, you can also wrap any container into crazy_gt
.
The entire library is header-only and requires C++17. You can copy-paste it, but it is not 2022 anymore. We suggest using CMake:
include(FetchContent)
FetchContent_Declare(
ucset
GIT_REPOSITORY https://github.com/unum-cloud/ucset
GIT_TAG main
CONFIGURE_COMMAND "" # Nothing to configure, its that simple :)
BUILD_COMMAND "" # No build needed, UCSet is header-only
)
FetchContent_MakeAvailable(ucset)
include_directories(${consistent_set_SOURCE_DIR})
Hate for std::bad_alloc
.
If you consider "Out of Memory" an exception, you are always underutilizing your system.
It happened way too many times that a program crashed when I was only getting to an exciting place.
Especially with:
- Neo4J and every other JVM-based project.
- With big batch sizes beyond VRAM sizes of GPU when doing ML.
At Unum, we live in conditions where machines can easily have 1 TB of RAM per CPU socket, but it is still at least 100x less than the datasets we are trying to swallow.
So when we started working on UKV to build high-speed hardware-friendly databases, we needed something better than Standard Templates Library, with features uncommon to other libraries as well:
- Accessing the allocator state by reference.
- Reserving memory for tree nodes before inserting.
- Explicitly traversing trees for random sampling.
- Speed!
Now UCSet powers the in-memory backend of UKV.
Concurrent containers in the library are blocking. Their performance greatly depends on the "mutexes" you are using. So we allow different implementations:
- STL:
std::shared_mutex
, - Intel One API:
tbb::rw_mutex
, - Or anything else with the same interfaces.