- Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
- Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
- Require explicit pool size in `pool_memory_resource` and move some things out of detail namespace (#1417) @harrism
- Remove HTML builds of librmm (#1415) @vyasr
- Update to CCCL 2.2.0. (#1404) @bdice
- Switch to scikit-build-core (#1287) @vyasr
- Exclude tests from builds (#1459) @vyasr
- Update CODEOWNERS (#1410) @raydouglass
- Correct signatures for torch allocator plug in (#1407) @wence-
- Fix Arena MR to support simultaneous access by PTDS and other streams (#1395) @tgravescs
- Fix else-after-throw clang tidy error (#1391) @harrism
- remove references to setup.py in docs (#1420) @jameslamb
- Remove HTML builds of librmm (#1415) @vyasr
- Update GPU support docs to drop Pascal (#1413) @harrism
- Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
- Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
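The two entries above (#1424, #1417) remove the optional/implicit initial pool size, so `pool_memory_resource` must be given an explicit size at construction. A minimal C++ sketch of the updated pattern, assuming a plain CUDA upstream and an arbitrary 1 GiB starting size:

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>

#include <cstddef>

int main()
{
  rmm::mr::cuda_memory_resource upstream;

  // The initial pool size is now a required constructor argument; 1 GiB is an example value.
  auto const initial_size = std::size_t{1} << 30;
  rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool{&upstream, initial_size};

  // Route subsequent device allocations on this device through the pool.
  rmm::mr::set_current_device_resource(&pool);
  return 0;
}
```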
- Add a host-pinned memory resource that can be used as upstream for `pool_memory_resource`. (#1392) @harrism
- Remove usages of rapids-env-update (#1423) @KyleFromNVIDIA
- Refactor CUDA versions in dependencies.yaml. (#1422) @bdice
- Require explicit pool size in `pool_memory_resource` and move some things out of detail namespace (#1417) @harrism
- Update dependencies.yaml to support CUDA 12.*. (#1414) @bdice
- Define python dependency range as a matrix fallback. (#1409) @bdice
- Use latest cuda-python within CUDA major version. (#1406) @bdice
- Update to CCCL 2.2.0. (#1404) @bdice
- Remove RMM_BUILD_WHEELS and standardize Python builds (#1401) @vyasr
- Update to fmt 10.1.1 and spdlog 1.12.0. (#1374) @bdice
- Switch to scikit-build-core (#1287) @vyasr
- Document minimum CUDA version of 11.4 (#1385) @harrism
- Store and set the correct CUDA device in device_buffer (#1370) @harrism
- Use `cuda::mr::memory_resource` instead of raw `device_memory_resource` (#1095) @miscco
- Update actions/labeler to v4 (#1397) @raydouglass
- Backport arena MR fix for simultaneous access by PTDS and other streams (#1396) @bdice
- Deliberately leak PTDS thread_local events in stream ordered mr (#1375) @wence-
- Add missing CUDA 12 dependencies and fix dlopen library names (#1366) @vyasr
- Document minimum CUDA version of 11.4 (#1385) @harrism
- Fix more doxygen issues (#1367) @vyasr
- Add groups to the doxygen docs (#1358) @vyasr
- Enable doxygen XML and fix issues (#1348) @vyasr
- Make internally stored default argument values public (#1373) @vyasr
- Store and set the correct CUDA device in device_buffer (#1370) @harrism
- Update rapids-cmake functions to non-deprecated signatures (#1357) @robertmaynard
- Generate unified Python/C++ docs (#1324) @vyasr
- Use `cuda::mr::memory_resource` instead of raw `device_memory_resource` (#1095) @miscco
- Silence false gcc warning (#1381) @miscco
- Build concurrency for nightly and merge triggers (#1380) @bdice
- Update `shared-action-workflows` references (#1363) @AyodeAwe
- Use branch-23.12 workflows. (#1360) @bdice
- Update devcontainers to 23.12 (#1355) @raydouglass
- Generate proper, consistent nightly versions for pip and conda packages (#1347) @vyasr
- RMM: Build CUDA 12.0 ARM conda packages. (#1330) @bdice
- Compile cdef public functions from torch_allocator with C ABI (#1350) @wence-
- Make doxygen only a conda dependency. (#1344) @bdice
- Use `conda mambabuild` not `mamba mambabuild` (#1338) @wence-
- Fix stream_ordered_memory_resource attempt to record event in stream from another device (#1333) @harrism
- Clean up headers in CMakeLists.txt. (#1341) @bdice
- Add pre-commit hook to validate doxygen (#1334) @vyasr
- Fix doxygen warnings (#1317) @vyasr
- Treat warnings as errors in Python documentation (#1316) @vyasr
- Update image names (#1346) @AyodeAwe
- Update to clang 16.0.6. (#1343) @bdice
- Update doxygen to 1.9.1 (#1337) @vyasr
- Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#1335) @divyegala
- Use `copy-pr-bot` (#1329) @ajschmidt8
- Add RMM devcontainers (#1328) @trxcllnt
- Add Python bindings for `limiting_resource_adaptor` (#1327) @pentschev
- Fix missing jQuery error in docs (#1321) @AyodeAwe
- Use fetch_rapids.cmake. (#1319) @bdice
- Update to Cython 3.0.0 (#1313) @vyasr
- Branch 23.10 merge 23.08 (#1312) @vyasr
- Branch 23.10 merge 23.08 (#1309) @vyasr
- Stop invoking setup.py (#1300) @vyasr
- Remove now-deprecated top-level allocator functions (#1281) @wence-
- Remove padding from device_memory_resource (#1278) @vyasr
- Fix typo in wheels-test.yaml. (#1310) @bdice
- Add a missing '#include <array>' in logger.hpp (#1295) @valgur
- Use gbench `thread_index()` accessor to fix replay bench compilation (#1293) @harrism
- Ensure logger tests don't generate temp directories in build dir (#1289) @robertmaynard
- Switch to new CI wheel building pipeline (#1305) @vyasr
- Revert CUDA 12.0 CI workflows to branch-23.08. (#1303) @bdice
- Update linters: remove flake8, add ruff, update cython-lint (#1302) @vyasr
- Adding identify minimum version requirement (#1301) @hyperbolic2346
- Stop invoking setup.py (#1300) @vyasr
- Use cuda-version to constrain cudatoolkit. (#1296) @bdice
- Update to CMake 3.26.4 (#1291) @vyasr
- use rapids-upload-docs script (#1288) @AyodeAwe
- Reorder parameters in RMM_EXPECTS (#1286) @vyasr
- Remove documentation build scripts for Jenkins (#1285) @ajschmidt8
- Remove padding from device_memory_resource (#1278) @vyasr
- Unpin scikit-build upper bound (#1275) @vyasr
- RMM: Build CUDA 12 packages (#1223) @bdice
- Ensure Logger tests aren't run in parallel (#1277) @robertmaynard
- Pin to scikit-build<0.17.2. (#1262) @bdice
- Require Numba 0.57.0+ & NumPy 1.21.0+ (#1279) @jakirkham
- Align test_cpp.sh with conventions in other RAPIDS repos. (#1269) @bdice
- Switch back to using primary shared-action-workflows branch (#1268) @vyasr
- Update recipes to GTest version >=1.13.0 (#1263) @bdice
- Support CUDA 12.0 for pip wheels (#1259) @bdice
- Add build vars (#1258) @AyodeAwe
- Enable sccache hits from local builds (#1257) @AyodeAwe
- Revert to branch-23.06 for shared-action-workflows (#1256) @shwina
- run docs builds nightly too (#1255) @AyodeAwe
- Build wheels using new single image workflow (#1254) @vyasr
- Update minimum Python version to Python 3.9 (#1252) @shwina
- Remove usage of rapids-get-rapids-version-from-git (#1251) @jjacobelli
- Remove wheel pytest verbosity (#1249) @sevagh
- Update clang-format to 16.0.1. (#1246) @bdice
- Remove uses-setup-env-vars (#1242) @vyasr
- Move RMM_LOGGING_ASSERT into separate header (#1241) @ahendriksen
- Use ARC V2 self-hosted runners for GPU jobs (#1239) @jjacobelli
- Remove MANIFEST.in use auto-generated one for sdists and package_data for wheels (#1233) @vyasr
- Fix update-version.sh. (#1227) @vyasr
- Specify include_package_data to setup (#1218) @vyasr
- Revert changes overriding rapids-cmake repo. (#1209) @bdice
- Synchronize stream in `DeviceBuffer.c_from_unique_ptr` constructor (#1100) @shwina
- Use rapids-cmake parallel testing feature (#1183) @robertmaynard
- Stop setting package version attribute in wheels (#1236) @vyasr
- Add codespell as a linter (#1231) @bdice
- Pass `AWS_SESSION_TOKEN` and `SCCACHE_S3_USE_SSL` vars to conda build (#1230) @ajschmidt8
- Update to GCC 11 (#1228) @bdice
- Fix some minor oversights in the conversion to pyproject.toml (#1226) @vyasr
- Remove pickle compatibility layer in tests for Python < 3.8. (#1224) @bdice
- Move external allocators into rmm.allocators module to defer imports (#1221) @wence-
- Generate pyproject.toml dependencies using dfg (#1219) @vyasr
- Run rapids-dependency-file-generator via pre-commit (#1217) @vyasr
- Skip docs job in nightly runs (#1215) @AyodeAwe
- CI: Remove specification of manual stage for check_style.sh script. (#1214) @csadorf
- Use script rather than environment variable to modify package names (#1212) @vyasr
- Reduce error handling verbosity in CI tests scripts (#1204) @AjayThorve
- Update shared workflow branches (#1203) @ajschmidt8
- Use date in build string instead of in the version. (#1195) @bdice
- Stop using versioneer to manage versions (#1190) @vyasr
- Update to spdlog>=1.11.0, fmt>=9.1.0. (#1177) @bdice
- Migrate as much as possible to `pyproject.toml` (#1151) @jakirkham
- pre-commit: Update isort version to 5.12.0 (#1197) @wence-
- Revert "Upgrade to spdlog 1.10 (#1173)" (#1176" (#1176)) @bdice
- Ensure `UpstreamResourceAdaptor` is not cleared by the Python GC (#1170) @shwina
- Update shared workflow branches (#1201) @ajschmidt8
- Fix update-version.sh (#1199) @raydouglass
- Use CTK 118/cp310 branch of wheel workflows (#1193) @sevagh
- Update `build.yaml` workflow to reduce verbosity (#1192) @AyodeAwe
- Fix `build.yaml` workflow (#1191) @ajschmidt8
- add docs_build step (#1189) @AyodeAwe
- Upkeep/wheel param cleanup (#1187) @sevagh
- Update workflows for nightly tests (#1186) @ajschmidt8
- Build CUDA `11.8` and Python `3.10` Packages (#1184) @ajschmidt8
- Build wheels alongside conda CI (#1182) @sevagh
- Update conda recipes. (#1180) @bdice
- Update PR Workflow (#1174) @ajschmidt8
- Upgrade to spdlog 1.10 (#1173) @kkraus14
- Enable `codecov` (#1171) @ajschmidt8
- Add support for Python 3.10. (#1166) @bdice
- Update pre-commit hooks (#1154) @bdice
- Don't use CMake 3.25.0 as it has a show stopping FindCUDAToolkit bug (#1162) @robertmaynard
- Relax test for async memory pool IPC handle support (#1130) @bdice
- Use rapidsai CODE_OF_CONDUCT.md (#1159) @bdice
- Fix doxygen formatting for set_stream. (#1153) @bdice
- Document required Python dependencies to build from source (#1146) @ccoulombe
- fix failed automerge (Branch 22.12 merge 22.10) (#1131) @harrism
- Align version with wheel version (#1161) @sevagh
- Add `ninja` & Update CI environment variables (#1155) @ajschmidt8
- Remove CUDA 11.0 from dependencies.yaml. (#1152) @bdice
- Update dependencies schema. (#1147) @bdice
- Enable sccache for python build (#1145) @Ethyling
- Remove Jenkins scripts (#1143) @ajschmidt8
- Use `ninja` in GitHub Actions (#1142) @ajschmidt8
- Switch to using rapids-cmake for gbench. (#1139) @vyasr
- Remove stale labeler (#1137) @raydouglass
- Add a public `copy` API to `DeviceBuffer` (#1128) @galipremsagar
- Format gdb script. (#1127) @bdice
- Ensure consistent spdlog dependency target no matter the source (#1101) @robertmaynard
- Remove cuda event deadlocking issues in device mr tests (#1097) @robertmaynard
- Propagate exceptions raised in Python callback functions (#1096) @madsbk
- Avoid unused parameter warnings in do_get_mem_info (#1084) @fkallen
- Use rapids-cmake 22.10 best practice for RAPIDS.cmake location (#1083) @robertmaynard
- Document that minimum required CMake version is now 3.23.1 (#1098) @robertmaynard
- Fix docs for module-level API (#1091) @bdice
- Improve DeviceBuffer docs. (#1090) @bdice
- Branch 22.10 merge 22.08 (#1089) @harrism
- Improve docs formatting and update links. (#1086) @bdice
- Add resources section to README. (#1085) @bdice
- Simplify PR template. (#1080) @bdice
- Add `gdb` pretty-printers for rmm types (#1088) @upsj
- Support using THRUST_WRAPPED_NAMESPACE (#1077) @robertmaynard
- GH Actions - Enforce `checks` before builds run (#1125) @ajschmidt8
- Update GH Action Workflows (#1123) @ajschmidt8
- Add `cudatoolkit` versions to `dependencies.yaml` (#1119) @ajschmidt8
- Remove `rmm` installation from `librmm` tests (#1117) @ajschmidt8
- Add GitHub Actions workflows (#1104) @Ethyling
- `build.sh`: accept `--help` (#1093) @madsbk
- Move clang dependency to conda develop packages. (#1092) @bdice
- Add device_uvector::reserve and device_buffer::reserve (#1079) @upsj
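A short sketch of `device_uvector` usage including the `reserve` added in #1079; the element counts and default stream are illustrative:

```cpp
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>

int main()
{
  auto stream = rmm::cuda_stream_default;     // default stream view
  rmm::device_uvector<int> vec(100, stream);  // uninitialized device storage for 100 ints
  vec.reserve(1000, stream);                  // grow capacity without changing size()
  return vec.size() == 100 ? 0 : 1;
}
```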
- Bifurcate Dependency Lists (#1073) @ajschmidt8
- Specify `language` as `'en'` instead of `None` (#1059) @jakirkham
- Add a missed `except *` (#1057) @shwina
- Properly handle cudaMemHandleTypeNone and cudaErrorInvalidValue in is_export_handle_type_supported (#1055) @gerashegalov
- Centralize common css & js code in docs (#1075) @galipremsagar
- Add the ability to register and unregister reinitialization hooks (#1072) @shwina
- Update isort to 5.10.1 (#1069) @vyasr
- Forward merge 22.06 into 22.08 (#1067) @vyasr
- Forward merge 22.06 into 22.08 (#1066) @vyasr
- Pin max version of `cuda-python` to `11.7` (#1062) @galipremsagar
- Change build.sh to find C++ library by default and avoid shadowing CMAKE_ARGS (#1053) @vyasr
- Clarifies Python requirements and version constraints (#1037) @jakirkham
- Use `lib` (not `lib64`) for libraries (#1024) @jakirkham
- Properly enable Cython docstrings. (#1020) @vyasr
- Update `RMMNumbaManager` to handle `NUMBA_CUDA_USE_NVIDIA_BINDING=1` (#1004) @brandon-b-miller
- Clarify using RMM with other Python libraries (#1034) @jrhemstad
- Replace `to_device` with `DeviceBuffer.to_device` (#1033) @wence-
- Documentation Fix: Replace `cudf::logic_error` with `rmm::logic_error` (#1021) @codereport
- Fix conda recipes for conda compilers (#1043) @Ethyling
- Use new rapids-cython component of rapids-cmake to simplify builds (#1031) @vyasr
- Merge branch-22.04 to branch-22.06 (#1028) @jakirkham
- Update CMake pinning to just avoid 3.23.0. (#1023) @vyasr
- Build python using conda in GPU jobs (#1017) @Ethyling
- Remove pip requirements file. (#1015) @bdice
- Clean up Thrust includes. (#1011) @bdice
- Update black version (#1010) @vyasr
- Update cmake-format version for pre-commit and environments. (#995) @vyasr
- Use conda compilers (#977) @Ethyling
- Build conda packages using mambabuild (#900) @Ethyling
- Add cuda-python dependency to pyproject.toml (#994) @sevagh
- Disable opportunistic reuse in async mr when cuda driver < 11.5 (#993) @rongou
- Use CUDA 11.2+ features via dlopen (#990) @robertmaynard
- Skip async mr tests when cuda runtime/driver < 11.2 (#986) @rongou
- Fix warning/error in debug assertion in device_uvector.hpp (#979) @harrism
- Fix signed/unsigned comparison warning (#970) @jlowe
- Fix comparison of async MRs with different underlying pools. (#965) @harrism
- Temporarily disable new `ops-bot` functionality (#1005) @ajschmidt8
- Rename `librmm_tests` to `librmm-tests` (#1000) @ajschmidt8
- Update `librmm` `conda` recipe (#997) @ajschmidt8
- Remove `no_cma`/`has_cma` variants (#996) @ajschmidt8
- Fix free-before-alloc in multithreaded test (#992) @aladram
- Add `.github/ops-bot.yaml` config file (#991) @ajschmidt8
- Log allocation failures (#988) @rongou
- Update `librmm` `conda` outputs (#983) @ajschmidt8
- Bump Python requirements in `setup.cfg` and `rmm_dev.yml` (#982) @shwina
- New benchmark compares concurrent throughput of device_vector and device_uvector (#981) @harrism
- Update `librmm` recipe to output `librmm_tests` package (#978) @ajschmidt8
- Update upload.sh to use `--croot` (#975) @AyodeAwe
- Fix `conda` uploads (#974) @ajschmidt8
- Add CMake `install` rules for tests (#969) @ajschmidt8
- Add device_buffer::ssize() and device_uvector::ssize() (#966) @harrism
- Added yml file for cudatoolkit version 11.6 (#964) @alhad-deshpande
- Replace `ccache` with `sccache` (#963) @ajschmidt8
- Make `pool_memory_resource::pool_size()` public (#962) @shwina
- Allow construction of cuda_async_memory_resource from existing pool (#889) @fkallen
- Use numba to get CUDA runtime version. (#946) @bdice
- Temporarily disable warnings for unknown pragmas (#942) @harrism
- Build benchmarks in RMM CI (#941) @harrism
- Headers that use `std::thread` now include <thread> (#938) @robertmaynard
- Fix failing stream test with a debug-only death test (#934) @harrism
- Prevent `DeviceBuffer` `DeviceMemoryResource` premature release (#931) @viclafargue
- Fix failing tracking test (#929) @harrism
- Prepare upload scripts for Python 3.7 removal (#952) @Ethyling
- Fix imports tests syntax (#935) @Ethyling
- Remove `IncludeCategories` from `.clang-format` (#933) @codereport
- Replace use of custom CUDA bindings with CUDA-Python (#930) @shwina
- Remove `setup.py` from `update-release.sh` script (#926) @ajschmidt8
- Improve C++ Test Coverage (#920) @harrism
- Improve the Arena allocator to reduce memory fragmentation (#916) @rongou
- Simplify CMake linting with cmake-format (#913) @vyasr
- Update recipes for Enhanced Compatibility (#910) @ajschmidt8
- Fix `librmm` uploads (#909) @ajschmidt8
- Use spdlog/fmt/ostr.h as it supports external fmt library (#907) @robertmaynard
- Fix variable names in logging macro calls (#897) @harrism
- Keep rapids cmake version in sync (#876) @robertmaynard
- Replace `to_device()` in docs with `DeviceBuffer.to_device()` (#902) @shwina
- Fix return value docs for supports_get_mem_info (#884) @harrism
- suppress spurious clang-tidy warnings in debug macros (#914) @rongou
- C++ code coverage support (#905) @harrism
- Provide ./build.sh flag to control CUDA async malloc support (#901) @robertmaynard
- Parameterize exception type caught by failure_callback_resource_adaptor (#898) @harrism
- Throw `rmm::out_of_memory` when we know for sure (#894) @rongou
- Update `conda` recipes for Enhanced Compatibility effort (#893) @ajschmidt8
- Add functions to query the stream of device_uvector and device_scalar (#887) @fkallen
- Add spdlog to install export set (#886) @trxcllnt
- Delete cuda_async_memory_resource copy/move ctors/operators (#860) @jrhemstad
- Fix parameter name in asserts (#875) @vyasr
- Disallow zero-size stream pools (#873) @harrism
- Correct namespace usage in host memory resources (#872) @divyegala
- fix race condition in limiting resource adapter (#869) @rongou
- Install the right cudatoolkit in the conda env in gpu/build.sh (#864) @shwina
- Disable copy/move ctors and operator= from free_list classes (#862) @harrism
- Delete cuda_async_memory_resource copy/move ctors/operators (#860) @jrhemstad
- Improve concurrency of stream_ordered_memory_resource by stealing less (#851) @harrism
- Use the new RAPIDS.cmake to fetch rapids-cmake (#838) @robertmaynard
- Forward-merge branch-21.08 to branch-21.10 (#846) @jakirkham
- Forward-merge `branch-21.08` into `branch-21.10` (#877) @ajschmidt8
- Add .clang-tidy and fix clang-tidy warnings (#857) @harrism
- Update to use rapids-cmake 21.10 pre-configured packages (#854) @robertmaynard
- Clean up: use std::size_t, include cstddef and aligned.hpp where missing (#852) @harrism
- tweak the arena mr to reduce fragmentation (#845) @rongou
- Fix transitive include in cuda_device header (#843) @wphicks
- Refactor cmake style (#842) @robertmaynard
- add multi stream allocations benchmark. (#841) @cwharris
- Enforce default visibility for `get_map`. (#833) @trivialfis
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#823) @dillon-cullinan
- Execution policy class (#816) @viclafargue
- Refactor `rmm::device_scalar` in terms of `rmm::device_uvector` (#789) @harrism
- Explicit streams in device_buffer (#775) @harrism
- Pin spdlog in dev conda envs (#835) @trxcllnt
- Pinning spdlog because recent updates are causing compile issues. (#831) @cjnolet
- update isort to 5.6.4 (#822) @cwharris
- fix align_up namespace in aligned_resource_adaptor.hpp (#820) @rongou
- Run updated isort hook on pxd files (#812) @charlesbluca
- find_package(RMM) can now be called multiple times safely (#811) @robertmaynard
- Fix building on CUDA 11.3 (#809) @benfred
- Remove leading zeros in version_config.hpp (#793) @hcho3
- Fix PoolMemoryResource Python doc examples (#807) @harrism
- Fix incorrect href in README.md (#804) @benchislett
- Update build instruction in README (#797) @hcho3
- Document compute sanitizer memcheck support (#790) @harrism
- Bump isort, enable Cython package resorting (#806) @charlesbluca
- Support multiple output sinks in logging_resource_adaptor (#791) @harrism
- Add Statistics Resource Adaptor and cython bindings to `tracking_resource_adaptor` and `statistics_resource_adaptor` (#626) @mdemoret-nv
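A hedged C++ sketch of layering the statistics adaptor from #626 over a CUDA upstream; the counter accessor and field names are assumptions and may differ between versions:

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/statistics_resource_adaptor.hpp>

int main()
{
  rmm::mr::cuda_memory_resource upstream;
  rmm::mr::statistics_resource_adaptor<rmm::mr::cuda_memory_resource> stats_mr{&upstream};

  void* p = stats_mr.allocate(1024);  // tracked allocation
  stats_mr.deallocate(p, 1024);

  // Accessor/field names assumed: reports current, peak, and total allocated bytes.
  auto const bytes = stats_mr.get_bytes_counter();
  return bytes.peak > 0 ? 0 : 1;
}
```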
- Fix isort in cuda_stream_view.pxd (#827) @harrism
- Cython extension for rmm::cuda_stream_pool (#818) @divyegala
- Fix building on cuda 11.4 (#817) @benfred
- Updating Clang Version to 11.0.0 (#814) @codereport
- Add spdlog to `rmm-exports` if found by CPM (#810) @trxcllnt
- Fix `21.08` forward-merge conflicts (#803) @ajschmidt8
- RMM now leverages rapids-cmake to reduce CMake boilerplate (#800) @robertmaynard
- Refactor `rmm::device_scalar` in terms of `rmm::device_uvector` (#789) @harrism
- make it easier to include rmm in other projects (#788) @rongou
- Compile Cython with C++17. (#787) @vyasr
- Fix Merge Conflicts (#786) @ajschmidt8
- Explicit streams in device_buffer (#775) @harrism
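A minimal sketch of the explicit-stream `device_buffer` usage from #775; the buffer size is arbitrary:

```cpp
#include <rmm/cuda_stream.hpp>
#include <rmm/device_buffer.hpp>

int main()
{
  rmm::cuda_stream stream;  // owns a non-default CUDA stream

  // Allocation and copies take an explicit stream instead of implicitly using the default stream.
  rmm::device_buffer buf{1024, stream};  // 1024 bytes, allocated on `stream`
  rmm::device_buffer copy{buf, stream};  // stream-ordered deep copy
  return copy.size() == buf.size() ? 0 : 1;
}
```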
- FindThrust now guards against multiple inclusion by different consumers (#784) @robertmaynard
- Update environment variable used to determine `cuda_version` (#785) @ajschmidt8
- Update `CHANGELOG.md` links for calver (#781) @ajschmidt8
- Merge `branch-0.19` into `branch-21.06` (#779) @ajschmidt8
- Update docs build script (#776) @ajschmidt8
- upgrade spdlog to 1.8.5 (#658) @rongou
- Fix typo in setup.py (#746) @galipremsagar
- Revert "Update
rmm
conda recipe pinning oflibrmm
" (#743) @raydouglass - Update
rmm
conda recipe pinning oflibrmm
(#738) @mike-wendt - RMM doesn't require the CUDA language to be enabled by consumers (#737) @robertmaynard
- Fix setup.py to work in a non-conda environment setup (#733) @galipremsagar
- Fix auto-detecting GPU architectures (#727) @trxcllnt
- CMAKE_CUDA_ARCHITECTURES doesn't change when build-system invokes cmake (#726) @robertmaynard
- Ship memory_resource_wrappers.hpp as package_data (#715) @shwina
- Only include SetGPUArchs in the top-level CMakeLists.txt (#713) @trxcllnt
- Fix unknown CMake command "CPMFindPackage" (#699) @standbyme
- Fix host_memory_resource signature typo (#728) @miguelusque
- Clarify log file name behaviour in docs (#722) @shwina
- Add Cython definitions for device_uvector (#720) @shwina
- Python bindings for `cuda_async_memory_resource` (#718) @shwina
- Fix cython tests (#749) @galipremsagar
- Add requirements for rmm (#739) @galipremsagar
- device_uvector can be used within thrust::optional (#734) @robertmaynard
- arena_memory_resource optimization: disable tracking allocated blocks by default (#732) @rongou
- Remove CMAKE_CURRENT_BINARY_DIR path in rmm's target_include_directories (#731) @trxcllnt
- set CMAKE_CUDA_ARCHITECTURES to OFF instead of undefined (#729) @trxcllnt
- Avoid potential race conditions in device_scalar/device_uvector setters (#725) @harrism
- Update Changelog Link (#723) @ajschmidt8
- Prepare Changelog for Automation (#717) @ajschmidt8
- Update 0.18 changelog entry (#716) @ajschmidt8
- Simplify cmake cuda architectures handling (#709) @robertmaynard
- Build only `compute` for the newest arch in CMAKE_CUDA_ARCHITECTURES (#706) @robertmaynard
- ENH Build with Ninja & Pass ccache variables to conda recipe (#705) @dillon-cullinan
- pool_memory_resource optimization: disable tracking allocated blocks by default (#702) @harrism
- Allow the build directory of rmm to be used for `find_package(rmm)` (#698) @robertmaynard
- Adds a linear accessor to RMM cuda stream pool (#696) @afender
- Fix merge conflicts for #692 (#694) @ajschmidt8
- Fix merge conflicts for #692 (#693) @ajschmidt8
- Remove C++ Wrappers in `memory_resource_adaptors.hpp` Needed by Cython (#662) @mdemoret-nv
- Improve Cython Lifetime Management by Adding References in `DeviceBuffer` (#661) @mdemoret-nv
- Add support for streams in CuPy allocator (#654) @pentschev
- Remove DeviceBuffer synchronization on default stream (#650) @pentschev
- Add a Stream class that wraps CuPy/Numba/CudaStream (#636) @shwina
- SetGPUArchs updated to work around a CMake FindCUDAToolkit issue (#695) @robertmaynard
- Remove duplicate conda build command (#670) @raydouglass
- Update CMakeLists.txt VERSION to 0.18.0 (#665) @trxcllnt
- Fix wrong attribute names leading to DEBUG log build issues (#653) @pentschev
- Correct inconsistencies in README and CONTRIBUTING docs (#682) @robertmaynard
- Enable tag generation for doxygen (#672) @ajschmidt8
- Document that `managed_memory_resource` does not work with NVIDIA vGPU (#656) @harrism
- Enabling/disabling logging after initialization (#678) @shwina
- `cuda_async_memory_resource` built on `cudaMallocAsync` (#676) @harrism
- Create labeler.yml (#669) @jolorunyomi
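A small sketch of opting into the `cudaMallocAsync`-backed resource from #676 above; it assumes a CUDA 11.2+ driver and runtime:

```cpp
#include <rmm/mr/device/cuda_async_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

int main()
{
  // Uses the driver's built-in stream-ordered pool (cudaMallocAsync/cudaFreeAsync).
  rmm::mr::cuda_async_memory_resource async_mr{};
  rmm::mr::set_current_device_resource(&async_mr);
  return 0;
}
```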
- Expose the version string in C++ and Python (#666) @hcho3
- Add a CUDA stream pool (#659) @harrism
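A brief sketch of the stream pool added in #659; the pool size of 8 is an arbitrary example:

```cpp
#include <rmm/cuda_stream_pool.hpp>
#include <rmm/cuda_stream_view.hpp>

int main()
{
  rmm::cuda_stream_pool pool{8};                     // create 8 reusable streams up front
  rmm::cuda_stream_view stream = pool.get_stream();  // hand out a pooled stream
  stream.synchronize();
  return 0;
}
```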
- Add a Stream class that wraps CuPy/Numba/CudaStream (#636) @shwina
- Update stale GHA with exemptions & new labels (#707) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#700) @Ethyling
- Auto-label PRs based on their content (#691) @ajschmidt8
- Prepare Changelog for Automation (#688) @ajschmidt8
- Build.sh use cmake --build to drive build system invocation (#686) @robertmaynard
- Fix failed automerge (#683) @harrism
- Auto-label PRs based on their content (#681) @jolorunyomi
- Build RMM tests/benchmarks with -Wall flag (#674) @trxcllnt
- Remove DeviceBuffer synchronization on default stream (#650) @pentschev
- Simplify `rmm::exec_policy` and refactor Thrust support (#647) @harrism
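A short sketch of the simplified Thrust support from #647: passing `rmm::exec_policy(stream)` to a Thrust algorithm runs it on that stream with temporary storage allocated through RMM. The vector size is illustrative:

```cpp
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/sequence.h>
#include <thrust/sort.h>

int main()
{
  auto stream = rmm::cuda_stream_default;
  rmm::device_uvector<int> v(1000, stream);
  thrust::sequence(rmm::exec_policy(stream), v.begin(), v.end());
  thrust::sort(rmm::exec_policy(stream), v.begin(), v.end());
  stream.synchronize();
  return 0;
}
```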
- PR #609 Adds `polymorphic_allocator` and `stream_allocator_adaptor`
- PR #596 Add `tracking_memory_resource_adaptor` to help catch memory leaks
- PR #608 Add stream wrapper type
- PR #632 Add RMM Python docs
- PR #604 CMake target cleanup, formatting, linting
- PR #599 Make the arena memory resource work better with the producer/consumer mode
- PR #612 Drop old Python `device_array*` API
- PR #603 Always test both legacy and per-thread default stream
- PR #611 Add a note to the contribution guide about requiring 2 C++ reviewers
- PR #615 Improve gpuCI Scripts
- PR #627 Cleanup gpuCI Scripts
- PR #635 Add Python docs build to gpuCI
- PR #592 Add `auto_flush` to `make_logging_adaptor`
- PR #602 Fix `device_scalar` and its tests so that they use the correct CUDA stream
- PR #621 Make `rmm::cuda_stream_default` a `constexpr`
- PR #625 Use `librmm` conda artifact when building `rmm` conda package
- PR #631 Force local conda artifact install
- PR #634 Fix conda uploads
- PR #639 Fix release script version updater based on CMake reformatting
- PR #641 Fix adding "LANGUAGES" after version number in CMake in release script
- PR #529 Add debug logging and fix multithreaded replay benchmark
- PR #560 Remove deprecated `get/set_default_resource` APIs
- PR #543 Add an arena-based memory resource
- PR #580 Install CMake config with RMM
- PR #591 Allow the replay bench to simulate different GPU memory sizes
- PR #594 Adding limiting memory resource adaptor
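A minimal sketch of the limiting adaptor from PR #594, which rejects allocations once a byte budget is exceeded; the 1 GiB limit is illustrative:

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/limiting_resource_adaptor.hpp>

#include <cstddef>

int main()
{
  rmm::mr::cuda_memory_resource upstream;
  auto const limit = std::size_t{1} << 30;  // 1 GiB budget
  rmm::mr::limiting_resource_adaptor<rmm::mr::cuda_memory_resource> limited{&upstream, limit};

  void* p = limited.allocate(1024);  // within budget, forwarded to the upstream resource
  limited.deallocate(p, 1024);
  return 0;
}
```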
- PR #474 Use CMake find_package(CUDAToolkit)
- PR #477 Just use `None` for `strides` in `DeviceBuffer`
- PR #528 Add maximum_pool_size parameter to reinitialize API
- PR #532 Merge free lists in pool_memory_resource to defragment before growing from upstream
- PR #537 Add CMake option to disable deprecation warnings
- PR #541 Refine CMakeLists.txt to make it easy to import by external projects
- PR #538 Upgrade CUB and Thrust to the latest commits
- PR #542 Pin conda spdlog versions to 1.7.0
- PR #550 Remove CXX11 ABI handling from CMake
- PR #578 Switch thrust to use the NVIDIA/thrust repo
- PR #553 CMake cleanup
- PR #556 By default, don't create a debug log file unless there are warnings/errors
- PR #561 Remove CNMeM and make RMM header-only
- PR #565 CMake: Simplify gtest/gbench handling
- PR #566 CMake: use CPM for thirdparty dependencies
- PR #568 Upgrade googletest to v1.10.0
- PR #572 CMake: prefer locally installed thirdparty packages
- PR #579 CMake: handle thrust via target
- PR #581 Improve logging documentation
- PR #585 Update ci/local/README.md
- PR #587 Replaced `move` with `std::move`
- PR #588 Use installed C++ RMM in python build
- PR #601 Make maximum pool size truly optional (grow until failure)
- PR #545 Fix build to support using `clang` as the host compiler
- PR #534 Fix `pool_memory_resource` failure when init and max pool sizes are equal
- PR #546 Remove CUDA driver linking and correct NVTX macro.
- PR #569 Correct `device_scalar::set_value` to pass host value by reference to avoid copying from invalid value
- PR #559 Fix `align_down` to only change unaligned values.
- PR #577 Fix CMake `LOGGING_LEVEL` issue which caused verbose logging / performance regression.
- PR #582 Fix handling of per-thread default stream when not compiled for PTDS
- PR #590 Add missing `CODE_OF_CONDUCT.md`
- PR #595 Fix pool_mr example in README.md
- PR #375 Support out-of-band buffers in Python pickling
- PR #391 Add `get_default_resource_type`
- PR #396 Remove deprecated RMM APIs
- PR #425 Add CUDA per-thread default stream support and thread safety to `pool_memory_resource`
- PR #436 Always build and test with per-thread default stream enabled in the GPU CI build
- PR #444 Add `owning_wrapper` to simplify lifetime management of resources and their upstreams
- PR #449 Stream-ordered suballocator base class and per-thread default stream support and thread safety for `fixed_size_memory_resource`
- PR #450 Add support for new build process (Project Flash)
- PR #457 New `binning_memory_resource` (replaces `hybrid_memory_resource` and `fixed_multisize_memory_resource`).
- PR #458 Add `get/set_per_device_resource` to better support multi-GPU per process applications
- PR #466 Deprecate CNMeM.
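A compact sketch of the per-device resource API from PR #458 above, which lets each GPU in a process use its own memory resource; device 0 is used as an example:

```cpp
#include <rmm/cuda_device.hpp>
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

int main()
{
  rmm::mr::cuda_memory_resource mr;
  rmm::cuda_device_id const device{0};

  rmm::mr::set_per_device_resource(device, &mr);             // resource used for device 0
  auto* current = rmm::mr::get_per_device_resource(device);  // query it back
  return current == &mr ? 0 : 1;
}
```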
- PR #489 Move `cudf._cuda` into `rmm._cuda`
- PR #504 Generate `gpu.pxd` based on cuda version as a preprocessor step
- PR #506 Upload rmm package per version python-cuda combo
- PR #428 Add the option to automatically flush memory allocate/free logs
- PR #378 Use CMake `FetchContent` to obtain latest release of `cub` and `thrust`
- PR #377 A better way to fetch `spdlog`
- PR #372 Use CMake `FetchContent` to obtain `cnmem` instead of git submodule
- PR #382 Rely on NumPy arrays for out-of-band pickling
- PR #386 Add short commit to conda package name
- PR #401 Update `get_ipc_handle()` to use cuda driver API
- PR #404 Make all memory resources thread safe in Python
- PR #402 Install dependencies via rapids-build-env
- PR #405 Move doc customization scripts to Jenkins
- PR #427 Add DeviceBuffer.release() cdef method
- PR #414 Add element-wise access for device_uvector
- PR #421 Capture thread id in logging and improve logger testing
- PR #426 Added multi-threaded support to replay benchmark
- PR #429 Fix debug build and add new CUDA assert utility
- PR #435 Update conda upload versions for new supported CUDA/Python
- PR #437 Test with `pickle5` (for older Python versions)
- PR #443 Remove thread safe adaptor from PoolMemoryResource
- PR #445 Make all resource operators/ctors explicit
- PR #447 Update Python README with info about DeviceBuffer/MemoryResource and external libraries
- PR #456 Minor cleanup: always use rmm/-prefixed includes
- PR #461 cmake improvements to be more target-based
- PR #468 update past release dates in changelog
- PR #486 Document relationship between active CUDA devices and resources
- PR #493 Rely on C++ lazy Memory Resource initialization behavior instead of initializing in Python
- PR #433 Fix python imports
- PR #400 Fix segfault in RANDOM_ALLOCATIONS_BENCH
- PR #383 Explicitly require NumPy
- PR #398 Fix missing head flag in merge_blocks (pool_memory_resource) and improve block class
- PR #403 Mark Cython `memory_resource_wrappers` `extern` as `nogil`
- PR #406 Sets Google Benchmark to a fixed version, v1.5.1.
- PR #434 Fix issue with incorrect docker image being used in local build script
- PR #463 Revert cmake change for cnmem header not being added to source directory
- PR #464 More completely revert cnmem.h cmake changes
- PR #473 Fix initialization logic in pool_memory_resource
- PR #479 Fix usage of block printing in pool_memory_resource
- PR #490 Allow importing RMM without initializing CUDA driver
- PR #484 Fix device_uvector copy constructor compilation error and add test
- PR #498 Max pool growth less greedy
- PR #500 Use tempfile rather than hardcoded path in `test_rmm_csv_log`
- PR #511 Specify `--basetemp` for `py.test` run
- PR #509 Fix missing : before LINE in throw string of RMM_CUDA_TRY
- PR #510 Fix segfault in pool_memory_resource when a CUDA stream is destroyed
- PR #525 Patch Thrust to workaround `CUDA_CUB_RET_IF_FAIL` macro clearing CUDA errors
- PR #317 Provide External Memory Management Plugin for Numba
- PR #362 Add spdlog as a dependency in the conda package
- PR #360 Support logging to stdout/stderr
- PR #341 Enable logging
- PR #343 Add in option to statically link against cudart
- PR #364 Added new uninitialized device vector type, `device_uvector`
- PR #369 Use CMake `FetchContent` to obtain `spdlog` instead of vendoring
- PR #366 Remove installation of extra test dependencies
- PR #354 Add CMake option for per-thread default stream
- PR #350 Add .clang-format file & format all files
- PR #358 Fix typo in `rmm_cupy_allocator` docstring
- PR #357 Add Docker 19 support to local gpuci build
- PR #365 Make .clang-format consistent with cuGRAPH and cuDF
- PR #371 Add docs build script to repository
- PR #363 Expose `memory_resources` in Python
- PR #373 Fix build.sh
- PR #346 Add clearer exception message when RMM_LOG_FILE is unset
- PR #347 Mark rmmFinalizeWrapper nogil
- PR #348 Fix unintentional use of pool-managed resource.
- PR #367 Fix flake8 issues
- PR #368 Fix `clang-format` missing comma bug
- PR #370 Fix stream and mr use in `device_buffer` methods
- PR #379 Remove deprecated calls from synchronization.cpp
- PR #381 Remove test_benchmark.cpp from cmakelists
- PR #392 SPDLOG matches other header-only acquisition patterns
- PR #253 Add `frombytes` to convert `bytes`-like to `DeviceBuffer`
- PR #252 Add `__sizeof__` method to `DeviceBuffer`
- PR #258 Define pickling behavior for `DeviceBuffer`
- PR #261 Add `__bytes__` method to `DeviceBuffer`
- PR #262 Moved device memory resource files to `mr/device` directory
- PR #266 Drop `rmm.auto_device`
- PR #268 Add Cython/Python `copy_to_host` and `to_device`
- PR #272 Add `host_memory_resource`.
- PR #273 Moved device memory resource tests to `device/` directory.
- PR #274 Add `copy_from_host` method to `DeviceBuffer`
- PR #275 Add `copy_from_device` method to `DeviceBuffer`
- PR #283 Add random allocation benchmark.
- PR #287 Enabled CUDA CXX11 for unit tests.
- PR #292 Revamped RMM exceptions.
- PR #297 Use spdlog to implement `logging_resource_adaptor`.
- PR #303 Added replay benchmark.
- PR #319 Add `thread_safe_resource_adaptor` class.
- PR #314 New suballocator memory_resources.
- PR #330 Fixed incorrect name of `stream_free_blocks_` debug symbol.
- PR #331 Move to C++14 and deprecate legacy APIs.
- PR #246 Type `DeviceBuffer` arguments to `__cinit__`
- PR #249 Use `DeviceBuffer` in `device_array`
- PR #255 Add standard header to all Cython files
- PR #256 Cast through `uintptr_t` to `cudaStream_t`
- PR #254 Use `const void*` in `DeviceBuffer.__cinit__`
- PR #257 Mark Cython-exposed C++ functions that raise
- PR #269 Doc sync behavior in `copy_ptr_to_host`
- PR #278 Allocate a `bytes` object to fill up with RMM log data
- PR #280 Drop allocation/deallocation of `offset`
- PR #282 `DeviceBuffer` use default constructor for size=0
- PR #296 Use CuPy's `UnownedMemory` for RMM-backed allocations
- PR #310 Improve `device_buffer` allocation logic.
- PR #309 Sync default stream in `DeviceBuffer` constructor
- PR #326 Sync only on copy construction
- PR #308 Fix typo in README
- PR #334 Replace `rmm_allocator` for Thrust allocations
- PR #345 Remove stream synchronization from `device_scalar` constructor and `set_value`
- PR #298 Remove RMM_CUDA_TRY from cuda_event_timer destructor
- PR #299 Fix assert condition blocking debug builds
- PR #300 Fix host mr_tests compile error
- PR #312 Fix libcudf compilation errors due to explicit defaulted device_buffer constructor
- PR #218 Add `_DevicePointer`
- PR #219 Add method to copy `device_buffer` back to host memory
- PR #222 Expose free and total memory in Python interface
- PR #235 Allow construction of `DeviceBuffer` with a `stream`
- PR #214 Add codeowners
- PR #226 Add some tests of the Python `DeviceBuffer`
- PR #233 Reuse the same `CUDA_HOME` logic from cuDF
- PR #234 Add missing `size_t` in `DeviceBuffer`
- PR #239 Cleanup `DeviceBuffer`'s `__cinit__`
- PR #242 Special case 0-size `DeviceBuffer` in `tobytes`
- PR #244 Explicitly force `DeviceBuffer.size` to an `int`
- PR #247 Simplify casting in `tobytes` and other cleanup
- PR #215 Catch polymorphic exceptions by reference instead of by value
- PR #221 Fix segfault calling rmmGetInfo when uninitialized
- PR #225 Avoid invoking Python operations in c_free
- PR #230 Fix duplicate symbol issues with `copy_to_host`
- PR #232 Move `copy_to_host` doc back to header file
- PR #106 Added multi-GPU initialization
- PR #167 Added value setter to `device_scalar`
- PR #163 Add Cython bindings to `device_buffer`
- PR #177 Add `__cuda_array_interface__` to `DeviceBuffer`
- PR #198 Add `rmm.rmm_cupy_allocator()`
- PR #161 Use `std::atexit` to finalize RMM after Python interpreter shutdown
- PR #165 Align memory resource allocation sizes to 8-byte
- PR #171 Change public API of RMM to only expose `reinitialize(...)`
- PR #175 Drop `cython` from run requirements
- PR #169 Explicit stream argument for device_buffer methods
- PR #186 Add nbytes and len to DeviceBuffer
- PR #188 Require kwargs in `DeviceBuffer`'s constructor
- PR #194 Drop unused imports from `device_buffer.pyx`
- PR #196 Remove unused CUDA conda labels
- PR #200 Simplify DeviceBuffer methods
- PR #174 Make `device_buffer` default ctor explicit to work around type_dispatcher issue in libcudf.
- PR #170 Always build librmm and rmm, but conditionally upload based on CUDA / Python version
- PR #182 Prefix `DeviceBuffer`'s C functions
- PR #189 Drop `__reduce__` from `DeviceBuffer`
- PR #193 Remove thrown exception from `rmm_allocator::deallocate`
- PR #224 Slice the CSV log before converting to bytes
- PR #99 Added `device_buffer` class
- PR #133 Added `device_scalar` class
- PR #123 Remove driver install from ci scripts
- PR #131 Use YYMMDD tag in nightly build
- PR #137 Replace CFFI python bindings with Cython
- PR #127 Use Memory Resource classes for allocations
- PR #107 Fix local build generated file ownerships
- PR #110 Fix Skip Test Functionality
- PR #125 Fixed order of private variables in LogIt
- PR #139 Expose `_make_finalizer` python API needed by cuDF
- PR #142 Fix ignored exceptions in Cython
- PR #146 Fix rmmFinalize() not freeing memory pools
- PR #149 Force finalization of RMM objects before RMM is finalized (Python)
- PR #154 Set ptr to 0 on rmm::alloc error
- PR #157 Check if initialized before freeing for Numba finalizer and use `weakref` instead of `atexit`
- PR #96 Added `device_memory_resource` for beginning of overhaul of RMM design
- PR #103 Add and use unified build script
- PR #111 Streamline CUDA_REL environment variable
- PR #113 Handle ucp.BufferRegion objects in auto_device
...
- PR #95 Add skip test functionality to build.sh
...
- PR #92 Update docs version
- PR #67 Add random_allocate microbenchmark in tests/performance
- PR #70 Create conda environments and conda recipes
- PR #77 Add local build script to mimic gpuCI
- PR #80 Add build script for docs
- PR #76 Add cudatoolkit conda dependency
- PR #84 Use latest release version in update-version CI script
- PR #90 Avoid using c++14 auto return type for thrust_rmm_allocator.h
- PR #68 Fix signed/unsigned mismatch in random_allocate benchmark
- PR #74 Fix rmm conda recipe librmm version pinning
- PR #72 Remove unnecessary _BSD_SOURCE define in random_allocate.cpp
- PR #43 Add gpuCI build & test scripts
- PR #44 Added API to query whether RMM is initialized and with what options.
- PR #60 Default to CXX11_ABI=ON
- PR #58 Eliminate unreliable check for change in available memory in test
- PR #49 Fix pep8 style errors detected by flake8
- PR #2 Added CUDA Managed Memory allocation mode
- PR #12 Enable building RMM as a submodule
- PR #13 CMake: Added CXX11ABI option and removed Travis references
- PR #16 CMake: Added PARALLEL_LEVEL environment variable handling for GTest build parallelism (matches cuDF)
- PR #17 Update README with v0.5 changes including Managed Memory support
- PR #10 Change cnmem submodule URL to use https
- PR #15 Temporarily disable hanging AllocateTB test for managed memory
- PR #28 Fix invalid reference to local stack variable in `rmm::exec_policy`
- PR #1 Spun off RMM from cuDF into its own repository.
- CUDF PR #472 RMM: Created centralized rmm::device_vector alias and rmm::exec_policy
- CUDF PR #465 Added templated C++ API for RMM to avoid explicit cast to `void**`
RMM was initially implemented as part of cuDF, so we include the relevant changelog history below.
- PR #336 CSV Reader string support
- CUDF PR #333 Add Rapids Memory Manager documentation
- CUDF PR #321 Rapids Memory Manager adds file/line location logging and convenience macros
These were initial releases of cuDF based on previously separate pyGDF and libGDF libraries. RMM was initially implemented as part of libGDF.