Releases · nv-legate/cunumeric

11 Sep 20:36

manopapad

v24.06.01

427da00

v24.06.01 Latest

This is a patch release, and includes the following fixes:

Fix for nv-legate/legate#947
Fix package dependencies (cuda and openblas)

x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/cunumeric.

Documentation for this release can be found at https://docs.nvidia.com/cunumeric/24.06/.

03 Jul 22:35

manopapad

v24.06.00

510e24a

v24.06.00

This release ports cuNumeric to the C++-based Legate-Core. Additionally, it includes the following new features:

np.linalg.qr, np.linalg.svd (single-GPU support only)
"where" argument for unary operations
np.select
np.flipud, np.fliplr
np.cov
np.load (initial, unoptimized implementation)
np.average
np.logical_and/or.reduce
np.digitize
np.diff
np.linalg.cholesky, np.linalg.solve (multi-GPU support, based on cuSolverMp -- not included in conda packages, requires a manual build)
C++-based ndarray class (experimental support)
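
A few of the routines listed above can be exercised with a short script. This is only a sketch: it assumes cuNumeric is imported as a drop-in replacement for NumPy and falls back to plain NumPy when cuNumeric is not installed; the input values are made up for illustration.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# np.linalg.qr: the factors should multiply back to the input.
q, r = np.linalg.qr(a)
qr_ok = bool(np.allclose(q @ r, a))

# np.select: pick from the first choice whose condition holds.
x = np.arange(6)
labels = np.select([x < 2, x > 3], [x * 10, x * 100], default=-1)

# np.diff, np.average, np.digitize on small 1-D inputs.
steps = np.diff(np.array([1, 4, 9, 16]))
mean = float(np.average(np.array([1.0, 2.0, 3.0, 4.0])))
bins = np.digitize(np.array([0.5, 1.5, 2.5]), np.array([0.0, 1.0, 2.0, 3.0]))

# "where" argument on a unary operation: masked-out slots keep the
# value already stored in `out`.
out = np.zeros(4)
mask = np.array([True, False, True, False])
np.sqrt(np.array([4.0, 9.0, 16.0, 25.0]), out=out, where=mask)

# np.flipud reverses the row order; np.logical_and.reduce folds a boolean array.
flipped_first_row = np.flipud(a)[0]
all_true = bool(np.logical_and.reduce(np.array([True, True, False])))

print(qr_ok, labels, steps, mean, bins, out, flipped_first_row, all_true)
```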

x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/cunumeric.

Documentation for this release can be found at https://docs.nvidia.com/cunumeric/24.06/.

Known issues

Including the nvidia conda channel in an environment with cunumeric may end up pulling cutensor 2.0, even though the cunumeric packages explicitly request cutensor 1.7. This can cause error messages like this:

OSError: libcutensor.so.1: cannot open shared object file: No such file or directory

This is not an issue with cuNumeric, but with incorrect constraints on the cutensor packages on the nvidia channel. Please avoid including the nvidia conda channel in any conda environment including cunumeric.
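
One way to check an environment for this problem before importing cuNumeric is to ask the dynamic loader for the library named in the error message. The helper below is a hypothetical diagnostic written for this note, not part of cuNumeric; it relies only on the standard-library `ctypes` module.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Same exception family as the error quoted above.
        return False
    return True


# "libcutensor.so.1" is the name taken from the error message above.
cutensor_v1_available = can_load("libcutensor.so.1")
print("libcutensor.so.1 loadable:", cutensor_v1_available)
```

If this prints `False` in an environment where cuNumeric is installed, the cutensor package was probably replaced by an incompatible version, and rebuilding the environment without the nvidia channel is the fix suggested above.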

21 Nov 01:47

marcinz

v23.11.00

d91f17c

v23.11.00

This release contains performance improvements to the variance operation, and a multi-dimensional Cholesky implementation.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

Added variance as a unary reduction by @jjwilke in #593
Add batched cholesky implementation and tests by @jjwilke in #1029
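
As a rough illustration of the two features above, the sketch below computes a variance and factors a small stack of matrices in one call. It assumes cuNumeric mirrors NumPy's `var` and batched `linalg.cholesky` semantics, and falls back to NumPy when cuNumeric is not installed; the matrices are made-up symmetric positive-definite examples.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

# Variance as a single reduction over the array.
spread = float(np.var(np.array([1.0, 2.0, 3.0, 4.0])))

# A stack of two 2x2 symmetric positive-definite matrices, shape (2, 2, 2).
stack = np.array(
    [
        [[4.0, 2.0], [2.0, 3.0]],
        [[9.0, 3.0], [3.0, 5.0]],
    ]
)

# Batched Cholesky: one lower-triangular factor per matrix in the stack.
lower = np.linalg.cholesky(stack)

# Each factor times its transpose should reproduce the matching input.
rebuilt = lower @ np.swapaxes(lower, -1, -2)
cholesky_ok = bool(np.allclose(rebuilt, stack))

print(spread, lower.shape, cholesky_ok)
```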

🐛 Bug Fixes

Replacing set with OrderedSet to avoid control-replication violations by @ipdemes in #1054
Inline boolean operators in NumPy are bitwise, not logical by @manopapad in #1057
Fix #1065 ("where" fails with IndexError) by @manopapad in #1067
Fixes #1069, #1070 (minor einsum bugs) by @manopapad in #1072
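
The distinction behind the "bitwise, not logical" fix (#1057) shows up on integer inputs, where the two kinds of operator give different answers. The sketch below demonstrates NumPy's semantics, which the fix presumably aligns cuNumeric with; it falls back to NumPy when cuNumeric is not installed.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

a = np.array([1, 2, 4])
b = np.array([2, 2, 3])

# The inline operator works bit by bit: 1 & 2 == 0, 2 & 2 == 2, 4 & 3 == 0.
bitwise = a & b

# The logical function treats every non-zero element as True.
logical = np.logical_and(a, b)

print(bitwise, logical)
```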

📖 Documentation

Suggest using mamba over conda by @manopapad in #1068

Full Changelog: v23.09.00...v23.11.00

Contributors

jjwilke, manopapad, and ipdemes

03 Oct 15:23

marcinz

v23.09.00

e66a063

v23.09.00

This release adds support for the quantile API, and includes some performance and documentation improvements (notably a "Best Practices" guide).

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

Quantile Implementation by @aschaffer in #664
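
A minimal usage sketch for the quantile API, assuming it follows NumPy's `np.quantile` signature and default linear interpolation; the script falls back to NumPy when cuNumeric is not installed.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A single quantile: the median of five evenly spaced values.
median = float(np.quantile(data, 0.5))

# Several quantiles at once.
quartiles = np.quantile(data, np.array([0.25, 0.75]))

print(median, quartiles)
```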

🛠️ Improvements

Add missing openmp variants to BitGenerator and UniqueReduce by @rohany in #1010
Histogram refactor by @aschaffer in #1003

📖 Documentation

Add best practices info to sphinx docs by @bryevdv in #1048

🐛 Bug Fixes

Missing alignment on histogram call by @manopapad in #999
Fix for control replication violation in test by @ipdemes in #1005
Fix build instructions link by @bryevdv in #1014
Add back None as an accepted value for axis on some type sigs by @manopapad in #1017
If a scalar ufunc arg is cn.ndarray use its type directly by @manopapad in #1011
Skip the docstrings for functions pulled from cloned modules by @manopapad in #1024
Fix random test failures in CPU-only runs by @manopapad in #1025
Don't cast histogram to int64 when density=True by @manopapad in #1042
Explicitly cast result of shift binary operators by @manopapad in #1046
Remove use of deprecated np.find_common_type by @manopapad in #1045

New Contributors

@ajschmidt8 made their first contribution in #1035

Full Changelog: v23.07.00...v23.09.00

Contributors

manopapad, bryevdv, and 4 other contributors

25 Jul 04:51

marcinz

v23.07.00

d413db2

v23.07.00

This release adds support for histogram, broadcast* and various nan* APIs. It also includes performance improvements to the FFT functions and cleanups in ufunc support.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

Implement broadcast routines by @bryevdv in #759
Sanitize unary reductions that have NaNs by @shriram-jagan in #925
Histogram Functionality by @aschaffer in #983
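
The sketch below touches one routine from each of these groups. The release notes do not list which nan* functions were added, so `nansum` is used here as an assumed example; the script falls back to NumPy when cuNumeric is not installed.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

# Histogram with explicit bin edges [0, 2), [2, 4), [4, 6].
counts, edges = np.histogram(np.array([1, 2, 1, 3, 5]), bins=np.array([0, 2, 4, 6]))

# Broadcast a length-3 vector to a 2x3 view.
tiled = np.broadcast_to(np.array([1, 2, 3]), (2, 3))

# NaN-aware reduction: NaN entries are skipped rather than propagated.
total = float(np.nansum(np.array([1.0, float("nan"), 2.0])))

print(counts, tiled, total)
```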

🛠️ Improvements

Add ufunc methods by @bryevdv in #834
Support of the shape argument in empty_like() & Co. by @madsbk in #845
Add support for Python 3.11 (#830) by @marcinz in #837
Ensure ufunc/function dispatching is narrow by @seberg in #977
Fft improvements by @mfoerste4 in #732

📖 Documentation

Note new minimum CUDA requirements for conda packages by @manopapad in #875

🐛 Bug Fixes

Fix bugs in concatenate and stack APIs. by @robinwnv in #844
Fixes #858 by @manopapad in #859
Fix concatenate and *stack APIs to support scalars(#818, #839) by @robinwnv in #866
Avoid following compiler symlinks by @manopapad in #880
Fix for some binary operators on float16 by @magnatelee in #889
WAR for TBLIS compiler detection while upstream PR is pending by @manopapad in #890
Also build CPU-only packages for haswell (#869) by @marcinz in #882
Fix array API(#885). by @robinwnv in #910
Fix unit tests by @magnatelee in #920
Fix an incorrect type by @marcinz in #931
Use correct type, to avoid int narrowing by @manopapad in #941
Fix cunumeric.arange issues by @yimoj in #940
Use the right type for scalar arguments by @magnatelee in #942
Fall back to NumPy eagerly on RandomState methods by @manopapad in #959
Fix bugs in random integer functions by @manopapad in #966
Resolve numpy 1.25 issues by @bryevdv in #973
Set lib_dir explicitly to lib/, even on RHEL by @manopapad in #971
fixing putmask logic for scalar inputs by @ipdemes in #980
fixing cuda error by @ipdemes in #978
Change arg to LLONG_MIN to make it consistent with python. by @shriram-jagan in #986
Missing alignment on histogram call by @manopapad in #1000

New Contributors

@madsbk made their first contribution in #845
@sandeepd-nv made their first contribution in #899
@seberg made their first contribution in #977
@shriram-jagan made their first contribution in #988
@aschaffer made their first contribution in #983

Full Changelog: v23.03.00...v23.07.00

Contributors

seberg, manopapad, and 11 other contributors

15 Mar 20:02

marcinz

v23.03.00

9ac887b

v23.03.00

This is the beta release of cuNumeric.

This release is focused on bug fixes, code clean-up and documentation updates, in preparation for entering beta status.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🐛 Bug Fixes

Do reductions properly in tensor contraction tasks by @magnatelee in #803
Seed the NumPy RNG at the start of every test by @manopapad in #792
Fix handling of negative axis in np.repeat by @manopapad in #821
Fix for #720 (by @lightsighter) by @manopapad in #721
Ensure unary_func seeding is deterministic across processes by @manopapad in #825

🛠️ Improvements

Update the architectures built in conda package by @marcinz in #770
Use thrust::cuda::par_nosync if available by @magnatelee in #780
Preemptively convert to np.ndarray on NumPy fallback by @manopapad in #802
Removing all Legion references from the code by @magnatelee in #811
Remove exception throwing from RNG code by @manopapad in #815
Pin legate to a specific commit by @trxcllnt in #824
Add support for Python 3.11 by @m3vaz in #830

📖 Documentation

[WIP] Docs refresh by @bryevdv in #805

Full Changelog: v23.01.00...v23.03.00

Contributors

trxcllnt, manopapad, and 5 other contributors

31 Jan 03:38

marcinz

v23.01.00

2455b55

v23.01.00

This release introduces support for the put and putmask operations, adds an optimized implementation for the common case of advanced indexing using a single (possibly broadcasted) boolean array, includes more information in the tags of unary/binary operations on profiles (for easier cross-referencing with the source script), and adds some small improvements to OpenMP execution.
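
For readers unfamiliar with these operations, here is a small sketch of `put`, `putmask`, and indexing with a single boolean array. It assumes the cuNumeric versions behave like their NumPy counterparts and falls back to NumPy when cuNumeric is not installed.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

a = np.arange(6)

# put: write values at the given flat indices, in place.
np.put(a, [0, 2], [10, 20])
after_put = a.copy()

# putmask: overwrite every element where the mask is True, in place.
np.putmask(a, a > 4, 0)

# Advanced indexing with one boolean array of the same shape as the data.
b = np.arange(6).reshape(2, 3)
mask = b % 2 == 0
evens = b[mask]
b[mask] = -1

print(after_put, a, evens, b)
```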

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🐛 Bug Fixes

Make the code compile with bounds checks by @magnatelee in #648
MatVec & MatVecMul use reduction stores, not outputs by @manopapad in #646
Set default generator based on whether ninja is available by @jjwilke in #602
Allow args to be passed by position and name in auto_convert by @manopapad in #640
Force positive values for log and sqrt tests by @jjwilke in #580
Eliminate empty kernel launch in cunumeric.unique by @magnatelee in #675
Make install.py reconfigure editable installs when build type changes by @trxcllnt in #670
Fix for #684 by @magnatelee in #686
Follow up on PR #671 by @ipdemes in #677
More argument checks for bincount by @magnatelee in #711
Fix a typo in unique.cu indexing by @manopapad in #713
guard all2all from empty transfer by @mfoerste4 in #727
src/cunumeric/item: add openmp variants for write/read tasks by @rohany in #740
Fix CI failures due to numpy 1.24 upgrade by @manopapad in #745
Fix timing for CuPy tests by @manopapad in #747
Don't turn on cuNumeric debug checks on debug-rel builds by @manopapad in #753
Move pip uninstall step before CMake is run instead of after. by @trxcllnt in #760
Force conda version of cutensor by @marcinz in #765
handle numpy 'builtins' properly for coverage by @bryevdv in #766

🚀 New Features

Implementing PUT routine by @ipdemes in #582
Implementing Putmask by @ipdemes in #667

🛠️ Improvements

Move test driver code to legate.core by @bryevdv in #627
Remove --install-dir option by @bryevdv in #656
Updates for new script-based conda env generation by @manopapad in #651
Log operator names of unary and binary operations using annotations by @magnatelee in #679
Regenerate install_info.py on every build by @trxcllnt in #705
Fixes for buffer allocations by @magnatelee in #706
Clean up the basic build instructions by @manopapad in #741
Refactor benchmarks by @manopapad in #567
Improving performance for some special cases of advanced indexing by @ipdemes in #731
Pass CMAKE_GENERATOR to scikit-build by @trxcllnt in #750
Change the default CPU architecture to haswell by @marcinz in #762

Full Changelog: v22.10.00...v23.01.00

Contributors

jjwilke, trxcllnt, and 7 other contributors

13 Oct 23:53

marcinz

v22.10.00

81ad156

v22.10.00

The biggest change in Release 22.10 is a new build infrastructure using CMake and scikit-build. The new build system brings several benefits including robust build dependency tracking and compliance with Python site-packages. This release includes several new search and indexing operators, fixes for several performance and correctness bugs, and provenance tracking for top-level and ndarray routines in execution profiles.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

Argwhere and flatnonzero by @mfoerste4 in #525
added extract and place via advanced indexing by @mfoerste4 in #536
Fill diagonal by @ipdemes in #473
Single processor implementation for linalg.solve by @magnatelee in #568
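
The new search and indexing operators can be tried out as follows. This is a sketch that assumes NumPy-compatible signatures and falls back to NumPy when cuNumeric is not installed; the arrays are arbitrary examples.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

x = np.array([[0, 3], [5, 0]])

# Coordinates and flat positions of the non-zero entries.
coords = np.argwhere(x)
flat = np.flatnonzero(x)

# extract reads the elements selected by a condition;
# place writes values into the selected positions, in place.
positives = np.extract(x > 0, x)
np.place(x, x == 0, [-1])

# fill_diagonal writes a scalar along the main diagonal, in place.
m = np.zeros((3, 3))
np.fill_diagonal(m, 7)

# linalg.solve: solve A @ sol = rhs for sol.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([3.0, 5.0])
sol = np.linalg.solve(A, rhs)
solve_ok = bool(np.allclose(A @ sol, rhs))

print(coords, flat, positives, x, m, sol)
```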

🛠️ Improvements

adding support for array shape () passed as an index argument in advanced indexing by @ipdemes in #486
Refactor test driver for cpu/gpu sharding by @bryevdv in #451
Collate test output to allow workers > 1 with verbose output by @bryevdv in #507
Ensure test.py --use flag fully overrides USE_* envvars by @manopapad in #524
Enhance two integration tests by @robinw0928 in #511
Add typing to array.py by @bryevdv in #478
Update test runner for osx by @bryevdv in #529
Don't blindly trust user-supplied bincount.minlength by @manopapad in #523
Make reduced-precision cuBLAS mode opt-in by @manopapad in #519
Fix reciprocal tests for zero values and improve test value customization (#467) by @marcinz in #537
Refactor test runner to support more pinning options by @bryevdv in #535
Remove dead code in bincount by @magnatelee in #546
Make the validation condition for random distributions lenient by @magnatelee in #550
src/cunumeric: handle high number of bins in GPU bincount by @rohany in #526
Construct NumPy arrays correctly from 0D deferred arrays backed by region fields by @magnatelee in #551
Collect test failure details at the end by @bryevdv in #556
Simplify some thunk conversion helpers by @manopapad in #553
Fix a compiler warning by @magnatelee in #555
Add option to disable CPU pinning in tests by @bryevdv in #558
Use the new mapper registration to enable detailed mapper logging by @magnatelee in #570
src/cunumeric/search: make nonzero not always allocate SYS_MEM buffers by @rohany in #572
add negative test case in test_array_split.py by @xialu00 in #545
add some test cases for test_arg_reduce.py by @xialu00 in #575
Testcase-add test cases for test_flip and test_indices by @xialu00 in #579
Refactor scalar reductions to use common execution policy by @jjwilke in #573
Sanitize k for the eye operator by @magnatelee in #586
Add CMake build for C++ and scikit-build infrastructure for Python package installation by @jjwilke in #514
Enhance test_block.py and test_eye.py by @robinw0928 in #578
Testcase add test cases for test_fill.py and test_ndim.py by @xialu00 in #588
Remove run dependency on curand by @marcinz in #520
Use Legion Fills when possible by @manopapad in #604
Support building with GASNet-Ex and MPI backends by @manopapad in #610
Provenance tracking for cuNumeric operators by @magnatelee in #596
Fix tests utils to make --directory work correctly. by @robinw0928 in #592
Fix a compiler warning by @magnatelee in #594
Enhance test_diag_indices.py and test_flatten.py. by @robinw0928 in #609
cuNumeric doesn't need nested provenance tracking by @magnatelee in #617
Add RuntimeError exception to legate.time by @robinw0928 in #618
Stop instantiating min and max reduction ops for complex types by @magnatelee in #621
Mark temporary conversion outputs as linear for eager storage recycling by @magnatelee in #608
Make the negative test on fill robust across Python versions by @magnatelee in #619
Enhance mask_indices and move_axis by @robinw0928 in #622
src/cunumeric/matrix: stop including coll.h in solve_template.inl by @rohany in #620

🐛 Bug Fixes

Fix performance bugs in scalar reductions by @magnatelee in #509
Don't use internal LAPACK function names by @manopapad in #522
Bug fixes for advanced indexing by @magnatelee in #532
Handle the case where LAPACK_*potrf is a macro, not a function by @manopapad in #527
fix mypy issue w/ np methods by @bryevdv in #542
Fix buggy complex-to-bool conversions and add correctness tests for astype by @magnatelee in #549
fixing advanced indexing operation for empty arrays by @ipdemes in #504
Do not link curand by @marcinz in #541
Fixing issues with advanced_indexing_kernel by @ipdemes in #557
fixing another corner case for advanced indexing by @ipdemes in #554
Fix OSX test shard generation by @bryevdv in #563
fix error print in test_unary_ufunc by @jjwilke in #566
Add NAN handling to convert() needed for some prefix routines with integer outputs. by @rkarim2 in #502
Fixing logic for slicing by @ipdemes in #574
Fix linalg.solve when inputs are scalars by @magnatelee in #585
Allow casting in cn.dot, to match numpy's behavior by @manopapad in #598
Add linalg.solve to the cmake build by @magnatelee in #603
Invoke eye with read-write privilege, not write-discard by @manopapad in #616
Fix a bug in scalar reduction launching kernels with empty domains by @magnatelee in #606

📖 Documentation

Added note to prefix documentation for corner cases where cunumeric results can diverge from numpy by @rkarim2 in #528
updating documentation by @ipdemes in #614
Add missing docs symlink by @bryevdv in #635

Contributors

jjwilke, manopapad, and 9 other contributors

09 Aug 03:38

marcinz

v22.08.00

ece6585

v22.08.00

Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

New Features

Adding support for ND output regions in Advanced Indexing task by @ipdemes in #370
added support for 'searchsorted' by @mfoerste4 in #414
np.packbits and np.unpackbits by @magnatelee in #427
Implementation of atleast_{1,2,3}d by @sbak5 in #404
Implementing cunumeric.random.BitGenerator by @fduguet-nv in #254
Adding support for some simple _indices routines by @ipdemes in #417
adding mask_indices routine by @ipdemes in #426
Random advanced distributions by @fduguet-nv in #470
Distributed nd sort for cpu/omp by @mfoerste4 in #437
Initial implementation of scan routines. by @rkarim2 in #425
Adding support for take_along_axis and put_along_axis by @ipdemes in #436
cunumeric.ndim by @magnatelee in #495
Add support for curand conda package build (cherry pick #510) by @marcinz in #512
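
Several of the routines listed above have direct NumPy equivalents, so a quick sanity check might look like the sketch below. It assumes NumPy-compatible signatures, uses `cumsum` as a stand-in for the scan routines, and falls back to NumPy when cuNumeric is not installed.

```python
# Sketch only: use cuNumeric if available, otherwise plain NumPy.
try:
    import cunumeric as np
except ImportError:
    import numpy as np

# searchsorted: insertion points that keep the array sorted.
positions = np.searchsorted(np.array([1, 3, 5, 7]), np.array([4, 7]))

# packbits / unpackbits: eight 0/1 flags <-> one uint8 byte.
flags = np.array([1, 0, 0, 0, 0, 0, 0, 1], dtype=np.uint8)
packed = np.packbits(flags)
unpacked = np.unpackbits(packed)

# atleast_2d promotes a 1-D array to shape (1, n).
promoted_shape = tuple(np.atleast_2d(np.array([1, 2, 3])).shape)

# Prefix scan (running sum).
running = np.cumsum(np.array([1, 2, 3, 4]))

# take_along_axis: gather with per-row indices, here the sort order.
vals = np.array([[30, 10, 20]])
order = np.argsort(vals, axis=1)
sorted_vals = np.take_along_axis(vals, order, axis=1)

print(positions, packed, unpacked, promoted_shape, running, sorted_vals)
```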

Improvements

Don't run the resolution logic if the arrays have the same dtype by @magnatelee in #389
Set cuda virtual package as hard run requirement for gpu conda package by @m3vaz in #398
First pass mypy typing by @bryevdv in #387
Generalize Dict to Mapping for newer versions of mypy by @jjwilke in #405
Add support for using cupy in sort.py by @robinw0928 in #395
Refactor test.py by @bryevdv in #378
Use Numpy axis normalizations where possible by @bryevdv in #419
More mypy by @bryevdv in #413
adding bounds check for advanced indexing by @ipdemes in #397
Report Elapsed Time in cholesky's output by @SeyedMir in #423
Support -vv for more verbose test output by @bryevdv in #432
Add typing to runtime.py by @bryevdv in #428
Update compress/take tests for pytest by @bryevdv in #435
Project down to a 1D store for the scalar reduction output by @magnatelee in #455
Fallback to self = np.ndarray when necessary by @bryevdv in #431
Add types to thunk modules by @bryevdv in #438
allclose detail + misc tests improvements by @bryevdv in #457
cunumeric.random - Adding Module-scoped functions by @fduguet-nv in #481
Activate the NumPy fallback for cunumeric.random in CPU build by @magnatelee in #485
Legacy generators for cpu build by @magnatelee in #487
Allow CPU build to optionally use cuRAND by @magnatelee in #498
Sanitize shapes in ndarray's constructor by @magnatelee in #496
src/cunumeric/sort: stop using std::{inclusive, exclusive}_scan by @rohany in #499
Update conda requirements by @manopapad in #383
Handle dtype/casting/out properly in contractions by @manopapad in #402
Missing / overzealous check_eager_args calls by @manopapad in #465
Strengthen some types by @manopapad in #468

Bug Fixes

Add missing includes to aid intellisense providers by @trxcllnt in #382
Proper exception handling for cholesky by @magnatelee in #391
Fixes for building with setup.py outside conda, primarily Mac by @jjwilke in #394
Use the right API to check if the store is unbound by @magnatelee in #399
Fix nargs for report:dump-csv by @bryevdv in #400
Handle empty outputs correctly in advanced indexing task by @magnatelee in #396
Fall back to NumPy in array_function and array_ufunc by @magnatelee in #424
Fix for legate data interface by @magnatelee in #429
Fix test_floating.py test to call sys.exit by @marcinz in #433
Make missing pynvml an error for GPU tests by @bryevdv in #441
Make the NumPy fallback work correctly in randint by @magnatelee in #450
Squeeze fix by @magnatelee in #448
Correctly prune out empty tasks in binary reduction by @magnatelee in #453
Minor fix for indexing routines by @magnatelee in #452
Make DeferredArray.reshape always return a deferred array by @magnatelee in #454
Re-freezing conda compiler versions (#415) by @m3vaz in #449
Fix for floating point predicates by @magnatelee in #466
markdown version fix by @ipdemes in #459
Fixup typing regressions by @bryevdv in #471
Remove ill-defined advanced indexing test case by @magnatelee in #484
Handle empty inputs correctly in local scan tasks by @magnatelee in #491
Handle an unknown in a tuple correctly in reshape by @magnatelee in #490
fix mismatched size_t/uint64_t types by @jjwilke in #475
Allow scalar cunumeric ndarrays as array indices by @manopapad in #479

Documentation

adding new version for documentations by @ipdemes in #447
Updates to api_compare.py by @bryevdv in #456
Be stricter applying CuWrapperMetadata by @bryevdv in #463
Add custom nitpicky ref checks for cunumeric APIs by @bryevdv in #462
Docs coverage check by @bryevdv in #469
Fix the API reference for random functions and scan operators by @magnatelee in #497

New Contributors

@jjwilke made their first contribution in #394
@SeyedMir made their first contribution in #423
@fduguet-nv made their first contribution in #254
@rkarim2 made their first contribution in #425
@rohany made their first contribution in #499

Full Changelog: v22.05.02...v22.08.00

Contributors

jjwilke, trxcllnt, and 13 other contributors

21 Jun 10:52

marcinz

v22.05.02

8b163e6

v22.05.02

This hotfix release fixes issues in conda recipes.

What's Changed

Cherry pick: Update conda requirements (#383) by @marcinz in #406
Cherry pick: Set cuda virtual package as hard run requirement for conda gpu package (#398) by @marcinz in #407
Cherry pick: Fix nargs for report:dump-csv (#400) by @marcinz in #408
Re-freezing conda compiler versions by @m3vaz in #415

Full Changelog: v22.05.01...v22.05.02

Contributors

marcinz and m3vaz
