Header-only runtime API wrappers, split NVTX wrappers, etc.

@eyalroz eyalroz released this 14 Oct 14:14
· 521 commits to master since this release

Main changes since 0.3.3:

  • The runtime API wrappers are now a header-only library.
  • Split the NVTX wrappers and the Runtime API wrappers into two separate libraries.
  • Added several fundamental types which were implicit in previous versions: cuda::size_t, cuda::dimensionality_t.

Minor API tweaks:

  • Renamed launch -> enqueue_launch
  • Can now schedule managed memory region attachment on streams
  • Now wrapping cudaMemAdvise() advice.
  • Array copying uses typed pointers
  • Added: A cuda::managed::device_side_pointer_for() standalone function
  • Added: A container facade for the sequence of all devices, so you can now write for (auto device : cuda::devices() ) { }.
  • De-templatized: device setter RAII class
  • Added: a freestanding cuda::synchronize() function instead of some wrapper methods
  • Moved some type definitions from inside device_t to the device:: namespace
  • Added: A subclass of memory::region_t for managed memory
  • Using memory::region_t in more API functions
  • Dropped cuda::kernel::maximum_dynamic_shared_memory_per_block().
  • Centralized the definitions of take_ownership and do_not_take_ownership
  • Made stream_t& parameters into const stream_t&, almost universally.
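
To illustrate the "container facade" idea from the list above, here is a minimal, self-contained sketch of a range that can be iterated with a range-based for loop without storing its elements. It does not use the library itself: the `sketch` namespace, the `device` struct, the `all_devices` class and the `device_count` parameter are all hypothetical stand-ins, and the real cuda::devices() may well be implemented differently. A real wrapper would presumably obtain the count from the CUDA runtime (for example via cudaGetDeviceCount()) rather than take it as an argument.

```cpp
#include <cstddef>
#include <iterator>

namespace sketch {

// Hypothetical stand-in for a device wrapper; it only carries an id.
struct device {
    int id;
};

// Hypothetical facade: looks like a read-only container of devices,
// but produces each element on dereference instead of storing it.
class all_devices {
public:
    class iterator {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type        = device;
        using difference_type   = std::ptrdiff_t;
        using pointer           = const device*;
        using reference         = device;

        explicit iterator(int index) : index_(index) {}

        device operator*() const { return device{index_}; }
        iterator& operator++() { ++index_; return *this; }
        bool operator==(const iterator& other) const { return index_ == other.index_; }
        bool operator!=(const iterator& other) const { return index_ != other.index_; }

    private:
        int index_;
    };

    explicit all_devices(int count) : count_(count) {}

    iterator begin() const { return iterator(0); }
    iterator end() const { return iterator(count_); }
    std::size_t size() const { return static_cast<std::size_t>(count_); }

private:
    int count_;
};

// Assumption: the count is passed in so the sketch needs no CUDA toolkit.
inline all_devices devices(int device_count) { return all_devices(device_count); }

// Usage: the same loop shape as in the release note.
inline int sum_of_device_ids(int device_count) {
    int sum = 0;
    for (auto dev : devices(device_count)) {
        sum += dev.id;
    }
    return sum;
}

} // namespace sketch
```

Because the facade holds only a count, constructing it is cheap and it can be returned by value, which is what makes the one-line loop form convenient.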

Bug fixes:

  • Cross-device waiting on events
  • Error message fixes
  • Not assuming the uintNN_t types are in the default namespace

Build, compatibility, usability:

  • Fix support for CMake 3.8 (CMakeLists.txt was using some post-3.8 features)
  • Clang-related:
    • Skipping examples which clang++ doesn't support yet (need
    • Only enabling separable compilation and CUDA
    • const-cast'ing const void * kernel function pointers before reinterpretation (clang won't allow the reinterpretation otherwise)
    • GNU extension dropped when compiling examples with CUDA (clang doesn't support this)
    • Fixed std::max() call issue
  • CMake targets depending on the wrappers should now have a C++11 language standard requirement for compilation
  • The wrappers now assert that C++11 or later is used, instead of letting compilation fail somewhere less obvious.
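
Two of the items above can be sketched in plain C++ without the library. The first part shows one possible way to check the language standard up front; whether the wrappers use a preprocessor check like this or a static_assert is not stated in these notes, so treat it as an assumption. The second part shows the const-cast-then-reinterpret pattern for a type-erased function pointer, which is the shape of the clang fix described above. All names in the `sketch` namespace are hypothetical, and converting between object and function pointers is only conditionally supported by the standard, although common compilers accept it.

```cpp
// Early, readable failure when the compiler is not in C++11 mode or later.
// Assumption: a preprocessor check, which also works on compilers that
// predate static_assert. (Some compilers need a flag to report an
// accurate __cplusplus value.)
#if __cplusplus < 201103L
#error "C++11 or later is required"
#endif

namespace sketch {

using unary_fn = int (*)(int);

inline int add_one(int x) { return x + 1; }

// Erase the function's type behind a const void *, the form in which
// kernel function pointers are described as being passed around.
inline const void* erase(unary_fn f) {
    return reinterpret_cast<const void*>(f);
}

// Recover the typed pointer. The constness is removed explicitly first,
// because a single reinterpret_cast from const void * to a function
// pointer is rejected by clang as casting away qualifiers.
inline int call_erased(const void* erased, int arg) {
    void* non_const = const_cast<void*>(erased);
    unary_fn f = reinterpret_cast<unary_fn>(non_const);
    return f(arg);
}

} // namespace sketch
```

Splitting the conversion into two casts keeps the qualifier removal visible at the call site instead of hiding it in a C-style cast.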