hip_cuda_examples

NVIDIA CUDA and AMD HIP implementations of matrix multiplication, vector addition, reduction, and layernorm kernels. Each kernel comes in variants for several data types: fp64, fp32, fp16 (half), and half2.
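
As a rough illustration of what one kernel covering more than one data type can look like, here is a minimal sketch of a templated vector add in CUDA for fp32 and fp64. It is not taken from this repository: the names `vector_add_kernel`, `vector_add`, and `check` are hypothetical, and the block size of 256 is an arbitrary choice.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Element-wise c[i] = a[i] + b[i]. T is assumed to be float (fp32) or double (fp64).
template <typename T>
__global__ void vector_add_kernel(const T* a, const T* b, T* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard the last, partially filled block
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: turn any CUDA error into a C++ exception.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
}

// Hypothetical host wrapper: copies inputs to the device, launches the kernel,
// and copies the result back. Device buffers are not freed if a call throws;
// that is acceptable for a sketch, not for production code.
template <typename T>
std::vector<T> vector_add(const std::vector<T>& a, const std::vector<T>& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("size mismatch");
    }
    const int n = static_cast<int>(a.size());
    std::vector<T> c(a.size());
    if (n == 0) {
        return c;
    }

    const size_t bytes = a.size() * sizeof(T);
    T* d_a = nullptr;
    T* d_b = nullptr;
    T* d_c = nullptr;
    check(cudaMalloc(&d_a, bytes));
    check(cudaMalloc(&d_b, bytes));
    check(cudaMalloc(&d_c, bytes));
    check(cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;  // arbitrary block size chosen for this sketch
    const int blocks = (n + threads - 1) / threads;
    vector_add_kernel<T><<<blocks, threads>>>(d_a, d_b, d_c, n);
    check(cudaGetLastError());

    // cudaMemcpy on the default stream waits for the kernel to finish.
    check(cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost));

    check(cudaFree(d_a));
    check(cudaFree(d_b));
    check(cudaFree(d_c));
    return c;
}
```

Calling `vector_add<float>` or `vector_add<double>` selects the fp32 or fp64 instantiation. A HIP version of this sketch would typically include `hip/hip_runtime.h` and use the `hip`-prefixed runtime calls (`hipMalloc`, `hipMemcpy`, `hipFree`) in place of the `cuda` ones; the kernel body itself would stay the same. The fp16 and half2 variants are not shown here, since they rely on the half-precision types and intrinsics rather than a plain `+`.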