- New York University
- NYC
- davidpissarra.com
Stars
An open-source, efficient deep learning framework/compiler, written in Python.
A list of awesome compiler projects and papers for tensor computation and deep learning.
The Tensor Algebra SuperOptimizer for Deep Learning
Efficient Triton Kernels for LLM Training
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
Open weights LLM from Google DeepMind.
FlashInfer: Kernel Library for LLM Serving
Development repository for the Triton language and compiler
A high-throughput and memory-efficient inference and serving engine for LLMs
Sparsity-aware deep learning inference runtime for CPUs
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
Universal LLM Deployment Engine with ML Compilation
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators