Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python · 354 stars · 42 forks · Updated Sep 11, 2024

google-deepmind / gemma

Open weights LLM from Google DeepMind.

Python · 2,531 stars · 323 forks · Updated Dec 23, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 1,609 stars · 160 forks · Updated Dec 24, 2024

triton-lang / triton

Development repository for the Triton language and compiler

C++ · 13,771 stars · 1,687 forks · Updated Dec 24, 2024

mlc-ai / dlight-bench

Python · 3 stars · 4 forks · Updated Nov 6, 2023

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ · 17,971 stars · 1,036 forks · Updated Dec 23, 2024

state-spaces / mamba

Mamba SSM architecture

Python · 13,591 stars · 1,161 forks · Updated Dec 6, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 32,374 stars · 4,937 forks · Updated Dec 23, 2024

neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs

Python · 3,058 stars · 176 forks · Updated Jul 19, 2024

tomaarsen / attention_sinks

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Python · 683 stars · 40 forks · Updated Apr 10, 2024

mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation

Python · 19,453 stars · 1,598 forks · Updated Dec 19, 2024

apache / tvm

Open deep learning compiler stack for CPU, GPU, and specialized accelerators

Python · 11,870 stars · 3,489 forks · Updated Dec 16, 2024

David Pissarra davidpissarra

Stars

apuaaChen / EVT_AE

HazyResearch / ThunderKittens

hidet-org / hidet

merrymercy / awesome-tensor-compilers

jiazhihao / TASO

hibagus / CUDA_Bench

lambda7xx / awesome-AI-system

linkedin / Liger-Kernel

mobiusml / gemlite

yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs

mirage-project / mirage

siboehm / SGEMM_CUDA

NVIDIA / CUDALibrarySamples

R100001 / Programming-Massively-Parallel-Processors

NVIDIA / cuda-checkpoint

hahnyuan / LLM-Viewer