Issues: mlc-ai/mlc-llm
#3094 [Question] Does MLC LLM support running large models on a single GPU with limited VRAM? (question) opened Jan 18, 2025 by fengyi012
#3093 [Question] In the output results of the attention_with_fused_qkv functions, some slice accuracies are abnormal (question) opened Jan 17, 2025 by ifndefendif
#3091 [Question] Android App Crash (question) opened Jan 16, 2025 by mhollis1980
#3088 [Question] Semantic description of different quantization methods (question) opened Jan 9, 2025 by phgcha
#3087 [Bug] Failed to compile mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC on Jetson AGX Orin (bug) opened Jan 9, 2025 by phgcha
#3086 [Question] How to change system_message at runtime (question) opened Jan 9, 2025 by phgcha
#3083 [Question] Huge memory usage on iOS device using Qwen2.5-3B; is this normal performance? (question) opened Jan 6, 2025 by ted1995
#3078 [Bug] Broken for Intel Macs since v0.15 (or earlier) (bug) opened Dec 31, 2024 by zxcat
#3073 [Bug] Cohere model (Aya) doesn't seem to produce the correct output (bug) opened Dec 21, 2024 by jhlee525
#3066 [Feature Request] Provide a C++ API (feature request) opened Dec 16, 2024 by tranlm
#3061 [Feature Request] Streamed [DONE] response in REST API should have token data (feature request) opened Dec 10, 2024 by TNT3530
#3057 [Bug] Infinite loop after generated token length nears context_window_size/prefill_chunk_size (bug) opened Dec 6, 2024 by gesanqiu
#3055 [Bug] Binary was created using {relax.Executable} but a loader of that name is not registered (bug) opened Dec 3, 2024 by LLIo6oH
#3054 [Bug] gemma-2-27b-it-q4f16_1-MLC outputs incorrect content (bug) opened Dec 1, 2024 by rankaiyx
#3053 [Bug] Still experiencing "Error: Using LLVM 19.1.3 with -mcpu=apple-latest is not valid in -mtriple=arm64-apple-macos, using default -mcpu=generic" (bug) opened Dec 1, 2024 by BuildBackBuehler
#3052 [Question] How to get runtime stats in serve mode? (question) opened Dec 1, 2024 by rankaiyx
#3050 [Feature Request] Embeddings support for iOS (feature request) opened Nov 29, 2024 by jondeandres
#3048 [Bug] Android Llama-3.2-3B-Instruct-q4f16_0-MLC init failed (bug) opened Nov 26, 2024 by tdd102
#3044 [Bug][iOS/Swift SDK] Multiple image inputs to vision models throw an error from TVM (bug) opened Nov 22, 2024 by Neet-Nestor
#3034 [Question] Does MLC LLM's MLCEngine have an API equivalent to llm.generate in vLLM or SGLang? (question) opened Nov 17, 2024 by pjyi2147
#3033 KV cache offloading to CPU RAM (feature request) opened Nov 17, 2024 by shahizat
#3031 [Feature Request] Add vision model flag to model record (feature request) opened Nov 16, 2024 by Neet-Nestor