# Evaluating the pre-quantized LLM models

The table below lists the checkpoint for each model, along with its WikiText perplexity (lower is better) and accuracy (%) on ARC-Challenge, HellaSwag, and MMLU:

| Model | Quantization | CKPT | WikiText (ppl) | ARC-C (%) | HellaSwag (%) | MMLU (%) |
|-------|--------------|------|----------------|-----------|---------------|----------|
| TinyLlaMA-1.1B-v1.0-Chat | W8A8 | ckpt | 15.5 | 31.9 | 59.2 | 25.0 |
| TinyLlaMA-1.1B-v1.0-Chat | W4A8 | ckpt | 17.1 | 32.3 | 57.0 | 25.5 |
| StableLM-2-1.6B | W8A8 | ckpt | 29.7 | 37.1 | 63.6 | 30.0 |
| StableLM-2-1.6B | W4A8 | ckpt | 33.6 | 35.6 | 60.5 | 24.1 |
| Gemma-2B | W8A8 | ckpt | 20.3 | 21.8 | 40.9 | 25.8 |
| Gemma-2B | W4A8 | ckpt | 21.4 | 23.0 | 38.9 | 25.6 |

## Running the evaluation

- Download the checkpoint from the `ckpt` link in the table above (a download sketch follows the command below).
- Run the evaluation harness:

```bash
# ${CKPT} is the path to the downloaded checkpoint; results are written to ${OUTPUT_DIR}.
CUDA_VISIBLE_DEVICES=0 python eval/harness_eval.py \
    --tasks "wikitext,arc_challenge,hellaswag,hendrycksTest*" \
    --mode custom --hf_path ${CKPT} --output_dir ${OUTPUT_DIR}
```
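
For the download step, here is a minimal sketch using the `huggingface_hub` Python API, assuming the `ckpt` links point to repositories on the Hugging Face Hub; the repo ID below is hypothetical, so substitute the one behind the matching link:

```python
from huggingface_hub import snapshot_download

# Hypothetical repo ID -- replace with the repository behind the matching "ckpt" link.
ckpt_dir = snapshot_download(repo_id="your-org/TinyLlaMA-1.1B-v1.0-Chat-W8A8")
print(ckpt_dir)  # pass this local path to harness_eval.py as ${CKPT}
```

Note that `hendrycksTest*` is quoted in the command above so the shell passes the pattern through unexpanded, letting the harness match it against the per-subject MMLU tasks.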