rebase #3

pglorio · 2024-11-05T23:10:27Z

No description provided.

…e#34395) * Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized. * Propagate the quantization state using a context manager * make fixup

FIX Broken repr of TorchAoConfig The __repr__ method references a non-existent self.kwargs. This is now fixed. There does not appear to be a uniform way of defining __repr__ for quantization configs. I copied the method as implemented for HQQ: https://github.com/huggingface/transformers/blob/e2ac16b28a0b8b900e136750309ca40c49d975c5/src/transformers/utils/quantization_config.py#L285-L287

* save/load sub-configs * nit forgot these * fix copies * move test to common * use dict for sub-configs * add load-save-laod test * clean up modeling check * oops this are correct keys * fix some tests, missed some composite configs * this model was missed

* DistillBERT is ExecuTorch compatible * [run_slow] distilbert * [run_slow] distilbert --------- Co-authored-by: Guang Yang <[email protected]>

Revert "Fix Whisper CI (huggingface#34541)" This reverts commit eb81144.

) * Fix assistant tokens when truncated * fix test * fix test * step

…4558) * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <[email protected]>

) * Changing __repr__ in torchao * small update * make style * small update * add LinearActivationQuantizedTensor * remove some cases * update imports & handle return None * update

fix-torch-interpolation-ci

…xtraction (huggingface#34450) * fix stablelm qkv_bias * fix stablelm qkv_bias and use_parallel_residual * remove original_model.config for stablelm gguf test

* docs: ko: convbert.md * Update _toctree.yml * feat: nmt draft

* gptqmodel Signed-off-by: jiqing-feng <[email protected]> * fix format Signed-off-by: jiqing-feng <[email protected]> * update readme Signed-off-by: jiqing-feng <[email protected]> * gptqmodel need use checkpoint_format (#1) * gptqmodel need use checkpoint_format * fix quantize * Update quantization_config.py * Update quantization_config.py * Update quantization_config.py --------- Co-authored-by: ZX-ModelCloud <[email protected]> Co-authored-by: Qubitium-ModelCloud <[email protected]> * Revert quantizer_gptq.py (#2) * revert quantizer_gptq.py change * pass **kwargs * limit gptqmodel and optimum version Signed-off-by: jiqing-feng <[email protected]> * fix format Signed-off-by: jiqing-feng <[email protected]> * fix warning Signed-off-by: jiqing-feng <[email protected]> * fix version check Signed-off-by: jiqing-feng <[email protected]> * revert unrelated changes Signed-off-by: jiqing-feng <[email protected]> * enable gptqmodel tests Signed-off-by: jiqing-feng <[email protected]> * fix requires gptq Signed-off-by: jiqing-feng <[email protected]> * Fix Transformer compat (#3) * revert quantizer_gptq.py change * pass **kwargs * add meta info * cleanup * cleanup * Update quantization_config.py * hf_select_quant_linear pass checkpoint_format and meta * fix GPTQTestCUDA * Update test_gptq.py * gptqmodel.hf_select_quant_linear() now does not select ExllamaV2 * cleanup * add backend * cleanup * cleanup * no need check exllama version * Update quantization_config.py * lower checkpoint_format and backend * check none * cleanup * Update quantization_config.py * fix self.use_exllama == False * spell * fix unittest * fix unittest --------- Co-authored-by: LRL <[email protected]> Co-authored-by: Qubitium-ModelCloud <[email protected]> * fix format Signed-off-by: jiqing-feng <[email protected]> * fix format again Signed-off-by: jiqing-feng <[email protected]> * update gptqmodel version (#6) * update gptqmodel version * update gptqmodel version * fix unit test (#5) * update gptqmodel version * update gptqmodel version * "not self.use_exllama" is not equivalent to "self.use_exllama==False" * fix unittest * update gptqmodel version * backend is loading_attibutes (huggingface#7) * fix format and tests Signed-off-by: jiqing-feng <[email protected]> * fix memory check Signed-off-by: jiqing-feng <[email protected]> * fix device mismatch Signed-off-by: jiqing-feng <[email protected]> * fix result check Signed-off-by: jiqing-feng <[email protected]> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <[email protected]> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <[email protected]> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <[email protected]> * update tests Signed-off-by: jiqing-feng <[email protected]> * review: update docs (huggingface#10) * review: update docs (huggingface#12) * review: update docs * fix typo * update tests for gptqmodel Signed-off-by: jiqing-feng <[email protected]> * update document (huggingface#9) * update overview.md * cleanup * Update overview.md * Update overview.md * Update overview.md * update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md --------- Co-authored-by: Qubitium-ModelCloud <[email protected]> * typo * doc note for asymmetric quant * typo with apple silicon(e) * typo for marlin * column name revert: review * doc rocm support * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <[email protected]> --------- Signed-off-by: jiqing-feng <[email protected]> Co-authored-by: LRL-ModelCloud <[email protected]> Co-authored-by: ZX-ModelCloud <[email protected]> Co-authored-by: Qubitium-ModelCloud <[email protected]> Co-authored-by: ZX-ModelCloud <[email protected]> Co-authored-by: LRL <[email protected]> Co-authored-by: Marc Sun <[email protected]> Co-authored-by: Mohamed Mekkouri <[email protected]> Co-authored-by: Steven Liu <[email protected]>

eljandoubi and others added 12 commits November 5, 2024 10:06

Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (huggingfac…

d0b1d8d

…e#34395) * Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized. * Propagate the quantization state using a context manager * make fixup

DistilBERT is ExecuTorch compatible (huggingface#34475)

663c851

* DistillBERT is ExecuTorch compatible * [run_slow] distilbert * [run_slow] distilbert --------- Co-authored-by: Guang Yang <[email protected]>

Remove unused test_dataset (huggingface#34516)

45b0c76

Revert "Fix Whisper CI" (huggingface#34605)

74d3824

Revert "Fix Whisper CI (huggingface#34541)" This reverts commit eb81144.

Fix huggingface#34494 assistant tokens when truncated (huggingface#34531

082e57e

) * Fix assistant tokens when truncated * fix test * fix test * step

Remove @slow for test_eager_matches_sdpa_inference (huggingface#3…

f2d5dfb

…4558) * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <[email protected]>

Changing __repr__ in torchao to show quantized Linear (huggingface#34202

d2bae7e

) * Changing __repr__ in torchao * small update * make style * small update * add LinearActivationQuantizedTensor * remove some cases * update imports & handle return None * update

Fix torchvision interpolation CI (huggingface#34539)

9f28d0c

fix-torch-interpolation-ci

Fix use_parallel_residual and qkv_bias for StableLM GGUF config e…

e83aaaa

…xtraction (huggingface#34450) * fix stablelm qkv_bias * fix stablelm qkv_bias and use_parallel_residual * remove original_model.config for stablelm gguf test

🌐 [i18n-KO] Translated convbert.md to Korean (huggingface#34599)

7bbc624

* docs: ko: convbert.md * Update _toctree.yml * feat: nmt draft

pglorio merged commit eb6063e into zamba2 Nov 5, 2024
21 of 34 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rebase #3

rebase #3

pglorio commented Nov 5, 2024

rebase #3

rebase #3

Conversation

pglorio commented Nov 5, 2024