[Misc] Add support for new autogptq checkpoint_format #3689
Conversation
Thanks, will review this.
Do you have example models in each format I can try?
To test marlin compat just change the
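For reference, a rough sketch of the quantize_config.json difference under discussion; the is_marlin_format and checkpoint_format field names come from the PR description below, while the other values (bits, group_size) are illustrative:

# Old-style config (autogptq before 0.8.0-dev): marlin flagged via a boolean.
old_quantize_config = {
    "bits": 4,
    "group_size": 128,
    "is_marlin_format": True,
}

# New-style config (autogptq >= 0.8.0-dev): format carried in checkpoint_format.
new_quantize_config = {
    "bits": 4,
    "group_size": 128,
    "checkpoint_format": "marlin",
}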
Each of the following works:

from vllm import LLM
model = LLM("LnL-AI/TinyLlama-1.1B-Chat-v1.0-GPTQ-Marlin-4bit")
model.generate("hello my name is")

from vllm import LLM
model = LLM("neuralmagic/TinyLlama-1.1B-Chat-v1.0-marlin")
model.generate("hello my name is")
@Qubitium Can you merge the PR I posted to your branch, which adds tests for this case?
added quantization tests
@robertgshaw2-neuralmagic merged
@Qubitium can you rerun |
nvm, saw your commit. Letting CI run, then will merge.
Co-authored-by: Robert Shaw <[email protected]>
Reason for PR: make sure loading of gptq marlin quants is compatible with latest autogptq.

Due to autogptq update AutoGPTQ/AutoGPTQ#603, quants made with autogptq >= 0.8.0-dev will use a new checkpoint_format property to hold marlin and future formats. Compat tested with both marlin and non-marlin models in both the old is_marlin_format and the new checkpoint_format styles.
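A minimal sketch of the compat logic this description implies, using a hypothetical resolve_checkpoint_format helper (not the actual vLLM function; the real loader may differ):

def resolve_checkpoint_format(quantize_config: dict) -> str:
    """Normalize old- and new-style autogptq configs to one format string.

    Hypothetical helper: prefers the new checkpoint_format key and falls
    back to the legacy is_marlin_format boolean.
    """
    fmt = quantize_config.get("checkpoint_format")
    if fmt is not None:
        return fmt  # e.g. "marlin" for autogptq >= 0.8.0-dev quants
    if quantize_config.get("is_marlin_format", False):
        return "marlin"  # legacy marlin flag
    return "gptq"  # plain (non-marlin) GPTQ checkpoint

# Both config styles resolve to the same format:
assert resolve_checkpoint_format({"checkpoint_format": "marlin"}) == "marlin"
assert resolve_checkpoint_format({"is_marlin_format": True}) == "marlin"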