Releases: ModelCloud/GPTQModel

GPTQModel v1.0.9

13 Oct 00:00
e6ac223

What's Changed

Fixed HF integration to work with the latest Transformers. Moved AutoRound to an optional dependency. Updated flaky CI tests.

Full Changelog: v1.0.8...v1.0.9

GPTQModel v1.0.8

11 Oct 05:00
7b53f5c

What's Changed

Moved QBits to an optional dependency. Added Python 3.12 wheels and fixed wheel generation for CUDA 11.8.

Full Changelog: v1.0.7...v1.0.8

GPTQModel v1.0.7

08 Oct 14:19
e208d38

What's Changed

Fixed the Marlin (faster) kernel not being auto-selected for some models, and fixed AutoRound quantization saves throwing JSON errors.

Full Changelog: v1.0.6...v1.0.7

GPTQModel v1.0.6

26 Sep 15:59
25e7313

What's Changed

Patch release to fix loading of quantized Llama 3.2 Vision models.

Full Changelog: v1.0.5...v1.0.6

GPTQModel v1.0.5

26 Sep 10:54
4921d68

What's Changed

Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 quantizes only the text layers (the layers responsible for text generation); vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantizes to ~50% of its original size in 4-bit mode. Once vision-layer support is added, the size will drop to the expected ~1/4.
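The text-only behavior above can be sketched as a name filter over a model's modules. This is an illustrative sketch only: the module names below are hypothetical stand-ins for a multimodal checkpoint layout, not the actual Llama 3.2 Vision paths, and the filter is not GPTQModel's internal implementation.

```python
# Hypothetical module names illustrating a multimodal checkpoint layout;
# the real Llama 3.2 Vision module paths may differ.
MODULES = [
    "language_model.layers.0.self_attn.q_proj",
    "language_model.layers.0.mlp.gate_proj",
    "vision_model.encoder.layers.0.self_attn.q_proj",
    "vision_model.encoder.layers.0.mlp.fc1",
]

def text_layers_only(names):
    """Keep only text-generation modules, mirroring v1.0.5's behavior
    of skipping the vision tower during quantization."""
    return [n for n in names if not n.startswith("vision_model.")]

quantized = text_layers_only(MODULES)
```

Because only the language tower's weights shrink to 4-bit while the vision tower stays in full precision, the overall checkpoint lands at roughly half size rather than the ~1/4 a fully quantized model would reach.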

Full Changelog: v1.0.4...v1.0.5

GPTQModel v1.0.4

26 Sep 04:26
cffee9a

What's Changed

Added Liger Kernel support for ~50% VRAM reduction during the quantization stage for some models. Added a toggle to disable parallel packing to avoid OOM on larger models. Transformers dependency updated to 4.45.0 for Llama 3.2 support.

Full Changelog: v1.0.3...v1.0.4

GPTQModel v1.0.3

19 Sep 06:36
44b9df7

What's Changed

New Contributors

Full Changelog: v1.0.2...v1.0.3

GPTQModel v1.0.2

17 Aug 01:44
182df2b

What's Changed

Upgraded the AutoRound package to v0.3.0. Pre-built wheels (whl) and a PyPI source release are now available. Install by downloading a pre-built wheel or by running pip install gptqmodel --no-build-isolation.

Full Changelog: v1.0.0...v1.0.2

GPTQModel v1.0.0

14 Aug 00:29
4a028d5

What's Changed

40% faster multi-threaded packing, a new lm_eval API, and fixed Python 3.9 compatibility.

Full Changelog: v0.9.11...v1.0.0

GPTQModel v0.9.11

09 Aug 10:33
f2fcdc8

What's Changed

Added LG EXAONE 3.0 model support. New dynamic per-layer/module flexible quantization, where each layer/module may use different bits/parameters. Added proper sharding support to backend.BITBLAS. Auto-heal quantization errors caused by small damp values.
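The per-layer/module "dynamic" idea above can be sketched as regex overrides layered on top of a global default. This is a conceptual sketch, not GPTQModel's actual QuantizeConfig API: the field names, patterns, and matching rules here are assumptions for illustration.

```python
import re

# Global default quantization params (illustrative).
DEFAULT = {"bits": 4, "group_size": 128}

# Hypothetical per-module overrides: first matching regex wins.
DYNAMIC = {
    r".*\.mlp\..*": {"bits": 8},       # MLP modules quantized at 8 bits
    r".*\.lm_head$": {"bits": 16},     # keep lm_head in higher precision
}

def resolve(module_name, default=DEFAULT, dynamic=DYNAMIC):
    """Return the quantization params for one module: the first
    matching regex override is merged over the global default."""
    for pattern, override in dynamic.items():
        if re.fullmatch(pattern, module_name):
            return {**default, **override}
    return dict(default)
```

With a scheme like this, attention projections fall through to the 4-bit default while MLP modules resolve to 8 bits, which is the kind of per-module flexibility the release describes.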

Full Changelog: v0.9.10...v0.9.11