v2.8.1

@KodiaqQ KodiaqQ released this 09 Feb 09:45
· 2296 commits to develop since this release

Post-training Quantization:

Bugfixes:

  • (Common) Fixed issue with nncf.compress_weights() to avoid overflows on 32-bit Windows systems.
  • (Common) Fixed performance issue with nncf.compress_weights() on Llama models.
  • (Common) Fixed the nncf.quantize_with_accuracy_control() pipeline when the tune_hyperparams=True option is enabled.
  • (OpenVINO) Fixed issue with stateful LLM models and added restoring of the model state after inference.
  • (PyTorch) Fixed issue with nncf.compress_weights() for LLM models where is_floating_point was executed during tracing.
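
The note on 32-bit Windows overflows does not say where in nncf.compress_weights() the overflow occurred. As a generic illustration only, and not a description of the actual NNCF fix, the sketch below shows how an element count accumulated in a signed 32-bit integer can wrap around for a large weight tensor, while an arbitrary-precision count stays correct. The function names and the tensor shape are hypothetical and are not part of NNCF.

```python
import ctypes
from math import prod


def count_elements_int32(shape):
    """Multiply the dimensions, truncating to a signed 32-bit integer after each step.

    This mimics arithmetic on a platform whose default integer is 32 bits wide.
    """
    total = 1
    for dim in shape:
        # ctypes integer types perform no overflow checking; the value is truncated.
        total = ctypes.c_int32(total * dim).value
    return total


def count_elements_exact(shape):
    """Multiply the dimensions using Python's arbitrary-precision integers."""
    return prod(shape)


# Hypothetical shape, chosen only so that the product exceeds 2**31 - 1.
shape = (32000, 4096, 32)

exact = count_elements_exact(shape)
wrapped = count_elements_int32(shape)

print("exact count:  ", exact)
print("32-bit count: ", wrapped)
print("overflowed:   ", exact != wrapped)
```

A common remedy for this class of bug is to request a 64-bit integer type explicitly wherever sizes are accumulated, rather than relying on the platform default.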