Issues: intel/neural-compressor
- How to set the pruned weight blocks as a same learnable value? (#1361, opened Oct 30, 2023 by hobbitlzy)
- Quantized Neural compress model not generating expected results in AMD processor (#1531, opened Jan 10, 2024 by Bhuvaneswaran-R)
- AWQ fails on ONNX model when a MatMul node's input is a model input/initializer (#1571, opened Jan 25, 2024 by jstoecker)
- PostTrainingQuantConfig(quant_level='auto', device='npu', backend="onnxrt_dml_ep") produces fp32 ops. (#1580, opened Jan 26, 2024 by kleiti)
- How to quantify google/vit-base-patch16-224 pytorch_model.bin to int8 type with neural-compressor (#1612, opened Feb 19, 2024 by yingmuying)
- neural_compressor/adaptor/ox_utils/quantizer.py dfs crash during "basic" tuning (#1621, opened Feb 22, 2024 by kmn1024)
- 'q_config' is needed when export an INT8 model (#1736, opened Apr 18, 2024 by ZhangShuoAlreadyExists) [label: aitce]
- how to extract int8 weights from quantized model (#1817, opened May 25, 2024 by chensterliu) [label: aitce]