
Update the main page examples (#1670)
Signed-off-by: yiliu30 <[email protected]>
yiliu30 authored Mar 18, 2024
1 parent 94013d2 commit b0c2a82
Showing 1 changed file (README.md) with 25 additions and 15 deletions.
````diff
@@ -35,29 +35,39 @@ pip install neural-compressor
 > More installation methods can be found at [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Please check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
 ## Getting Started
 ### Quantization with Python API
-
-```shell
-# Install Intel Neural Compressor and TensorFlow
-pip install neural-compressor
-pip install tensorflow
-# Prepare fp32 model
-wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb
+Setting up the environment:
+```bash
+pip install "neural-compressor>=2.3" "transformers>=4.34.0" torch torchvision
 ```
 After successfully installing these packages, try your first quantization program.
 
+### Weight-Only Quantization (LLMs)
 ```python
-from neural_compressor.data import DataLoader, Datasets
+from transformers import AutoModel
+
 from neural_compressor.config import PostTrainingQuantConfig
 from neural_compressor.quantization import fit
 
-dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))
-dataloader = DataLoader(framework="tensorflow", dataset=dataset)
+float_model = AutoModel.from_pretrained("mistralai/Mistral-7B-v0.1")
+woq_conf = PostTrainingQuantConfig(approach="weight_only")
+quantized_model = fit(model=float_model, conf=woq_conf)
+```
+
+### Static Quantization (Non-LLMs)
+
+```python
+from torchvision import models
+
+from neural_compressor.config import PostTrainingQuantConfig
+from neural_compressor.data import DataLoader, Datasets
+from neural_compressor.quantization import fit
+
-q_model = fit(
-    model="./mobilenet_v1_1.0_224_frozen.pb",
-    conf=PostTrainingQuantConfig(),
-    calib_dataloader=dataloader,
-)
+float_model = models.resnet18()
+dataset = Datasets("pytorch")["dummy"](shape=(1, 3, 224, 224))
+calib_dataloader = DataLoader(framework="pytorch", dataset=dataset)
+static_quant_conf = PostTrainingQuantConfig()
+quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloader=calib_dataloader)
 ```
````
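For intuition about what the two new recipes do: weight-only quantization maps just the model weights to low-precision integer codes, while static quantization additionally fixes activation scales from the calibration data supplied via `calib_dataloader`. A minimal plain-Python sketch of the symmetric int8 scheme both build on (illustrative only, not Intel Neural Compressor's implementation; `quantize_int8` and `dequantize` are hypothetical helpers):

```python
def quantize_int8(values):
    """Symmetric int8 quantization: one scale per tensor,
    derived from the largest absolute value observed."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return [qi * scale for qi in q]


weights = [0.4, -1.27, 0.02, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value lies within half a quantization step (scale / 2)
# of the original weight.
```

For weights this rounding can be done offline once; for activations the value range is only known at runtime, which is why static quantization estimates it ahead of time from a calibration dataloader.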

## Documentation
