The user-facing APIs changed between v1.1 and v1.2. The major changes are:
- v1.2 abstracts the neural_compressor.common.Model concept to cover cases where the weight and graph files are stored separately.
- v1.2 unifies the calling style by setting the model, calibration dataloader, evaluation dataloader, and metric through quantizer attributes rather than passing them as function inputs.
Refer to the examples below for details.
# user facing API example in v1.1
quantizer = Quantization("/path/to/user.yaml")
ds = dataset("/path/to/dataset")
dataloader = quantizer.dataloader(ds, batch_size=100)
quantizer.metric("metric", metric)
q_model = quantizer(
    "/path/to/model",
    q_dataloader=dataloader,
    eval_dataloader=dataloader,
)
...  # the user writes framework-specific code to save q_model
# user facing API example in v1.2
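from neural_compressor.experimental import Quantization, common  # assumed import path for this API style
quantizer = Quantization("conf.yaml")  # the YAML path must be passed as a string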
quantizer.model = "/path/to/model"
dl = dataset("/path/to/dataset")
quantizer.calib_dataloader = common.DataLoader(dl, batch_size=32)
quantizer.eval_dataloader = common.DataLoader(dl, batch_size=32)
quantizer.metric = common.Metric(custom_metric)
q_model = quantizer.fit()
q_model.save("/path/to/output/dir")  # explicitly save q_model
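The custom_metric passed to common.Metric in the v1.2 example is not defined there. As a rough sketch only, and assuming the library expects a metric class that exposes update, reset, and result methods, a user-defined metric might look like the following. The class name Top1Accuracy and the input shapes are hypothetical and chosen for illustration; the block itself is plain Python and does not import Neural Compressor.

```python
class Top1Accuracy:
    """Hypothetical custom metric: share of samples whose highest-scoring class equals the label."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, labels):
        # preds: sequence of per-class score sequences (assumed shape)
        # labels: sequence of integer class ids (assumed shape)
        for scores, label in zip(preds, labels):
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            self.correct += int(predicted == label)
            self.total += 1

    def reset(self):
        # Clear the accumulated counts before a new evaluation pass.
        self.correct = 0
        self.total = 0

    def result(self):
        # Return 0.0 rather than dividing by zero when nothing was seen.
        return self.correct / self.total if self.total else 0.0


# Stand-alone usage: two samples, only the first is classified correctly.
metric = Top1Accuracy()
metric.update([[0.1, 0.9], [0.8, 0.2]], [1, 1])
accuracy = metric.result()
```

Under that assumption, a class like this would take the place of custom_metric in the v1.2 example. Keeping the accumulation in update and the final computation in result lets the same object be reused across evaluation passes after a reset.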
v1.2 refines the Neural Compressor built-in transforms, datasets, and metrics to unify the APIs across different framework backends.
Refer to the dataset, transform, and metric documentation to learn how to use them in YAML or code.
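For the code route, the dataset object handed to common.DataLoader has to come from somewhere. The sketch below assumes, without confirmation from this page, that a map-style object implementing __len__ and __getitem__ is acceptable. InMemoryDataset and the batches helper are hypothetical names; batches is only a stand-in that shows how such an object can be consumed in fixed-size batches, not the library's dataloader.

```python
class InMemoryDataset:
    """Hypothetical map-style dataset holding (sample, label) pairs in memory."""

    def __init__(self, samples, labels):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = list(samples)
        self.labels = list(labels)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index], self.labels[index]


def batches(dataset, batch_size):
    """Stand-in for a dataloader: yield lists of (sample, label) pairs."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


# Three samples split into batches of two; the last batch is smaller.
ds = InMemoryDataset([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]], [1, 0, 1])
batch_sizes = [len(batch) for batch in batches(ds, batch_size=2)]
```

If the assumption holds, an object like ds would play the role of dl in the v1.2 example, with the batch size supplied to the dataloader rather than baked into the dataset.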