
[Nano] OpenVINO API for Async Inference #5507

Closed
zhentaocc opened this issue Aug 23, 2022 · 5 comments

@zhentaocc (Contributor) commented Aug 23, 2022

Problem

Nano has enabled OpenVINOModel.forward in sync mode. For example:

from bigdl.nano.openvino import OpenVINOModel
ov_model = OpenVINOModel("model.xml")
for img in dataset:
    out = ov_model(img)

This returns results batch by batch. Async mode (see https://docs.openvino.ai/latest/openvino_docs_OV_UG_Python_API_exclusives.html#asyncinferqueue) is not yet enabled for Nano. With async inference, we can keep multiple requests in flight and obtain better throughput.
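For reference, the AsyncInferQueue pattern works roughly like this. This is a minimal, framework-free sketch: `FakeInferQueue` and the doubling `fake_infer` "model" are hypothetical stand-ins for OpenVINO's `AsyncInferQueue` and a compiled model, mimicked here with a thread pool so the callback/`start_async`/`wait_all` flow is visible without an OpenVINO install.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Stand-in for a compiled model; real code would run an OpenVINO InferRequest.
def fake_infer(batch):
    return [x * 2 for x in batch]

class FakeInferQueue:
    """Mimics the AsyncInferQueue contract: start_async() submits a request,
    a user callback collects each result, wait_all() blocks until done."""
    def __init__(self, infer_fn, jobs=4):
        self._infer = infer_fn
        self._pool = ThreadPoolExecutor(max_workers=jobs)
        self._futures = []
        self._callback = None

    def set_callback(self, fn):
        self._callback = fn

    def start_async(self, inputs, userdata=None):
        fut = self._pool.submit(self._infer, inputs)
        # Invoke the user callback with (result, userdata) when the job finishes.
        fut.add_done_callback(lambda f, u=userdata: self._callback(f.result(), u))
        self._futures.append(fut)

    def wait_all(self):
        wait(self._futures)
        self._pool.shutdown(wait=True)

dataset = [[1, 2], [3, 4], [5, 6]]
results = [None] * len(dataset)
queue = FakeInferQueue(fake_infer, jobs=2)
# userdata carries the batch index so results land in input order.
queue.set_callback(lambda out, i: results.__setitem__(i, out))
for i, batch in enumerate(dataset):
    queue.start_async(batch, userdata=i)
queue.wait_all()
print(results)  # [[2, 4], [6, 8], [10, 12]]
```

The key point for the API design below is that the caller hands over a whole dataset and gets an ordered list of results back, rather than driving one request at a time.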

API Design

For raw OpenVINO:

from bigdl.nano.openvino import OpenVINOModel
ov_model = OpenVINOModel("model.xml")
ov_model.predict(dataset)  # dataset: np.ndarray

OpenVINOModel.predict(...) will be the function that enables the async inference loop.

For PyTorch OpenVINO:

from bigdl.nano.pytorch.trainer import Trainer
trainer = Trainer()
ov_model = Trainer.trace(pytorch_model, accelerator='openvino')
trainer.predict(ov_model, dataloader)

Trainer.predict() will be overridden to run the async inference loop.
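One possible shape of that override is sketched below. This is hypothetical: the real Nano `Trainer` subclasses `pytorch_lightning.Trainer`, which is elided here, and `async_predict` stands in for the actual async loop, with a doubling operation in place of real inference.

```python
class OpenVINOModel:
    """Stand-in for Nano's OpenVINO wrapper (hypothetical sketch)."""
    def async_predict(self, dataloader):
        # The real implementation would feed batches into an AsyncInferQueue
        # and gather results via callbacks; here each batch is just doubled.
        return [[x * 2 for x in batch] for batch in dataloader]

class Trainer:
    """Stand-in for bigdl.nano.pytorch.trainer.Trainer (hypothetical sketch)."""
    def predict(self, model, dataloader):
        # Route OpenVINO-traced models to the async loop; everything else
        # would fall back to the normal pytorch_lightning.Trainer.predict.
        if isinstance(model, OpenVINOModel):
            return model.async_predict(dataloader)
        raise NotImplementedError("fall back to pytorch_lightning.Trainer.predict")

trainer = Trainer()
print(trainer.predict(OpenVINOModel(), [[1, 2], [3, 4]]))  # [[2, 4], [6, 8]]
```

The design point is that PyTorch users keep calling `trainer.predict(...)` as usual, and the dispatch to async inference happens inside the override.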

Tasks

assigned to @hjzin

  1. enable async API for raw OpenVINO
  2. enable async API for PyTorch OpenVINO
  3. modify [Nano] Openvino quantization notebooks with nano #5491
@jason-dai (Contributor)

Why do we need two different sets of APIs? Make it consistent for this specific feature.

@zhentaocc (Contributor, Author)

Why do we need two different sets of APIs? Make it consistent for this specific feature.

Because they target different frameworks, actually.
There is no Trainer for raw OpenVINO, since we don't require a PyTorch Lightning installation in that case.
For pytorch-lightning:

trainer.predict(ov_model, dataloader)

For Keras:

ov_model.predict(dataloader)

For OpenVINO without pytorch/tf:

ov_model.predict(data)

They seem to be kind of consistent.

@jason-dai (Contributor)

I think we can then have all cases use model.predict, no? And model.predict can support both sync and async modes.

@zhentaocc (Contributor, Author)

I think we can then have all cases use model.predict, no?

Sure, we can do this. This will introduce a new API for PyTorch users, model.predict(...). Shall we make trainer.predict the same as model.predict(...)? Users might be confused by two APIs with the same name.

And model.predict can support both sync and async modes.

Is it necessary to support sync mode? It seems that async mode always obtains better performance in this case.

@jason-dai (Contributor)

I think we can then have all cases use model.predict, no?

Sure, we can do this. This will introduce a new API for PyTorch users, model.predict(...). Shall we make trainer.predict the same as model.predict(...)? Users might be confused by two APIs with the same name.

And model.predict can support both sync and async modes.

Is it necessary to support sync mode? It seems that async mode always obtains better performance in this case.

  1. No need to add or update trainer.predict
  2. Change to model.async_predict?
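The resolution suggested above, keeping the existing sync call path and adding a separate model.async_predict, might look like the following. This is a hypothetical sketch of the API shape only, with doubling as a stand-in for real inference; it is not a shipped BigDL Nano API.

```python
class OpenVINOModel:
    """Hypothetical API shape: sync __call__ per batch, async_predict per dataset."""
    def __call__(self, batch):
        # Sync path: one inference request per call, as Nano already supports.
        return [x * 2 for x in batch]  # stand-in for real inference

    def async_predict(self, batches, jobs=4):
        # Async path: the real implementation would keep `jobs` infer requests
        # in flight; this sketch only preserves the contract that results
        # come back in input order for the whole dataset.
        return [self(b) for b in batches]

m = OpenVINOModel()
print(m([1, 2]))                       # [2, 4]
print(m.async_predict([[1, 2], [3]]))  # [[2, 4], [6]]
```

Separating the names sidesteps the trainer.predict / model.predict naming clash raised earlier while leaving the sync path untouched.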

@hjzin hjzin closed this as completed Sep 19, 2022