
[Nano] OpenVINO API for Async Inference #5507

Closed
zhentaocc opened this issue Aug 23, 2022 · 5 comments

@zhentaocc (Contributor) commented Aug 23, 2022

Problem

Nano has enabled OpenVINOModel.forward in sync mode. For example:

from bigdl.nano.openvino import OpenVINOModel
ov_model = OpenVINOModel("model.xml")
for img in dataset:
    out = ov_model(img)

This returns results batch by batch. Async mode (see https://docs.openvino.ai/latest/openvino_docs_OV_UG_Python_API_exclusives.html#asyncinferqueue) is not yet enabled for Nano. With async inference, we can keep multiple requests in flight and obtain better throughput.
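For reference, the AsyncInferQueue pattern works roughly like this. This is a minimal, framework-free sketch: `FakeInferQueue` and the doubling `fake_infer` "model" are hypothetical stand-ins for OpenVINO's `AsyncInferQueue` and a compiled model, mimicked here with a thread pool so the callback/`start_async`/`wait_all` flow is visible without an OpenVINO install.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Stand-in for a compiled model; real code would run an OpenVINO InferRequest.
def fake_infer(batch):
    return [x * 2 for x in batch]

class FakeInferQueue:
    """Mimics the AsyncInferQueue contract: start_async() submits a request,
    a user callback collects each result, wait_all() blocks until done."""
    def __init__(self, infer_fn, jobs=4):
        self._infer = infer_fn
        self._pool = ThreadPoolExecutor(max_workers=jobs)
        self._futures = []
        self._callback = None

    def set_callback(self, fn):
        self._callback = fn

    def start_async(self, inputs, userdata=None):
        fut = self._pool.submit(self._infer, inputs)
        # Invoke the user callback with (result, userdata) when the job finishes.
        fut.add_done_callback(lambda f, u=userdata: self._callback(f.result(), u))
        self._futures.append(fut)

    def wait_all(self):
        wait(self._futures)
        self._pool.shutdown(wait=True)

dataset = [[1, 2], [3, 4], [5, 6]]
results = [None] * len(dataset)
queue = FakeInferQueue(fake_infer, jobs=2)
# userdata carries the batch index so results land in input order.
queue.set_callback(lambda out, i: results.__setitem__(i, out))
for i, batch in enumerate(dataset):
    queue.start_async(batch, userdata=i)
queue.wait_all()
print(results)  # [[2, 4], [6, 8], [10, 12]]
```

The key point for the API design below is that the caller hands over a whole dataset and gets an ordered list of results back, rather than driving one request at a time.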

API Design

For raw OpenVINO:

from bigdl.nano.openvino import OpenVINOModel
ov_model = OpenVINOModel("model.xml")
ov_model.predict(dataset)  # dataset: np.ndarray

OpenVINOModel.predict(...) will be the function that enables the async inference loop.

For PyTorch OpenVINO:

from bigdl.nano.pytorch.trainer import Trainer
trainer = Trainer()
ov_model = Trainer.trace(pytorch_model, accelerator='openvino')
trainer.predict(ov_model, dataloader)

Trainer.predict() will be overridden to run the async inference loop.
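One possible shape of that override is sketched below. This is hypothetical: the real Nano `Trainer` subclasses `pytorch_lightning.Trainer`, which is elided here, and `async_predict` stands in for the actual async loop, with a doubling operation in place of real inference.

```python
class OpenVINOModel:
    """Stand-in for Nano's OpenVINO wrapper (hypothetical sketch)."""
    def async_predict(self, dataloader):
        # The real implementation would feed batches into an AsyncInferQueue
        # and gather results via callbacks; here each batch is just doubled.
        return [[x * 2 for x in batch] for batch in dataloader]

class Trainer:
    """Stand-in for bigdl.nano.pytorch.trainer.Trainer (hypothetical sketch)."""
    def predict(self, model, dataloader):
        # Route OpenVINO-traced models to the async loop; everything else
        # would fall back to the normal pytorch_lightning.Trainer.predict.
        if isinstance(model, OpenVINOModel):
            return model.async_predict(dataloader)
        raise NotImplementedError("fall back to pytorch_lightning.Trainer.predict")

trainer = Trainer()
print(trainer.predict(OpenVINOModel(), [[1, 2], [3, 4]]))  # [[2, 4], [6, 8]]
```

The design point is that PyTorch users keep calling `trainer.predict(...)` as usual, and the dispatch to async inference happens inside the override.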

Tasks

assigned to @hjzin

  1. enable async API for raw OpenVINO
  2. enable async API for PyTorch OpenVINO
  3. modify [Nano] Openvino quantization notebooks with nano #5491
@jason-dai (Contributor)

Why do we need two different sets of APIs? Make it consistent for this specific feature.

@zhentaocc (Contributor, Author)

Why do we need two different sets of APIs? Make it consistent for this specific feature.

Because they target different frameworks, actually.
There is no Trainer for raw OpenVINO, since we don't require a PyTorch Lightning installation in that case.
For pytorch-lightning:

trainer.predict(ov_model, dataloader)

For Keras:

ov_model.predict(dataloader)

For OpenVINO without pytorch/tf:

ov_model.predict(data)

They seem to be kind of consistent.

@jason-dai (Contributor)

I think we can then have all cases use model.predict, no? And model.predict can support both sync and async modes.

@zhentaocc (Contributor, Author)

I think we can then have all cases use model.predict, no?

Sure, we can do this. This will introduce a new API for PyTorch users, model.predict(...). Shall we make trainer.predict the same as model.predict(...)? Users might be confused by two APIs with the same name.

And model.predict can support both sync and async modes.

Is it necessary to support sync mode? It seems that async mode always obtains better performance in this case.

@jason-dai (Contributor)

I think we can then have all cases use model.predict, no?

Sure, we can do this. This will introduce a new API for PyTorch users, model.predict(...). Shall we make trainer.predict the same as model.predict(...)? Users might be confused by two APIs with the same name.

And model.predict can support both sync and async modes.

Is it necessary to support sync mode? It seems that async mode always obtains better performance in this case.

  1. No need to add or update trainer.predict
  2. Change to model.async_predict?
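The resolution suggested above, keeping the existing sync call path and adding a separate model.async_predict, might look like the following. This is a hypothetical sketch of the API shape only, with doubling as a stand-in for real inference; it is not a shipped BigDL Nano API.

```python
class OpenVINOModel:
    """Hypothetical API shape: sync __call__ per batch, async_predict per dataset."""
    def __call__(self, batch):
        # Sync path: one inference request per call, as Nano already supports.
        return [x * 2 for x in batch]  # stand-in for real inference

    def async_predict(self, batches, jobs=4):
        # Async path: the real implementation would keep `jobs` infer requests
        # in flight; this sketch only preserves the contract that results
        # come back in input order for the whole dataset.
        return [self(b) for b in batches]

m = OpenVINOModel()
print(m([1, 2]))                       # [2, 4]
print(m.async_predict([[1, 2], [3]]))  # [[2, 4], [6]]
```

Separating the names sidesteps the trainer.predict / model.predict naming clash raised earlier while leaving the sync path untouched.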

@hjzin hjzin closed this as completed Sep 19, 2022