diff --git a/docs/source/mixed_precision.md b/docs/source/mixed_precision.md
index fa134e6c0b9..eab37d62504 100644
--- a/docs/source/mixed_precision.md
+++ b/docs/source/mixed_precision.md
@@ -20,14 +20,99 @@ The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Coope
 
 ## Mixed Precision Support Matrix
 
-|Framework |BF16 |FP16 |
-|--------------|:-----------:|:-----------:|
-|TensorFlow |✔ |:x: |
-|PyTorch |✔ |:x: |
-|ONNX Runtime |✔ |✔ |
-|MXNet |✔ |:x: |
-
-> **During quantization, BF16 conversion is default enabled, FP16 can be executed if 'device' of config is 'gpu'. Please refer to this [document](./quantization_mixed_precision.md) for its workflow.**
+|Framework    |Backend                  |Backend Library|Backend Value    |Support Device (cpu as default)|Support BF16|Support FP16|
+|-------------|-------------------------|---------------|-----------------|-------------------------------|:----------:|:----------:|
+|PyTorch      |FX                       |FBGEMM         |"default"        |cpu                            |✔           |:x:         |
+|PyTorch      |IPEX                     |OneDNN         |"ipex"           |cpu                            |✔           |:x:         |
+|ONNX Runtime |CPUExecutionProvider     |MLAS           |"default"        |cpu                            |:x:         |:x:         |
+|ONNX Runtime |TensorrtExecutionProvider|TensorRT       |"onnxrt_trt_ep"  |gpu                            |:x:         |:x:         |
+|ONNX Runtime |CUDAExecutionProvider    |CUDA           |"onnxrt_cuda_ep" |gpu                            |✔           |✔           |
+|ONNX Runtime |DnnlExecutionProvider    |OneDNN         |"onnxrt_dnnl_ep" |cpu                            |✔           |:x:         |
+|TensorFlow   |TensorFlow               |OneDNN         |"default"        |cpu                            |✔           |:x:         |
+|TensorFlow   |ITEX                     |OneDNN         |"itex"           |cpu \| gpu                     |✔           |:x:         |
+|MXNet        |OneDNN                   |OneDNN         |"default"        |cpu                            |✔           |:x:         |
+
+> **During quantization, BF16 conversion is enabled by default; FP16 can be executed if the 'device' of the config is 'gpu' and the 'backend' is 'onnxrt_cuda_ep'. Please refer to this [document](./quantization_mixed_precision.md) for its workflow.**
 
 ## Get Started with Mixed Precision API
 
@@ -82,12 +167,14 @@ There are some pre-requirements to run mixed precision examples for each framewo
 
   #### ONNX Runtime
-  1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
-  2. Software: onnxruntime-gpu.
+  1. Hardware: GPU, or a CPU that supports the `avx512_bf16` instruction set.
+  2. Software: onnxruntime-gpu (for GPU) or onnxruntime-dnnl (for CPU; refer to this [doc](https://onnxruntime.ai/docs/build/eps.html#onednn) to build onnxruntime-dnnl from source).
 
 - **FP16:**
 
   #### ONNX Runtime
-  1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
+  1. Hardware: GPU.
   2. Software: onnxruntime-gpu.
+
+> **Note: Please set the 'backend' and 'device' values of the config according to the [Mixed Precision Support Matrix](#mixed-precision-support-matrix) when using mixed precision.**
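The BF16 hardware prerequisite above can be probed on Linux by scanning the CPU feature flags. A minimal sketch, assuming a Linux-style `/proc/cpuinfo` layout; the helper name `cpu_supports_avx512_bf16` is ours, not part of any library:

```python
def cpu_supports_avx512_bf16(cpuinfo_path="/proc/cpuinfo"):
    """Return True if any 'flags' line in the cpuinfo file lists avx512_bf16.

    Illustrative only: assumes the Linux /proc/cpuinfo format, where each
    logical CPU has a line starting with 'flags' listing its feature bits.
    """
    try:
        with open(cpuinfo_path) as f:
            return any("avx512_bf16" in line
                       for line in f if line.startswith("flags"))
    except OSError:
        # File missing (e.g. non-Linux system): report no support.
        return False
```

On machines without `avx512_bf16` (or on non-Linux systems) this returns `False`, in which case the onnxruntime-dnnl BF16 path above would fall back to slower emulation or be unavailable.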
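The support matrix can be read as a lookup from (framework, backend value, device) to the low precisions it allows. The sketch below encodes the table for illustration only; `SUPPORT_MATRIX` and `supported_precisions` are hypothetical names, not part of the Neural Compressor API:

```python
# (framework, backend value, device) -> low precisions listed in the matrix.
# Transcribed from the Mixed Precision Support Matrix above.
SUPPORT_MATRIX = {
    ("pytorch",      "default",        "cpu"): {"bf16"},
    ("pytorch",      "ipex",           "cpu"): {"bf16"},
    ("onnxruntime",  "default",        "cpu"): set(),
    ("onnxruntime",  "onnxrt_trt_ep",  "gpu"): set(),
    ("onnxruntime",  "onnxrt_cuda_ep", "gpu"): {"bf16", "fp16"},
    ("onnxruntime",  "onnxrt_dnnl_ep", "cpu"): {"bf16"},
    ("tensorflow",   "default",        "cpu"): {"bf16"},
    ("tensorflow",   "itex",           "cpu"): {"bf16"},
    ("tensorflow",   "itex",           "gpu"): {"bf16"},
    ("mxnet",        "default",        "cpu"): {"bf16"},
}

def supported_precisions(framework, backend, device="cpu"):
    """Return the low precisions the matrix lists for this combination.

    Unknown combinations yield an empty set, i.e. FP32 only.
    'cpu' is the default device, matching the matrix header.
    """
    return SUPPORT_MATRIX.get((framework, backend, device), set())
```

For example, `supported_precisions("onnxruntime", "onnxrt_cuda_ep", "gpu")` yields both `bf16` and `fp16`, matching the note that FP16 requires the 'gpu' device with the 'onnxrt_cuda_ep' backend.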