Update mixed_precision.md
mengniwang95 authored Jun 6, 2023
1 parent 5c9c7a4 commit 6f80a89
Showing 1 changed file with 98 additions and 11 deletions: docs/source/mixed_precision.md
@@ -20,14 +20,99 @@ The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Coope

## Mixed Precision Support Matrix

|Framework |BF16 |FP16 |
|--------------|:-----------:|:-----------:|
|TensorFlow |✔ |:x: |
|PyTorch |✔ |:x: |
|ONNX Runtime |✔ |✔ |
|MXNet |✔ |:x: |

> **During quantization, BF16 conversion is enabled by default; FP16 can be executed if the config's 'device' is 'gpu'. Please refer to this [document](./quantization_mixed_precision.md) for the workflow.**
<table class="center">
<thead>
<tr>
<th>Framework</th>
<th>Backend</th>
<th>Backend Library</th>
<th>Backend Value</th>
<th>Support Device (cpu as default)</th>
<th>Support BF16</th>
<th>Support FP16</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2" align="left">PyTorch</td>
<td align="left">FX</td>
<td align="left">FBGEMM</td>
<td align="left">"default"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td align="left">IPEX</td>
<td align="left">OneDNN</td>
<td align="left">"ipex"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td rowspan="4" align="left">ONNX Runtime</td>
<td align="left">CPUExecutionProvider</td>
<td align="left">MLAS</td>
<td align="left">"default"</td>
<td align="left">cpu</td>
<td align="left">:x:</td>
<td align="left">:x:</td>
</tr>
<tr>
<td align="left">TensorrtExecutionProvider</td>
<td align="left">TensorRT</td>
<td align="left">"onnxrt_trt_ep"</td>
<td align="left">gpu</td>
<td align="left">:x:</td>
<td align="left">:x:</td>
</tr>
<tr>
<td align="left">CUDAExecutionProvider</td>
<td align="left">CUDA</td>
<td align="left">"onnxrt_cuda_ep"</td>
<td align="left">gpu</td>
<td align="left">&#10004;</td>
<td align="left">&#10004;</td>
</tr>
<tr>
<td align="left">DnnlExecutionProvider</td>
<td align="left">OneDNN</td>
<td align="left">"onnxrt_dnnl_ep"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td rowspan="2" align="left">Tensorflow</td>
<td align="left">Tensorflow</td>
<td align="left">OneDNN</td>
<td align="left">"default"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td align="left">ITEX</td>
<td align="left">OneDNN</td>
<td align="left">"itex"</td>
<td align="left">cpu | gpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td align="left">MXNet</td>
<td align="left">OneDNN</td>
<td align="left">OneDNN</td>
<td align="left">"default"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
</tbody>
</table>

> **During quantization, BF16 conversion is enabled by default; FP16 can be executed if the config's 'device' is 'gpu' and its 'backend' is 'onnxrt_cuda_ep'. Please refer to this [document](./quantization_mixed_precision.md) for the workflow.**
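
For illustration, a minimal sketch of driving that conversion (assuming the Neural Compressor 2.x `MixedPrecisionConfig`/`fit` API; the model path is a hypothetical placeholder, not from this doc):

```python
# Sketch only: assumes neural-compressor 2.x is installed.
from neural_compressor import MixedPrecisionConfig
from neural_compressor.mix_precision import fit

model = "model.onnx"  # hypothetical FP32 model path (or an in-memory framework model)

conf = MixedPrecisionConfig()            # BF16 conversion is enabled by default
converted_model = fit(model, conf=conf)  # returns the mixed-precision model
converted_model.save("./bf16_model")     # persist the converted model
```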
## Get Started with Mixed Precision API

@@ -82,12 +167,14 @@ There are some pre-requirements to run mixed precision examples for each framewo

#### ONNX Runtime

1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
2. Software: onnxruntime-gpu.
1. Hardware: GPU, or a CPU that supports the `avx512_bf16` instruction set.
2. Software: onnxruntime-gpu (for GPU) or onnxruntime-dnnl (for CPU; refer to this [doc](https://onnxruntime.ai/docs/build/eps.html#onednn) to build onnxruntime-dnnl from source). A quick provider check is sketched below.
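
A quick way to verify the software prerequisite (a hedged sketch using onnxruntime's standard provider query; not specific to this doc):

```python
# Sketch: check that the installed onnxruntime build exposes a provider
# capable of BF16 (DnnlExecutionProvider from onnxruntime-dnnl on CPU,
# CUDAExecutionProvider from onnxruntime-gpu on GPU).
# CPU-side BF16 hardware support can be checked on Linux with: lscpu | grep bf16
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)
assert "DnnlExecutionProvider" in providers or "CUDAExecutionProvider" in providers, \
    "Install onnxruntime-dnnl or onnxruntime-gpu to run BF16 mixed precision."
```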

- **FP16:**

#### ONNX Runtime

1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
1. Hardware: GPU.
2. Software: onnxruntime-gpu.

> **Note: Please set the backend and device values of the config according to the [Mixed Precision Support Matrix](#mixed-precision-support-matrix) for mixed precision.**
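
Putting that note into practice, a hedged FP16 sketch (same assumed 2.x API as above; device and backend values taken from the support matrix, the model path is a placeholder):

```python
# Sketch only: FP16 conversion needs a GPU, onnxruntime-gpu, and the
# CUDA execution provider; "model.onnx" is a hypothetical model path.
from neural_compressor import MixedPrecisionConfig
from neural_compressor.mix_precision import fit

conf = MixedPrecisionConfig(
    device="gpu",              # FP16 requires the GPU device...
    backend="onnxrt_cuda_ep",  # ...and the CUDA execution provider
    precisions="fp16",
)
converted_model = fit("model.onnx", conf=conf)
converted_model.save("./fp16_model")
```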
