v0.12
Performance optimizations
- Improved performance of fp32 direct and Winograd convolution on Intel(R) Xeon(R) processors with Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) support
- Improved performance of int8 direct convolution on Intel Xeon processors with the Intel AVX-512 instruction set
- Improved batch normalization performance on Intel Xeon processors with the Intel AVX-512 instruction set
- Optimized dilated convolution backward propagation
- Improved initialization time of GEMM-based convolution implementations
New functionality
- Support for int8 inference. The following primitives support the int8 data type (see the quantization sketch after this list):
  - reorders (including quantization and dequantization)
  - convolution
  - pooling
  - eltwise
  - sum
  - concat
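
A minimal sketch of int8 quantization, assuming the v0.12 C++ API (mkldnn.hpp): fp32 data is converted to s8 by a reorder whose primitive attributes carry an output scale and rounding mode. The tensor shape and the scale value 64.f are hypothetical, chosen only for illustration.

```cpp
#include <vector>
#include "mkldnn.hpp"
using namespace mkldnn;

int main() {
    engine eng(engine::cpu, 0);

    // Hypothetical tensor shape and data.
    memory::dims dims = {1, 16, 8, 8};
    std::vector<float> src_f32(1 * 16 * 8 * 8, 0.5f);
    std::vector<int8_t> dst_s8(src_f32.size());

    // fp32 source and int8 destination memory primitives.
    auto src_md = memory::desc(dims, memory::data_type::f32, memory::format::nchw);
    auto dst_md = memory::desc(dims, memory::data_type::s8, memory::format::nchw);
    auto src_mem = memory({src_md, eng}, src_f32.data());
    auto dst_mem = memory({dst_md, eng}, dst_s8.data());

    // Quantization parameters are attached via primitive attributes:
    // a single common scale (mask = 0) and nearest rounding.
    primitive_attr attr;
    attr.set_int_output_round_mode(round_mode::round_nearest);
    attr.set_output_scales(0, {64.f});

    auto reorder_pd = reorder::primitive_desc(src_mem.get_primitive_desc(),
                                              dst_mem.get_primitive_desc(), attr);
    std::vector<primitive> net;
    net.push_back(reorder(reorder_pd, src_mem, dst_mem));
    stream(stream::kind::eager).submit(net).wait();
    return 0;
}
```

Dequantization follows the same pattern with the data types swapped and a reciprocal scale.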
- Layer fusion support via the new post-ops API (see the sketch after this list). Primitives that support fusion:
  - forward convolution with eltwise, for inference and training
  - convolution with sum, for inference
  - batch normalization with eltwise, for training
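
A minimal sketch of the post-ops mechanism, assuming the v0.12 C++ API: a ReLU is fused into a forward convolution by appending an eltwise post-op to the primitive attributes, so the activation is applied to the convolution result before it is written to memory. The shapes are hypothetical.

```cpp
#include "mkldnn.hpp"
using namespace mkldnn;

int main() {
    engine eng(engine::cpu, 0);

    // format::any lets the library choose optimized layouts.
    memory::desc src_md({1, 16, 13, 13}, memory::data_type::f32, memory::format::any);
    memory::desc wei_md({32, 16, 3, 3}, memory::data_type::f32, memory::format::any);
    memory::desc dst_md({1, 32, 11, 11}, memory::data_type::f32, memory::format::any);

    convolution_forward::desc conv_d(prop_kind::forward_inference,
            convolution_direct, src_md, wei_md, dst_md,
            {1, 1} /* strides */, {0, 0}, {0, 0} /* padding */,
            padding_kind::zero);

    // Append an eltwise ReLU post-op; this replaces a separate ReLU pass.
    post_ops ops;
    ops.append_eltwise(1.0f /* scale */, eltwise_relu,
                       0.0f /* negative slope */, 0.0f);
    primitive_attr attr;
    attr.set_post_ops(ops);

    auto conv_pd = convolution_forward::primitive_desc(conv_d, attr, eng);
    (void)conv_pd; // create memory, the primitive, and submit as usual
    return 0;
}
```

Fusing with sum works the same way through post_ops::append_sum, which accumulates the convolution result into the existing destination contents.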
API deprecations and breaking changes
- The ReLU primitive is deprecated; its functionality is now part of the eltwise primitive (see the sketch after this list)
- The merged convolution/ReLU primitive is deprecated; the same functionality is available through the new post-ops API
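
As a migration sketch, assuming the v0.12 C++ API, a standalone ReLU expressed through the eltwise primitive; the tensor shape is hypothetical.

```cpp
#include <vector>
#include "mkldnn.hpp"
using namespace mkldnn;

int main() {
    engine eng(engine::cpu, 0);

    memory::dims dims = {1, 16, 8, 8};
    std::vector<float> data(1 * 16 * 8 * 8, -1.0f);

    auto md = memory::desc(dims, memory::data_type::f32, memory::format::nchw);
    auto src = memory({md, eng}, data.data());
    auto dst = memory({md, eng}, data.data()); // in-place for brevity

    // eltwise_relu with alpha = 0 (negative slope) matches the old ReLU primitive.
    eltwise_forward::desc relu_d(prop_kind::forward_inference,
            eltwise_relu, md, 0.0f /* alpha */, 0.0f /* beta */);
    auto relu_pd = eltwise_forward::primitive_desc(relu_d, eng);

    std::vector<primitive> net;
    net.push_back(eltwise_forward(relu_pd, src, dst));
    stream(stream::kind::eager).submit(net).wait();
    return 0;
}
```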
Thanks to the contributors
This release contains contributions from many Intel(R) Performance Libraries developers as well as @kruus, Yong Wu, Daoxin Pan, and Zhiming Wang. We would also like to thank everyone who asked questions and reported issues.
* Other names and brands may be claimed as the property of others.