v0.9
Performance optimizations
- Improved performance on processors with Intel(R) AVX2 instruction set support
- Improved performance on processors with Intel(R) AVX512 instruction set support
- Added optimizations for Intel(R) Xeon processors with Intel AVX512 instruction set support
- Added inference optimizations for Intel(R) Atom processors with Intel(R) SSE4.2 support
- Added JIT implementation of SGEMM for Intel(R) Xeon Phi(TM) processors
New functionality
- Average pooling supports 'exclude padding' mode
- LRN supports arbitrary local size
- Feature preview: Added int8 support in convolution, ReLU, pooling and inner product. Added optimized u8s8u8 convolution flavor for Intel Xeon processors with Intel AVX512 instruction set support.
- Feature preview: Added int16 support in convolution, ReLU, pooling and inner product. Added optimized s16s16s32 convolution flavor for future Intel Xeon Phi processors.
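To illustrate the 'exclude padding' mode of average pooling, here is a minimal sketch in plain Python. It is not the library's implementation or API: the function name `avg_pool_1d` and its parameters are hypothetical, and the example assumes 1D input with symmetric zero padding. The two modes differ only in the divisor: with padding included the sum is divided by the full kernel size, and with padding excluded it is divided by the number of window elements that fall inside the input.

```python
def avg_pool_1d(values, kernel, stride, pad, exclude_padding):
    """Average-pool a 1D list with symmetric zero padding (illustrative only)."""
    if pad >= kernel:
        # Assumption of this sketch: every window overlaps the real input.
        raise ValueError("pad must be smaller than kernel")
    n = len(values)
    out_len = (n + 2 * pad - kernel) // stride + 1
    out = []
    for o in range(out_len):
        start = o * stride - pad
        # Keep only the window positions that lie inside the real input.
        inside = [values[i] for i in range(start, start + kernel) if 0 <= i < n]
        # Padded positions add zero to the sum in both modes;
        # only the divisor changes.
        divisor = len(inside) if exclude_padding else kernel
        out.append(sum(inside) / divisor)
    return out


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0]
    print(avg_pool_1d(data, kernel=3, stride=1, pad=1, exclude_padding=True))
    print(avg_pool_1d(data, kernel=3, stride=1, pad=1, exclude_padding=False))
```

With padding excluded, border outputs are not pulled toward zero by the padded positions, which is usually the reason to choose this mode.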
Usability improvements
- Improved build system to enable integration into other projects
- Intel(R) OpenMP runtime is used when the library is built with the binary dependency
- Feature-based dispatcher added to support a wide range of Intel(R) and compatible processors
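The idea behind feature-based dispatch can be sketched as follows. This is a conceptual illustration only, not the library's dispatcher: the kernel names, the feature strings, and the `select_kernel` function are all hypothetical. The sketch assumes a table of implementations ordered from most to least specialized, and picks the first one whose required instruction set is reported as available, falling back to a reference implementation otherwise.

```python
# Hypothetical kernel table, ordered from most to least specialized.
# Neither the feature names nor the kernel names come from the library.
KERNELS = [
    ("avx512", "sgemm_avx512"),
    ("avx2", "sgemm_avx2"),
    ("sse4_2", "sgemm_sse42"),
]


def select_kernel(cpu_features, fallback="sgemm_reference"):
    """Return the name of the first kernel whose required feature is present."""
    for feature, kernel in KERNELS:
        if feature in cpu_features:
            return kernel
    # No specialized kernel matches: use the portable reference version.
    return fallback


if __name__ == "__main__":
    print(select_kernel({"sse4_2", "avx2"}))
    print(select_kernel(set()))
```

Selecting by detected features rather than by processor model is what lets the same binary run on compatible processors that expose the same instruction sets.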
Thanks to the contributors
This release contains contributions from many Intel(R) Performance Libraries developers as well as Ismo Puustinen @ipuustin, Dmitry Gorokhov, Vladimir Dudnik @vladimir-dudnik, @pruthviIntel, and Chris Olivier @cjolivier01. We would also like to thank everyone who asked questions and reported issues.