Releases · oneapi-src/oneDNN
v0.21.1
This is a patch release containing the following changes to Intel MKL-DNN v0.21:
v0.21
Performance optimizations
- Improved int8 and fp32 GEMM and inner product performance.
- Improved reorder performance for certain shapes.
- Improved RNN, LSTM, GRU and LBR-GRU training performance.
New functionality
- Added GELU activation support (a reference formula sketch follows below).
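For reference, GELU is defined as GELU(x) = 0.5 · x · (1 + erf(x / √2)). The snippet below is a minimal standalone C++ sketch of that formula, useful for sanity-checking results; it is an illustration only and does not use the Intel MKL-DNN API.

```cpp
#include <cmath>
#include <cstdio>

// Reference GELU using the erf-based definition:
// GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
static float gelu_ref(float x) {
    return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}

int main() {
    const float inputs[] = {-2.0f, -0.5f, 0.0f, 0.5f, 2.0f};
    for (float x : inputs)
        std::printf("gelu(%+.2f) = %+.6f\n", x, gelu_ref(x));
    return 0;
}
```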
Thanks to the contributors
This release contains contributions from many Intel Performance Libraries developers. We would also like to thank everyone who asked questions and reported issues.
v1.1-rc
This is a release candidate for DNNL v1.1. Please provide feedback and report bugs in GitHub issues.
v0.20.5
This is a patch release containing the following changes to Intel MKL-DNN v0.20.4:
v0.20.4
v0.21-rc
This is a release candidate for Intel MKL-DNN v0.21. Please provide feedback and report bugs in GitHub issues.
v0.20.3
v1.0.2
This is a patch release containing the following changes to Intel MKL-DNN v1.0.1:
- Fixed an issue with bfloat16 instruction detection in Xbyak (0f4ba11)
- Fixed the buffer size in packed GEMM (9764940)
- Fixed an offset calculation issue in the fp32 and bfloat16 weight update depthwise convolution kernels (6b9d412, 061499d)
- Added a check that the size of the generated kernel does not exceed the maximum allowed bound in the fp32 forward and backward kernels (67e8cd2)
- Various fixes in the RNN primitive:
  - Proper handling of packed GEMM in extended GEMM (4eb9f56)
  - Forced no-copy GEMM only on Intel AVX+ systems (2fbc8ba)
  - Avoided unaligned pointer usage in VEX instructions in the GRU cell (a147c08)
  - Fixed a wrong dimension when creating the GEMM primitive descriptor in the reference RNN implementation for GPU (eb3c866)
  - Fixed the Tanh backward calculation in the GPU RNN reference implementation (f6e4b97)
- Fixed pack GEMM dispatching for int8 (16b46c7)
- Addressed bugs in tests for RNNs (cf83e83, f7c2de2, 960f3f3)
v0.20.2
This is a patch release containing the following changes to Intel MKL-DNN v0.20.1:
- Fixed an issue with bfloat16 instruction detection in Xbyak (b59bf2e)
- Fixed an offset calculation issue in the fp32 and bfloat16 weight update depthwise convolution kernels (ddc54e5, 0982b25)
- Added a check that the size of the generated kernel does not exceed the maximum allowed bound in the fp32 forward and backward kernels (24abe20)
- Various fixes in RNN primitive: