See DirectML version history on MSDN for more detailed notes.
- Introduced DML_FEATURE_LEVEL_6_1.
- Added DML_OPERATOR_MULTIHEAD_ATTENTION.
- Added DML_OPERATOR_ACTIVATION_SOFTMAX and DML_OPERATOR_ACTIVATION_SOFTMAX1 to the list of fusible activations for DML_OPERATOR_GEMM.
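For context, fusing one of these activations into a GEMM looks roughly like the sketch below. This is a minimal, hedged example: tensor descriptions, axis choice, and error handling are elided, and field usage should be checked against the `DirectML.h` in your SDK.

```cpp
// Sketch: fuse SOFTMAX1 into a GEMM so the activation runs in the same
// dispatch. Tensor descs and GEMM parameters are elided for brevity.
DML_ACTIVATION_SOFTMAX1_OPERATOR_DESC softmax1 = {};
UINT axes[] = { 1 };          // assumed softmax axis for illustration
softmax1.AxisCount = 1;
softmax1.Axes = axes;
// InputTensor/OutputTensor stay null when used as a fused activation.

DML_OPERATOR_DESC fusedActivation = { DML_OPERATOR_ACTIVATION_SOFTMAX1, &softmax1 };

DML_GEMM_OPERATOR_DESC gemm = {};
// ... ATensor, BTensor, OutputTensor, TransA/TransB, Alpha/Beta setup elided ...
gemm.FusedActivation = &fusedActivation;

DML_OPERATOR_DESC gemmDesc = { DML_OPERATOR_GEMM, &gemm };
// Pass gemmDesc to IDMLDevice::CreateOperator as usual.
```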
- Introduced DML_FEATURE_LEVEL_6_0.
- Added UINT64 and INT64 data type support for DML_OPERATOR_ELEMENT_WISE_DIVIDE, DML_OPERATOR_ELEMENT_WISE_MODULUS_FLOOR, and DML_OPERATOR_ELEMENT_WISE_MODULUS_TRUNCATE.
- Added FLOAT16 data type support in ScaleTensor for DML_OPERATOR_ELEMENT_WISE_QUANTIZE_LINEAR.
- Added FLOAT16 data type support in ScaleTensor and OutputTensor for DML_OPERATOR_ELEMENT_WISE_DEQUANTIZE_LINEAR.
- Added the DML_OPERATOR_ELEMENT_WISE_CLIP operator to the supported fused activation list.
- Improved performance of DML_OPERATOR_ACTIVATION_SOFTMAX/1.
- Improved performance of DML_OPERATOR_GEMM on certain GPU hardware.
- Improved performance of DML_OPERATOR_CONVOLUTION_INTEGER and DML_OPERATOR_QUANTIZED_LINEAR_CONVOLUTION on INT8 tensors.
- Fixed a bug in DML_OPERATOR_CONVOLUTION_INTEGER and DML_OPERATOR_QUANTIZED_LINEAR_CONVOLUTION on certain GPU hardware.
- Fixed a crash in DML graph compilation when DML_OPERATOR_ROI_ALIGN_GRAD is used.
- Fixed a Windows App Certification Kit failure in Windows apps using DirectML.
- Fixed a bug in DMLCreateDevice1 that caused it to incorrectly fail when a minimumFeatureLevel greater than DML_FEATURE_LEVEL_5_0 was supplied.
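As a hedged sketch of the call the fix affects, requesting a minimum feature level at device creation looks like this (`d3d12Device` is an assumed, already-created `ID3D12Device`; error handling elided):

```cpp
// Sketch: ask for at least DML_FEATURE_LEVEL_5_2 when creating the device.
// With the fix above, levels above 5.0 no longer fail spuriously.
Microsoft::WRL::ComPtr<IDMLDevice> dmlDevice;
HRESULT hr = DMLCreateDevice1(
    d3d12Device.Get(),            // existing ID3D12Device
    DML_CREATE_DEVICE_FLAG_NONE,
    DML_FEATURE_LEVEL_5_2,        // minimumFeatureLevel
    IID_PPV_ARGS(&dmlDevice));
// hr is DXGI_ERROR_UNSUPPORTED if the device cannot meet that level.
```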
- Introduced DML_FEATURE_LEVEL_5_2.
- Expanded supported rank count across certain DML operators.
- Enabled DML_OPERATOR_MEAN_VARIANCE_NORMALIZATION/1 to accept an optional ScaleTensor regardless of the value of the BiasTensor, and vice versa.
- Enabled DMLCreateDevice/1 to use the DML_CREATE_DEVICE_FLAG_DEBUG flag even if the Direct3D 12 debug layer is not enabled.
- Significantly improved performance of the DML_OPERATOR_RESAMPLE/1/2 operators.
- Significantly improved performance of graph-level layout transformation logic.
- Fixed a precision issue in DML_OPERATOR_ELEMENT_WISE_MODULUS_TRUNCATE/FLOOR with non-power-of-2 operands on specific GPUs.
- Fixed DML_OPERATOR_CAST when casting between 16-bit and 64-bit data types on specific GPUs.
- Fixed DML_OPERATOR_ACTIVATION_HARDMAX/1 results for certain OutputTensor strides.
- Fixed a bug in the DML_OPERATOR_ONE_HOT operator when using large UINT64 indices.
- Improved FP32 convolution performance.
- Improved DML_OPERATOR_JOIN operator performance.
- Fixed a bug with unconnected split nodes when executing a DML graph.
- Fixed identity node optimization near the end of a DML graph.
- Introduces DML_FEATURE_LEVEL_5_1
- Adds 7 new operators:
- DML_OPERATOR_ACTIVATION_GELU
- DML_OPERATOR_ACTIVATION_SOFTMAX1
- DML_OPERATOR_ACTIVATION_LOG_SOFTMAX1
- DML_OPERATOR_ACTIVATION_HARDMAX1
- DML_OPERATOR_RESAMPLE2
- DML_OPERATOR_RESAMPLE_GRAD1
- DML_OPERATOR_DIAGONAL_MATRIX1
- Significant performance boost for GRU.
- INT8 convolution performance improvement using DP4A HLSL intrinsics.
- Fix a Linux-specific execution failure in a TensorFlow graph caused by bad alignment related to the bitscan-forward instruction.
- Fix incorrect results in 2D convolution for certain parameter combinations where group count > 1.
- Fix telemetry bug that caused slower CPU execution over time with repeated operator creation.
- Introduces DML_FEATURE_LEVEL_5_0
- Adds 4 new operators:
- DML_OPERATOR_ELEMENT_WISE_CLIP1
- DML_OPERATOR_ELEMENT_WISE_CLIP_GRAD1
- DML_OPERATOR_PADDING1
- DML_OPERATOR_ELEMENT_WISE_NEGATE
- Supports 64-bit data type for operators: CLIP, CLIP_GRAD, CUMULATIVE_SUMMATION, CUMULATIVE_PRODUCT, ELEMENT_WISE_MAX, ELEMENT_WISE_MIN, REDUCE+REDUCE_FUNCTION_MAX, REDUCE+REDUCE_FUNCTION_SUM, REDUCE+REDUCE_FUNCTION_MULTIPLY, REDUCE+REDUCE_FUNCTION_SUM_SQUARE, REDUCE+REDUCE_FUNCTION_L1, PADDING, SPACE_TO_DEPTH, DEPTH_TO_SPACE, TOP_K, ELEMENT_WISE_NEGATE, ELEMENT_WISE_IF, MAX_POOLING, MAX_UNPOOLING, FILL_VALUE_SEQUENCE, REVERSE_SUBSEQUENCES, and the BatchIndicesTensor of ROI_ALIGN.
- Bug fixes.
- Introduces DML_FEATURE_LEVEL_4_1
- Adds 3 new operators:
- DML_OPERATOR_ROI_ALIGN_GRAD
- DML_OPERATOR_BATCH_NORMALIZATION_TRAINING
- DML_OPERATOR_BATCH_NORMALIZATION_TRAINING_GRAD
- Supports 64-bit data type for operators: ELEMENT_WISE_IDENTITY, ELEMENT_WISE_ADD, ELEMENT_WISE_SUBTRACT, ELEMENT_WISE_MULTIPLY, ELEMENT_WISE_ABS, ELEMENT_WISE_SIGN, ELEMENT_WISE_LOGICAL_EQUALS, ELEMENT_WISE_LOGICAL_GREATER_THAN, ELEMENT_WISE_LOGICAL_LESS_THAN, ELEMENT_WISE_LOGICAL_GREATER_THAN_OR_EQUAL, ELEMENT_WISE_LOGICAL_LESS_THAN_OR_EQUAL, ELEMENT_WISE_BIT_SHIFT_LEFT, ELEMENT_WISE_BIT_SHIFT_RIGHT, ELEMENT_WISE_BIT_AND, ELEMENT_WISE_BIT_OR, ELEMENT_WISE_BIT_NOT, ELEMENT_WISE_BIT_XOR, ELEMENT_WISE_BIT_COUNT, ARGMIN, ARGMAX, CAST, SLICE, SLICE1, SLICE_GRAD, SPLIT, JOIN, GATHER, GATHER_ELEMENTS, GATHER_ND, GATHER_ND1, SCATTER, SCATTER_ND, FILL_VALUE_CONSTANT, TILE, ONE_HOT
- Substantial performance improvements for several operators (especially in training scenarios).
- Bug fixes.
- Introduces DML_FEATURE_LEVEL_4_0
- Adds 3 new operators:
- DML_OPERATOR_ELEMENT_WISE_QUANTIZED_LINEAR_ADD
- DML_OPERATOR_DYNAMIC_QUANTIZE_LINEAR
- DML_OPERATOR_ROI_ALIGN1
- Supports 8D tensors for operators: FILL_VALUE_CONSTANT, FILL_VALUE_SEQUENCE, CUMULATIVE_SUMMATION, CUMULATIVE_PRODUCT, REVERSE_SUBSEQUENCES, ACTIVATION_RELU_GRAD, RANDOM_GENERATOR, NONZERO_COORDINATES, ADAM_OPTIMIZER, DYNAMIC_QUANTIZE_LINEAR, ELEMENT_WISE_QUANTIZED_LINEAR_ADD
- Substantial performance improvements for several operators.
- Bug fixes.
- Adds a workaround for a driver issue that affects some Intel devices. For the best performance, it is recommended to use the latest drivers.
- Introduces a new feature level: DML_FEATURE_LEVEL_3_1
- Adds 6 new operators:
- DML_OPERATOR_ELEMENT_WISE_ATAN_YX
- DML_OPERATOR_ELEMENT_WISE_CLIP_GRAD
- DML_OPERATOR_ELEMENT_WISE_DIFFERENCE_SQUARE
- DML_OPERATOR_LOCAL_RESPONSE_NORMALIZATION_GRAD
- DML_OPERATOR_CUMULATIVE_PRODUCT
- DML_OPERATOR_BATCH_NORMALIZATION_GRAD
- Supports 8D tensors for operators: ELEMENT_WISE_CLIP_GRAD, ELEMENT_WISE_DIFFERENCE_SQUARE, ELEMENT_WISE_ATAN_YX, CAST, JOIN, PADDING, TILE, TOP_K, BATCH_NORMALIZATION, BATCH_NORMALIZATION_GRAD, LP_NORMALIZATION, TOP_K1, MEAN_VARIANCE_NORMALIZATION1, SLICE_GRAD
- Initial support for ARM/ARM64 builds of DirectML.
- Substantial performance improvements for several operators.
- Bug fixes.
- Fix a performance issue for NHWC layouts when fusing activations with Convolution/GEMM/Normalization.
- Add PIX marker support to the redistributable to enable profiling graphs at the operator level.
- Bug fixes related to metacommands:
- Fix a DML_OPERATOR_BATCH_NORMALIZATION crash when the operator is created with DimensionCount > 5.
- Fix the DML_OPERATOR_MAX_POOLING1/2 binding order for the optional output indices tensor. This did not affect the output, but with GPU-based validation enabled, the error "Supplied parameters size doesn't match enumerated size" would be reported.
- First release of DirectML as a redistributable NuGet package, Microsoft.AI.DirectML.
- Introduces two new feature levels since DirectML 1.1.0: DML_FEATURE_LEVEL_3_0 and DML_FEATURE_LEVEL_2_1.
- Adds 44 new operators.
- The maximum number of tensor dimensions has been increased from 5 to 8 for operators: ELEMENT_WISE_IDENTITY, ELEMENT_WISE_ABS, ELEMENT_WISE_ACOS, ELEMENT_WISE_ADD, ELEMENT_WISE_ASIN, ELEMENT_WISE_ATAN, ELEMENT_WISE_CEIL, ELEMENT_WISE_CLIP, ELEMENT_WISE_COS, ELEMENT_WISE_DIVIDE, ELEMENT_WISE_EXP, ELEMENT_WISE_FLOOR, ELEMENT_WISE_LOG, ELEMENT_WISE_LOGICAL_AND, ELEMENT_WISE_LOGICAL_EQUALS, ELEMENT_WISE_LOGICAL_GREATER_THAN, ELEMENT_WISE_LOGICAL_LESS_THAN, ELEMENT_WISE_LOGICAL_GREATER_THAN_OR_EQUAL, ELEMENT_WISE_LOGICAL_LESS_THAN_OR_EQUAL, ELEMENT_WISE_LOGICAL_NOT, ELEMENT_WISE_LOGICAL_OR, ELEMENT_WISE_LOGICAL_XOR, ELEMENT_WISE_MAX, ELEMENT_WISE_MEAN, ELEMENT_WISE_MIN, ELEMENT_WISE_MULTIPLY, ELEMENT_WISE_POW, ELEMENT_WISE_CONSTANT_POW, ELEMENT_WISE_RECIP, ELEMENT_WISE_SIN, ELEMENT_WISE_SQRT, ELEMENT_WISE_SUBTRACT, ELEMENT_WISE_TAN, ELEMENT_WISE_THRESHOLD, ELEMENT_WISE_QUANTIZE_LINEAR, ELEMENT_WISE_DEQUANTIZE_LINEAR, ARGMIN, ARGMAX, SLICE, SPLIT, GATHER, ELEMENT_WISE_SIGN, ELEMENT_WISE_IS_NAN, ELEMENT_WISE_ERF, ELEMENT_WISE_SINH, ELEMENT_WISE_COSH, ELEMENT_WISE_TANH, ELEMENT_WISE_ASINH, ELEMENT_WISE_ACOSH, ELEMENT_WISE_ATANH, ELEMENT_WISE_IF, ELEMENT_WISE_ADD1, SCATTER, ONE_HOT, ELEMENT_WISE_BIT_SHIFT_LEFT, ELEMENT_WISE_BIT_SHIFT_RIGHT, ELEMENT_WISE_ROUND, ELEMENT_WISE_IS_INFINITY, ELEMENT_WISE_MODULUS_TRUNCATE, ELEMENT_WISE_MODULUS_FLOOR, GATHER_ELEMENTS, GATHER_ND, SCATTER_ND, SLICE1, ELEMENT_WISE_BIT_AND, ELEMENT_WISE_BIT_OR, ELEMENT_WISE_BIT_XOR, ELEMENT_WISE_BIT_NOT, ELEMENT_WISE_BIT_COUNT, GATHER_ND1
- Select operators support additional tensor data types.
- Substantial performance improvements for several operators.
- Bug fixes.
- Introduces a new feature level: DML_FEATURE_LEVEL_2_0.
- Adds 19 new operators.
- When binding an input resource for dispatch of an IDMLOperatorInitializer, it is now legal to provide a resource with D3D12_HEAP_TYPE_CUSTOM (in addition to D3D12_HEAP_TYPE_DEFAULT), as long as appropriate heap properties are also set.
- Select operators support 8-bit integer tensors.
- 5D activation functions now support the use of strides on their input and output tensors.
- Substantial performance improvements for several operators.
- Bug fixes.
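The custom-heap binding rule above can be sketched as follows. This is a hedged example: the property values shown are one assumed way to mirror D3D12_HEAP_TYPE_DEFAULT on a discrete GPU, and resource creation is elided.

```cpp
// Sketch: a D3D12_HEAP_TYPE_CUSTOM heap whose properties mirror a
// DEFAULT heap (GPU-local, no CPU access), which is now legal to bind
// as an input when dispatching an IDMLOperatorInitializer.
D3D12_HEAP_PROPERTIES heapProps = {};
heapProps.Type = D3D12_HEAP_TYPE_CUSTOM;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_NOT_AVAILABLE; // no CPU access
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_L1;             // GPU-local pool
// Create the buffer with ID3D12Device::CreateCommittedResource using
// heapProps, then bind it through a DML_BUFFER_BINDING as usual.
```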
- First release of DirectML