The LBANN Distconv adapter for layers requires that only the first input tensor to a distconv-enabled layer may be a non-DiHydrogen tensor. We raise an error if any other input tensor requires a copy to a DiHydrogen tensor. The following checks are done:
LBANN_ERROR(layer().get_name(), ": Copyout of non-first tensor not supported");
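To make the restriction concrete, here is a minimal standalone sketch of the kind of check described above. It is not the LBANN implementation: `ParentTensor` and `parents_requiring_copy` are hypothetical names, a plain `std::runtime_error` stands in for `LBANN_ERROR`, and the message text is reused from the snippet above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parent (input) tensor; not an LBANN type.
struct ParentTensor {
  bool is_dihydrogen;  // true if the parent already provides a DiHydrogen tensor
};

// Returns the indices of parents that need a copy to a DiHydrogen tensor.
// Assumption: mirrors the restriction in the issue, i.e. only parent 0
// may require a copy; any later parent requiring one is an error.
std::vector<std::size_t> parents_requiring_copy(
    const std::string& layer_name,
    const std::vector<ParentTensor>& parents) {
  std::vector<std::size_t> to_copy;
  for (std::size_t i = 0; i < parents.size(); ++i) {
    if (parents[i].is_dihydrogen) {
      continue;  // already a DiHydrogen tensor, no copy needed
    }
    if (i != 0) {
      // Stand-in for LBANN_ERROR(...) in the real adapter.
      throw std::runtime_error(
          layer_name + ": Copyout of non-first tensor not supported");
    }
    to_copy.push_back(i);
  }
  return to_copy;
}
```

With this sketch, a layer whose second parent is a non-DiHydrogen tensor throws, which is the situation multi-input layers can run into.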
While these checks worked for the original DC layers (Convolution, MSE, ReLU), newer DC layers such as Scatter, Gather, and MatMul generally have more than one input that may need to be copied to a DiHydrogen tensor, so ideally we should support multiple parent tensors requiring a copy. Simply removing the checks resulted in failing CI tests.
A possible workaround, using an Identity layer as a copy layer, also has issues: #2126
The checks are located in lbann/src/layers/data_type_distconv_adapter.cpp at commit 3b0ea84, on lines 329, 646, 787, 812, 836, and 861.