Fix MixConv2d()
remove shortcut + apply depthwise
#5410
Merged
MixConv2d fixes:
Validation test:
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Enhanced neural network layers with SiLU activation and optimized MixConv2d implementation.
📊 Key Changes
- Replaced `LeakyReLU` activation with `SiLU` (Swish-1 activation function) in certain network layers for potentially improved performance.
- Refined the `MixConv2d` layer by introducing a cleaner convolution strategy, which takes into account the greatest common divisor (GCD) for grouped convolutions, and a new method for distributing channels over different kernels based on their size.
- `SiLU` follows the inplace pattern to maintain memory efficiency, as the previously used activations did.
🎯 Purpose & Impact
- Switching to `SiLU` activation may lead to improved training results, as SiLU often outperforms traditional activations like `LeakyReLU` in neural networks.
- The refined `MixConv2d` can provide better computational efficiency and precision in how input channels are distributed across different convolutional kernel sizes.
- Keeping `inplace` activation ensures the memory footprint of models stays low, which is beneficial for users with limited computing resources.
- The changes can result in more accurate models that are efficient in both operation and resource utilization, benefiting a wide range of users, from researchers to industry professionals implementing YOLOv5 in their systems.
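
The role of the GCD in the grouped convolutions mentioned under Key Changes can be illustrated with a small sketch. This is not the merged code: the function names `split_channels_equally` and `plan_mixconv` are hypothetical, and the sketch assumes the simplest possible strategy, an equal split of output channels across kernel sizes, rather than the size-based distribution the summary describes. It relies on one known constraint: in a grouped convolution the group count must divide both the input and the output channel count, and the GCD is the largest value that does.

```python
import math


def split_channels_equally(c2, kernels):
    """Split c2 output channels as evenly as possible across the kernel sizes.

    Assumption for illustration only: any remainder goes to the first branches.
    """
    n = len(kernels)
    base, extra = divmod(c2, n)
    return [base + (1 if g < extra else 0) for g in range(n)]


def plan_mixconv(c1, c2, kernels=(1, 3)):
    """Return per-branch settings for a mixed-kernel convolution (hypothetical helper).

    Each branch gets its own kernel size, "same"-style padding of k // 2, and
    the largest group count that divides both c1 and the branch's output
    channels, i.e. gcd(c1, c_out).
    """
    plan = []
    for k, c_out in zip(kernels, split_channels_equally(c2, kernels)):
        plan.append(
            {
                "kernel": k,
                "out_channels": c_out,
                "padding": k // 2,
                "groups": math.gcd(c1, c_out),
            }
        )
    return plan
```

With `c1=64, c2=128, kernels=(1, 3)`, each branch has 64 output channels and `groups=64`, so both branches are fully depthwise. With `c1=32, c2=50, kernels=(1, 3, 5)`, the 17-channel branches fall back to `groups=1` because 17 shares no factor with 32, while the 16-channel branch can use `groups=16`. In PyTorch, each entry would map onto the `kernel_size`, `padding`, and `groups` arguments of `nn.Conv2d`, with the branch outputs concatenated along the channel dimension.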