# moe
Here are 4 public repositories matching this topic...
HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.
Updated Jan 13, 2025 - Shell