Support for T5 #560
Comments
What specific model would you like supported? We would only take this on if we saw sufficient interest (but in practice we see heavy movement towards decoder-only models).
Decoder-only models are great for generative use cases, but the T5 family is the workhorse for many discriminative tasks. For example, the flan-t5-base model has 2M downloads on Hugging Face in the last month. Support for flan-t5 would add huge value for the community.
It'd be great to have T5 models here as well.
I'm going to try to turn MaxText into an encoder-decoder anyway, so native support is of course also appreciated :)
https://github.com/p-doom/maxtext/tree/colab_temp We finally got around to implementing encoder-decoder models in our MaxText fork. The synthetic data pipeline seems to work. Will add support for the real data pipeline later today.
Okay, I was a bit too fast; I still have to fix a few things.
Do you have plans to support encoder-decoder models like T5? It would be great to have T5 with flash attention 😃
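For context on what makes encoder-decoder support a distinct feature: the decoder adds a cross-attention step in which decoder queries attend over the encoder's output states, something decoder-only stacks don't have. Here is a minimal NumPy sketch of that step (illustrative only, with hypothetical shapes and weights; not MaxText or T5 code — note that T5 itself omits the 1/sqrt(d) score scaling, which is included here for the standard formulation):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(dec_q, enc_out, w_k, w_v):
    """Decoder queries attend over encoder outputs (the encoder-decoder step).

    dec_q:   (tgt_len, d) decoder-side query vectors
    enc_out: (src_len, d) encoder hidden states
    """
    k = enc_out @ w_k                              # keys from encoder states
    v = enc_out @ w_v                              # values from encoder states
    scores = dec_q @ k.T / np.sqrt(k.shape[-1])    # (tgt_len, src_len)
    return softmax(scores) @ v                     # (tgt_len, d)

rng = np.random.default_rng(0)
d = 8
enc_out = rng.normal(size=(5, d))   # src_len = 5
dec_q = rng.normal(size=(3, d))     # tgt_len = 3
w_k = rng.normal(size=(d, d))
w_v = rng.normal(size=(d, d))
out = cross_attention(dec_q, enc_out, w_k, w_v)
print(out.shape)  # (3, 8)
```

The point is that each decoder position mixes information from all source positions, which is why flash-attention kernels would need a cross-attention (non-causal, differing query/key lengths) variant in addition to the causal self-attention path.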