This project showcases a neural tokenization technique. Because the inputs are compressed into a smaller shape, the LLM can be downsized accordingly.
For example, llama3-8b is brought down from 8 billion parameters to 34 million.
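
A minimal sketch of the idea, assuming the neural tokenizer compresses fixed-size byte chunks into dense vectors so the downstream transformer sees a much shorter sequence. `ChunkCompressor`, the chunk size, and all dimensions below are illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch (not the project's real implementation): compress fixed-size
# byte chunks into dense vectors, shrinking the sequence the transformer processes,
# which in turn allows a much smaller backbone.
import torch
import torch.nn as nn

class ChunkCompressor(nn.Module):
    """Maps each chunk of `chunk_size` byte embeddings to a single dense vector."""
    def __init__(self, vocab_size=256, chunk_size=16, embed_dim=64, token_dim=256):
        super().__init__()
        self.chunk_size = chunk_size
        self.byte_embed = nn.Embedding(vocab_size, embed_dim)
        self.project = nn.Linear(chunk_size * embed_dim, token_dim)

    def forward(self, byte_ids):
        # byte_ids: (batch, seq_len) with seq_len divisible by chunk_size
        b, n = byte_ids.shape
        x = self.byte_embed(byte_ids)            # (b, n, embed_dim)
        x = x.view(b, n // self.chunk_size, -1)  # group bytes into chunks
        return self.project(x)                   # (b, n / chunk_size, token_dim)

# A transformer sized for the compressed sequence can be far smaller than one
# operating on raw byte or token positions.
compressor = ChunkCompressor()
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, dim_feedforward=512, batch_first=True),
    num_layers=4,
)
bytes_in = torch.randint(0, 256, (2, 128))  # 128 bytes -> 8 compressed positions
hidden = backbone(compressor(bytes_in))     # (2, 8, 256)
```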
Final model:
- pretraining: file / Google Colab
- fine-tuning: file / Google Colab
See TODO.
This project winks at llama3 from Meta, but doesn't actually use its weights or code.
Licensed under the AGPLv3.