Uses reinforcement learning to encourage roneneldan/TinyStories-33M to generate stories with alliteration.
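As a rough illustration of what an alliteration reward could look like, here is a minimal sketch. The function name `alliteration_score` and the scoring rule (the share of words that start with the most common initial letter) are assumptions made for this example, not necessarily what the training script in this repo does.

```python
import re
from collections import Counter


def alliteration_score(text: str) -> float:
    """Hypothetical reward: fraction of words sharing the most common first letter."""
    # Lowercase and keep alphabetic words (apostrophes allowed inside a word).
    words = re.findall(r"[a-z][a-z']*", text.lower())
    if not words:
        return 0.0
    # Count how often each initial letter occurs.
    initial_counts = Counter(word[0] for word in words)
    most_common_count = initial_counts.most_common(1)[0][1]
    return most_common_count / len(words)


if __name__ == "__main__":
    print(alliteration_score("Peter Piper picked peppers"))
    print(alliteration_score("The big dog ran"))
```

A score like this is cheap to compute per generated story, which is handy for RL, but it is also easy to game (for example by repeating one word), which is one reason a KL penalty against the original model is useful.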
Docs are here
If you install uv, it'll get the dependencies.
Backup plan: ./build.sh and ./run.sh will build and run a Docker container that has uv, in case your system is weird (like my NixOS laptop) and doesn't work with uv.
Once you're in the container, you can run the commands in the following sections.
uv run src/tiny_stories_rl/train.py
The KL penalty coefficient is configurable via --kl-coefficient; see here for more.
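For intuition about what the coefficient does, here is a minimal sketch of one common way a KL penalty enters the reward. It assumes the penalty is the summed per-token log-probability difference between the trained policy and the frozen reference model, scaled by the coefficient; the function and argument names are hypothetical, and the actual computation in train.py may differ.

```python
from typing import Sequence


def kl_penalized_reward(
    task_reward: float,
    policy_logprobs: Sequence[float],
    reference_logprobs: Sequence[float],
    kl_coefficient: float,
) -> float:
    """Hypothetical helper: subtract a scaled KL estimate from the task reward."""
    if len(policy_logprobs) != len(reference_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    # Summing log pi(token) - log pi_ref(token) over the sampled tokens gives
    # a single-sample estimate of KL(policy || reference) for this sequence.
    kl_estimate = sum(p - r for p, r in zip(policy_logprobs, reference_logprobs))
    return task_reward - kl_coefficient * kl_estimate


if __name__ == "__main__":
    # The policy assigns higher probability to its own tokens than the reference does,
    # so the estimate is positive and the reward is reduced.
    print(kl_penalized_reward(1.0, [-1.0, -2.0], [-1.5, -2.5], kl_coefficient=0.1))
```

Under this formulation, a larger coefficient keeps generations closer to the original model's style, while a smaller one lets the policy chase the alliteration reward more aggressively.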
uv run pytest tests