kaiokendev
🎯 Working. If you need help, send an email or DM.
Popular repositories
cutoff-len-is-context-len (Public)
Demonstration that fine-tuning a RoPE model on sequences longer than those used in pre-training adapts the model's context limit.
llama-rmt-test (Public)
Just checking this for one sec: https://arxiv.org/pdf/2304.11062.pdf
Python 3
flashattention2-custom-mask (Public)
Forked from alexzhang13/flashattention2-custom-mask
Triton implementation of FlashAttention2 that adds custom masks.
Python 2