Pull requests: karpathy/nanoGPT

Pull requests list

added fix to type comparison to enable fused AdamW
#569 opened Oct 30, 2024 by seanjudelyons
Adding NVIDIA hardware performance detection
#555 opened Sep 20, 2024 by fparisio
Add fire finetuning
#553 opened Sep 6, 2024 by gkielian Draft
Add support for 0 temperature
#546 opened Aug 19, 2024 by jmccrosky
Use weights_only for loading
#540 opened Aug 2, 2024 by kit1980
Update train.py for more efficiency
#538 opened Jul 19, 2024 by Jesseonmi
Add automatic detection of number of CPU cores
#530 opened Jun 27, 2024 by Jakobovski
fix val dataset size code comment
#528 opened Jun 24, 2024 by vhmth
sign descent seems to do better than adamw?
#488 opened Jun 1, 2024 by nullonesix
Refactor for easier configuration and overrides
#459 opened Mar 20, 2024 by ikeman32
Early stopping
#453 opened Mar 9, 2024 by derekehyatt
Implement ROPE positional encodings
#450 opened Mar 8, 2024 by devinbot
fix: estimate_mfu dt ZeroDivisionError
#446 opened Mar 2, 2024 by HildaM
Generalize encode/decode for datasets
#415 opened Jan 5, 2024 by GMNGeoffrey
Fix BUG when = in CLI value, like: --start="1+1="
#412 opened Jan 4, 2024 by DIYer22
Dockerfile using Nvidia Container Toolkit
#409 opened Dec 27, 2023 by niccolox
Azure deployment
#406 opened Dec 18, 2023 by lakaschus
Update transformer_sizing.ipynb
#402 opened Dec 10, 2023 by Cassini-chris
Update configurator.py
#394 opened Nov 20, 2023 by GeniusPlums