exp push: fails for >50MB commits #6181
Comments
If users are already using GitHub for versioning their pipeline, what is happening in these particular experiment commits that makes them go over the GitHub size limit (when the user's "regular" commits are not over the size limit)?
Could have intermediate per-checkpoint debug data (which the user doesn't want tracked by DVC or Git).
For reference, the issue here was discussed in yesterday's meeting:

When we generate a checkpoint commit, anything marked as a pipeline dependency or output that is also [...]

The problem with the large commits is most likely occurring when users have large, intermediate [...]

One thing to note here is that DVC will not do the forced tracking for files which are both [...]

If these intermediate files/dirs are properly gitignored, it would also stop DVC from generating these bloated checkpoint commits. I think we need to clearly document this behavior on both the DVC and CML sides.
Currently, does DVC or Git track every checkpoint commit? If so, a training run with a large number of epochs or iterations might generate a huge number of checkpoints, but in most cases we only need the latest checkpoint.
@karajan1001 yes, we track every iteration. Once the user decides they want to keep an experiment, they can choose to either keep all of the commits or just squash them all into a single commit (to only keep the final checkpoint).
On a related note, we may want to add an option to only keep the last N checkpoints. That may save disk space as well as sanity when doing [...]
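To make the "last N checkpoints" idea concrete, here is a minimal sketch of the selection logic only. It is not an existing DVC option or API: the function name `split_last_n` and the plain-string checkpoint identifiers are hypothetical, and the list is assumed to be ordered oldest first.

```python
from typing import List, Sequence, Tuple


def split_last_n(checkpoints: Sequence[str], n: int) -> Tuple[List[str], List[str]]:
    """Split checkpoints (oldest first) into (to_drop, to_keep).

    Hypothetical helper for illustration; DVC does not expose this function.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Index of the first checkpoint that survives; 0 if there are fewer than n.
    keep_from = max(len(checkpoints) - n, 0)
    return list(checkpoints[:keep_from]), list(checkpoints[keep_from:])
```

Anything returned in the first list would be a candidate for pruning, while the second list holds the N most recent checkpoints. How the dropped checkpoints would actually be removed from Git and the DVC cache is left open here.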
dvc exp push can fail since GitHub rejects commits >50MB in size. Perhaps use DVC cache instead for such cases?

Part of iterative/cml#560
/CC @pmrowla
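As a rough way to spot which files are bloating a workspace before pushing, something like the following sketch could help. It is not part of DVC or CML: the helper names are hypothetical, the 50MB figure is taken from this issue's title and assumed to mean 50 MiB, and the check looks at raw file sizes on disk rather than at what Git would actually transfer. It also does not honor .gitignore or skip .git.

```python
from pathlib import Path
from typing import List, Tuple

# Threshold taken from the issue title; assumed here to mean 50 MiB.
LIMIT_BYTES = 50 * 1024 * 1024


def files_by_size(root: str) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, relative_path) for every file under root, largest first."""
    base = Path(root)
    sizes = [
        (p.stat().st_size, p.relative_to(base).as_posix())
        for p in base.rglob("*")  # walks everything, including hidden directories
        if p.is_file()
    ]
    return sorted(sizes, reverse=True)


def exceeds_limit(root: str, limit: int = LIMIT_BYTES) -> bool:
    """True if the combined on-disk size of all files under root is above limit."""
    return sum(size for size, _ in files_by_size(root)) > limit
```

Pointing `files_by_size` at a directory of intermediate outputs lists the largest files first, which are the likeliest candidates for gitignoring or moving into the DVC cache.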