At the moment `dvc repro` is not tested and not guaranteed to run after changes. Tests mitigate this to a certain extent, but not fully. It would be ideal to have a way to run `dvc repro` on a test dataset to validate that it works before kicking off a full run. This is intended for the person developing a PR, as a sanity check similar to tests. Later on it could be added as a GitHub check, although a minor problem there is that `dvc pull` needs to run, which fetches >100GB.
One option we were discussing with @pdan93 is to add an environment variable `TEST=1` that dvc reads and uses as a param or flag in preprocess to create a small dataset, say 1K examples or fewer. This can be enabled or disabled locally and in GitHub, and it will not be enabled for the final run in the cloud (or wherever it's run).
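A rough sketch of how the preprocess side of this could look, in case it helps the discussion. Everything here is an assumption for illustration: the helper names `test_mode_enabled` and `limit_examples` are hypothetical, the cap of 1000 is just the "1K examples" mentioned above, and how the variable actually gets from dvc into the stage is left open.

```python
import os
from itertools import islice
from typing import Iterable, Iterator, Mapping, Optional, TypeVar

T = TypeVar("T")

# Hypothetical cap for test runs ("1K examples or fewer").
TEST_LIMIT = 1000


def test_mode_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when TEST=1 is set in the given environment."""
    env = os.environ if environ is None else environ
    return env.get("TEST", "0") == "1"


def limit_examples(
    examples: Iterable[T],
    environ: Optional[Mapping[str, str]] = None,
    limit: int = TEST_LIMIT,
) -> Iterator[T]:
    """Yield at most `limit` examples in test mode, otherwise all of them."""
    if test_mode_enabled(environ):
        return islice(examples, limit)
    return iter(examples)
```

Running the pipeline as `TEST=1 dvc repro` would then pass the variable through to the stage command's environment, assuming the preprocess stage calls something like `limit_examples` on its input. One caveat: dvc would not see an environment variable as a dependency, so switching between test and full mode would not by itself invalidate cached stage outputs.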
One approach that works well for neural nets, or for any model trained with SGD variants where training can be stopped early, is a `--dry-run` flag that stops training after the first batch. We have used this effectively in #160.
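For reference, a minimal sketch of what such a flag could look like in a training script. This is not the code from #160; the `train` and `parse_args` functions and the toy batches are hypothetical, and the `sum(batch)` line only stands in for a real training step.

```python
import argparse
from typing import Iterable, Optional, Sequence


def train(batches: Iterable[Sequence[float]], dry_run: bool = False) -> int:
    """Run the training loop and return the number of batches processed."""
    steps = 0
    for batch in batches:
        # Placeholder for the real forward/backward pass and optimizer step.
        _ = sum(batch)
        steps += 1
        if dry_run:
            # One batch is enough to show that the whole loop runs end to end.
            break
    return steps


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="stop training after the first batch",
    )
    return parser.parse_args(argv)


# Example: args = parse_args(["--dry-run"]); train(batches, dry_run=args.dry_run)
```

The flag only shortens training, so data loading and preprocessing would still run on the full dataset unless they are reduced separately.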