exp run: checks out data dependencies (destructive) [QA] #5593

Closed · jorgeorpinel opened this issue Mar 11, 2021 · 34 comments · Fixed by #5859
Labels: enhancement (Enhances DVC) · p1-important (Important, aka current backlog of things to do)

@jorgeorpinel (Contributor) commented Mar 11, 2021:

1. It produces a first unchanged exp

UPDATE: Addressed in #5600

$ git clone [email protected]:iterative/example-get-started.git
$ cd example-get-started
$ dvc pull

$ dvc exp run

This works even when there are no changes to the committed project version (HEAD). Below we can see there are no differences in metrics or params:

$ dvc exp show --no-pager
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━
┃ Experiment    ┃ Created      ┃ avg_prec ┃ roc_auc ┃ prepare.split ┃ prepare.seed ┃ featurize.max_features ┃ featurize.ngrams ┃ train.seed ┃ train.n_est ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━
│ workspace     │ -            │  0.60405 │  0.9608 │ 0.2           │ 20170428     │ 3000                   │ 2                │ 20170428   │ 100         │
│ master        │ Mar 01, 2021 │  0.60405 │  0.9608 │ 0.2           │ 20170428     │ 3000                   │ 2                │ 20170428   │ 100         │
│ └── exp-44136 │ 02:22 AM     │  0.60405 │  0.9608 │ 0.2           │ 20170428     │ 3000                   │ 2                │ 20170428   │ 100         │
└───────────────┴──────────────┴──────────┴─────────┴───────────────┴──────────────┴────────────────────────┴──────────────────┴────────────┴─────────────┴─

exp diff doesn't print anything.

Is there a use case for this? Otherwise I'd vote to block it.

2. It checks out data dependencies (destructive)

Continuing the previous CLI block:

...
$ truncate --size=20M data/data.xml  # This is a stage dep
$ dvc exp run
 33% Checkout|███████████ ...  # I can see this flash momentarily
...
ERROR: Reproduced experiment conflicts with existing experiment 'exp-44136'. To overwrite the existing experiment run:

Even when I changed a dependency, which could be the basis for my experiment, all the pipeline data was checked out again (undoing my manual changes), so this exp is the same as the previous one (1).

BTW if this was the first time I exp run, it would be easy to miss that the data I changed was restored silently in the process. I'd just see that the exp results are the same as in HEAD which would be misleading.

2.2 Does it really behave exactly like repro?

https://dvc.org/doc/command-reference/exp/run reads:

dvc exp run is equivalent to dvc repro for experiments. It has the same behavior when it comes to targets and stage execution

We also say:

Before using this command, you'll probably want to make modifications such as data and code updates...

But when I change an added data file and try exp run, it undoes those changes and reports that no stage has changed. repro re-adds it instead, and then runs any stage downstream.

3. Not all changes to "code" can be queued

Extracted to #5801

In the docs we say "Before running an experiment, you'll probably want to make modifications such as data and code updates, or hyperparameter tuning." Is this the intended behavior?

Because if we see dvc.yaml as code (and I think we do, since one of DVC's principles is pipeline codification), then this statement isn't completely true when it comes to queueing experiments. It works with regular experiments though, which use the workspace files (not a tmp copy), which makes me think this may be unintended (as we want all kinds of experiments to be consistent in behavior AFAIK).

Specifically, if you create (or modify) dvc.yaml between queued runs and then try to --run-all, you get errors. Example:

$ git init; dvc init
$ git add --all; git commit -m "`dvc -V`"
$ dvc stage add -n hi -o hi "echo hey > hi"
Creating 'dvc.yaml'
Adding stage 'hi' in 'dvc.yaml'
...
$ dvc exp run
... # works

$ dvc exp run --queue
Queued experiment '16c7340' for future execution.
$ dvc stage add -fn hi -o hello "echo hi > hello"
Modifying stage 'hi' in 'dvc.yaml'
...
$ dvc exp run --queue
Queued experiment '41791e2' for future execution.

$ dvc exp run --run-all
ERROR: 'dvc.yaml' does not exist
ERROR: 'dvc.yaml' does not exist
ERROR: Failed to reproduce experiment '41791e2'
ERROR: Failed to reproduce experiment '16c7340'
@pmrowla (Contributor) commented Mar 11, 2021:

This works even when there are no changes to the committed project version (HEAD). Below we can see there are no differences in metrics or params:

exp diff doesn't print anything.

This is a side effect of experiments being identified by an internal hash of the pipeline. We don't store these hashes for the parent commits (HEAD) so the first run will be treated as a unique experiment.

Storing it as a separate experiment doesn't really hurt anything; it's just an empty git commit, and no data or anything else is duplicated.

edit: as noted in #5595, for --temp runs in this case we do not create the duplicate experiment. workspace runs should be updated to use the same behavior
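
The hash-based identification described above can be illustrated with a small Python sketch. This is a toy illustration, not DVC's actual implementation: the hashing scheme, the 7-character truncation, and the `run` helper are all invented here just to show why an unchanged workspace still produces one "new" experiment and why a second identical run then conflicts.

```python
import hashlib

def exp_hash(params: dict, deps: dict) -> str:
    # Toy stand-in for DVC's internal experiment hash: a digest of the
    # pipeline's parameter values and dependency checksums.
    payload = repr(sorted(params.items())) + repr(sorted(deps.items()))
    return hashlib.sha256(payload.encode()).hexdigest()[:7]

# Hashes of experiments already stored for this baseline. Note that the
# hash of the parent commit (HEAD) itself is NOT pre-populated here, which
# is why the first run of an unchanged workspace is treated as unique.
stored = set()

def run(params: dict, deps: dict) -> str:
    h = exp_hash(params, deps)
    if h in stored:
        return f"ERROR: conflicts with existing experiment '{h}'"
    stored.add(h)
    return f"stored experiment '{h}'"

first = run({"train.n_est": 100}, {"data/data.xml": "a3c5e1"})   # succeeds
second = run({"train.n_est": 100}, {"data/data.xml": "a3c5e1"})  # conflicts
```

The first call succeeds even though nothing changed (matching point 1 of the issue), and the second identical call reproduces the "conflicts with existing experiment" error seen in point 2.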

Even when I changed a dependency, which could be the basis for my experiment, all the pipeline data was checked out again (undoing my manual changes), so this exp is the same as the previous one (1).

BTW if this was the first time I exp run, it would be easy to miss that the data I changed was restored silently in the process. I'd just see that the exp results are the same as in HEAD which would be misleading.

exp run always starts with a dvc checkout. Prior to the changes in #5586, dvc.lock was always reset to the HEAD state, so experiments would always checkout any dependencies as of HEAD (dropping workspace changes). After that PR is merged, deps will be checked out to their state in your workspace's lock files, so modifications to dependencies should be dvc commit'ed first (so the change is written to dvc.lock or .dvc files) before using exp run.
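
Under this post-#5586 behavior, a session that preserves a modified dependency might look like the following. This is a sketch of a hypothetical CLI session (not runnable outside a DVC project), continuing the example-get-started setup from the issue description; it assumes dvc commit writes the new dependency state to dvc.lock / .dvc files before exp run's initial checkout:

```shell
$ truncate --size=20M data/data.xml   # modify a DVC-tracked dependency
$ dvc commit                          # record the modified state in dvc.lock / .dvc files
$ dvc exp run                         # the initial checkout now restores the committed
                                      # (modified) state, so the change isn't lost
```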

This is needed for temp directory runs, since we have to dvc checkout the data dependencies in the temp directory. We could consider adjusting behavior so we don't checkout any dependencies in workspace runs, but I don't think we should have different behavior for workspace vs temp dir runs in this case.

@pmrowla (Contributor) commented Mar 11, 2021:

duplicate empty experiment issue (1) will be fixed in #5600

@jorgeorpinel (Author) commented Mar 11, 2021:

On 2.

deps will be checked out to their state in your workspace's lock files

OK cool, but it can still be destructive. Should we at least warn users when there are changes in dvc status to confirm that modifications to DVC-tracked file/dir deps will be lost?

Cc @dberenbaum

I don't think we should have different behavior for workspace vs temp dir runs

Good point, agreed. An alternative to keep both kinds of runs consistent would be to copy the workspace data to the tmp dirs... But having exps based on HEAD (not the workspace) makes more sense probably. I'd just add that warning.

@dberenbaum (Collaborator) commented:

Should we at least warn users when there are changes in dvc status to confirm that modifications to DVC-tracked file/dir deps will be lost?

This seems reasonable, or suggesting users either dvc commit to keep changes or dvc checkout to recover previous data.

exps based on HEAD

I don't think data dependencies need to be based on HEAD, they just need to be cached and in the dvc.lock file in the workspace so they can be checked out.

@pmrowla (Contributor) commented Mar 12, 2021:

Good point, agreed. An alternative to keep both kinds of runs consistent would be to copy the workspace data to the tmp dirs... But having exps based on HEAD (not the workspace) makes more sense probably. I'd just add that warning.

The decision to not directly copy data here was for supporting CI use cases and eventual remote (machine) execution. The expected workflow there will be that you need to dvc push data to a DVC remote, and then it will be dvc pull'ed (and checked out) when dvc exp run happens on your remote/CI machine.

We also currently avoid duplicating data for the local tempdir runs (assuming the user is using any non-copy cache link type), which would be another concern with copying workspace data into the tmp dirs.

@jorgeorpinel (Author) commented:

Hey guys, just FYI I don't have any more points here. Can we decide what's the ideal solution, if any, to 2 and 3 above? (3 is new, but connected to 2.)

I'd just add that warning.

Basically the only possible problem with keeping this behavior, even with the warning, is that exp run does not behave like repro. Maybe that's not a problem and we just need to update the docs about it?

@dberenbaum (Collaborator) commented:

modifications to dependencies should be dvc commit'ed first (so the change is written to dvc.lock or .dvc files) before using exp run

Thanks @jorgeorpinel. It's been a while since I thought about this issue. Coming back to it now, it's unclear to me if exp run should automatically do a dvc commit first to align with dvc repro. Thoughts?

@jorgeorpinel (Author) commented Apr 5, 2021:

I see what you mean @dberenbaum but repro doesn't git commit so it would still feel pretty different :)

deps will be checked out to their state in your workspace's lock files, so modifications to dependencies should be dvc commit'ed first

@pmrowla did explain how we got here ☝️ but why is that needed? It probably makes sense for --queue/temp runs but I still think regular exps shouldn't change dependencies in the workspace. It's kind of unexpected!

@dberenbaum (Collaborator) commented:

This is needed for temp directory runs, since we have to dvc checkout the data dependencies in the temp directory. We could consider adjusting behavior so we don't checkout any dependencies in workspace runs, but I don't think we should have different behavior for workspace vs temp dir runs in this case.

I agree with Peter that behavior shouldn't differ between workspace vs temp runs.

If we dvc commit whatever is in the workspace, it can be checked out in both workspace and temp runs without changing anything in the workspace.

@jorgeorpinel (Author) commented Apr 6, 2021:

Hmmm workspace runs vs queued, and esp. temp runs are already inherently different anyway. Is there a reason to seek that exact consistency? Because it seems we need to pick between that OR consistency with repro, which I find more appealing

Idk about exp run Git-committing the working tree changes, since experiments are meant to avoid users from polluting the repo with commits.

@pmrowla (Contributor) commented Apr 6, 2021:

It sounds like dvc committing the workspace state is the way to go here.

Idk about exp run Git-committing the working tree changes, since experiments are meant to avoid users from polluting the repo with commits.

We would only be dvc committing (so the workspace's current dependency states are written to the workspace's dvc.lock). After exp run the workspace dvc.lock would contain the unstaged changes (as a result of dvc commit), but nothing would be Git-committed from the user's perspective.

@pmrowla (Contributor) commented Apr 6, 2021:

Hmmm workspace runs vs queued, and esp. temp runs are already inherently different anyway. Is there a reason to seek that exact consistency? Because it seems we need to pick between that OR consistency with repro, which I find more appealing

I think from the user's perspective, workspace or temp runs should not be inherently different. This can even be extended to remote execution - if I run dvc exp run --set-param foo=1 I should get a consistent result whether I run it in my local workspace, in a local temp directory, or on some other machine over ssh.

@dberenbaum (Collaborator) commented:

@jorgeorpinel I'm not seeing the inconsistency with dvc repro since, as @pmrowla said, there would be no Git commits. Does that address your concern, or is there some other inconsistency I'm missing?

@jorgeorpinel (Author) commented Apr 7, 2021:

We would only be dvc committing

Oh OK, I see.

I think from the user's perspective, workspace or temp runs should not be inherently different.

@pmrowla What I mean is that it's evident that queued runs may need to change the data in the workspace first, to reflect what was present when they were queued (which we agree is an intuitive user expectation). What I don't see is a need for regular runs to do the same (I find that unintuitive).

And yes, a dvc commit may make it less unintuitive (or at least not-destructive) but still why not just behave like regular repro? What's the problem with queued runs having an extra step vs regular runs? They already have other extra steps (create tmp dir, etc).

I'm not seeing the inconsistency with dvc repro

@dberenbaum repro does not dvc commit nor dvc checkout before stage runs.

@jorgeorpinel (Author) commented Apr 7, 2021:

p.s. AFAIK committing data is meant as part of the tracking mechanism (at the end of add and repro). I don't think it makes sense to put it at the beginning too, that would repeat the operation in the case of exp run: you commit first, you repro, which commits again? And the cache ends up with some temp data you probably didn't mean to store.

@pmrowla (Contributor) commented Apr 8, 2021:

p.s. AFAIK committing data is meant as part of the tracking mechanism (at the end of add and repro). I don't think it makes sense to put it at the beginning too, that would repeat the operation in the case of exp run: you commit first, you repro, which commits again?

The important thing here is that we are committing two different states though.

  • The first dvc commit (at the start of exp run) is committing the state of any (modified) data dependencies in the user's workspace.
  • The second commit (at the end of repro) is committing the state of any outputs that were changed by repro. (But yes, it's true that for a normal repro, this commit does combine committing both dependency and output states to the DVC cache within a single operation.)

And the cache ends up with some temp data you probably didn't mean to store.

The "temp data" here is really "modified dependencies", and will end up being committed into cache anyways in the normal repro use case.

Also, by definition, for experiments everything has to be either git-committed or saved to cache in order for experiments to actually work. If we allowed the user to force behavior where we do not cache or git-commit certain objects within an experiment, the user would not be able to actually restore/apply that entire experiment state later on.


I think we are really just debating semantics here, and it seems like everyone agrees that for queued/temp dir runs, we should be doing:

  1. dvc commit - save the user's potentially modified dependency state for the queued experiment
  2. dvc checkout - checkout that dependency state in the temp directory before continuing with exp run

For workspace runs, the issue is currently that we only do dvc checkout, which essentially erases the state of any modified dependencies. Whether we address the issue by doing both dvc commit + dvc checkout (for consistency's sake as I have suggested), or by doing neither (and just remove the dvc checkout step entirely for workspace runs as @jorgeorpinel has suggested), the end result as far as what data ends up in the cache at the end of exp run will be the same: the modified dependency states and the resulting outputs based on those modified states will all still get committed into cache at the end of dvc repro (via exp run).


What's the problem with queued runs having an extra step vs regular runs? They already have other extra steps (create tmp dir, etc)

From an architecture perspective, IMO the actual execution (the steps taken to actually execute the user's pipeline) should be identical in all cases, whether it's in the user's workspace, on a completely different remote machine, or in a temporary directory. The only differences should be in setting up the environment for that execution. Meaning that in this case the only extra step should be creating the temp directory.

Basically, what it comes down to is that the way experiments were designed (with remote executors in mind) is this:

At the time we queue an experiment, we don't actually care whether we are going to run the experiment in the workspace or in a temp directory (or on some other remote machine). All we do is shove the requested experiment into our queue, to be executed at some point in the future (in any arbitrary machine/environment). Even for workspace runs, we do actually "queue" the experiment. Meaning that this step is intended to be consistent regardless of how (or more specifically where) we end up running that experiment. So in order to keep this consistency, we should just do dvc commit all the time when queuing.

At actual execution time, our executor does the following:

  1. Set up the environment (for workspace runs this is a no-op; for temp dir runs or remote machines this means creating our directory and then populating it via git)
  2. Pop the appropriate experiment out of our queue and run it (with dvc checkout + dvc repro).

This behavior is intended to be consistent everywhere, and it's important to remember that the dvc commit that we need for temp dirs has to be done at the time an experiment is queued (for potential execution in any environment).
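
The queue-then-execute design described above can be sketched in a few lines of Python. This is purely illustrative (the names and data structures are invented for this sketch, not DVC's code): queuing snapshots the dependency state (the dvc commit analog), and execution then "checks out" that snapshot identically whether the target environment is the workspace, a temp dir, or a remote machine.

```python
from collections import deque

exp_queue = deque()

def queue_experiment(workspace_deps: dict) -> None:
    # The "dvc commit at queue time" step: snapshot the (possibly modified)
    # dependency state so any executor can reproduce it later.
    exp_queue.append(dict(workspace_deps))

def execute_next(env: dict) -> dict:
    # Identical for workspace, tempdir, or remote runs: "checkout" the
    # queued snapshot into the execution environment, then repro would run.
    snapshot = exp_queue.popleft()
    env.clear()
    env.update(snapshot)
    return env

workspace = {"data/data.xml": "md5:modified-at-queue-time"}
queue_experiment(workspace)
workspace["data/data.xml"] = "md5:edited-later"  # later edits don't leak into the run
tempdir_env = {}
result = execute_next(tempdir_env)
```

The queued run sees the dependency state as it was at queue time, regardless of both later workspace edits and where `execute_next` actually runs.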

@jorgeorpinel (Author) commented Apr 8, 2021:

The "temp data" here is really "modified dependencies", and will end up being committed into cache anyways in the normal repro use case.

True.

it seems like everyone agrees that for queued/temp dir runs, we should be doing:... 1. dvc commit

I didn't necessarily agree that we should commit data in either case. Originally this was about checking out deps to match some lockfile (needed due to implementation for queued runs) which could be unexpected/destructive on regular exp runs.

But thinking about it, we do say that exps are based on the last Git commit, so exp runs are different from repro by definition (I should update the docs about that...). And for that to be true, DVC has to check out deps.

My only other idea is to consider running all exps as --temp (deprecate that option), so that changes in the workspace are left alone, while still checking out the lockfile.

Up to you guys, this is getting overcomplicated probably. But thanks for all the feedback!

@pmrowla (Contributor) commented Apr 8, 2021:

I didn't necessarily agree that we should commit data in either case. Originally this was about checking out deps to match some lockfile (needed due to implementation for queued runs) which could be unexpected/destructive on regular exp runs.

If your concern is that the user's dvc.lock would be modified because we will write to it on dvc commit, for queued/tempdir runs this change won't even be visible to the user. We restore the user's entire workspace state after queuing the experiment.

This is the same kind of change that is made when you use --set-param. On --set-param, DVC writes that actual param value to params.yaml (or whatever the appropriate params file is). We then create a queued experiment based on that modified working tree. After the changes are pushed into the experiment queue, the user's workspace is restored (so params.yaml will still appear to be completely unchanged to the user).

For queued runs, the user will never see any modified dvc.lock, .dvc file or params.yaml in their workspace.

For workspace runs, the only thing they will see is the same modified dvc.lock file that they would have after running dvc repro.


Again, I think we are confusing what will be an internal implementation detail with behavior that would affect the user. Doing this internal dvc commit everywhere will make exp run more consistent with repro in all cases. This change will not be visible to the user at all (other than fixing the unexpected behavior where dependency modifications were dropped prior to this change)

@pmrowla (Contributor) commented Apr 8, 2021:

One thing that should be documented as a difference between dvc repro and exp run is that --no-commit should be disallowed for exp run, since by definition everything has to be committed to cache in order for experiments to actually work properly.

This also needs to be reflected in the CLI, currently supplying the exp run --no-commit flag is technically allowed but it will likely cause undefined behavior.

@jorgeorpinel

@dberenbaum (Collaborator) commented:

I didn't necessarily agree that we should commit data in either case. Originally this was about checking out deps to match some lockfile (needed due to implementation for queued runs) which could be unexpected/destructive on regular exp runs.

@jorgeorpinel Do you agree that dvc should keep whatever changes are in the working tree at the time of dvc exp run [--queue]?

This is the intended behavior for both workspace and temp runs. You pointed out that it's not the current behavior because dvc-tracked deps are checked out to match the lockfile in the last commit, and I think we all agree that this is unexpected/destructive. The suggested implementation by @pmrowla is a way to achieve keeping all changes in the working tree and avoiding the unexpected behavior you noted.

@jorgeorpinel (Author) commented:

"Based on" here means that the experiment is the diff of the current working directory changes against HEAD.

Makes sense. I keep getting confused about that 🙂 It also means (internally) that the exp commit will have HEAD as its parent. I clarified the docs a bit in iterative/dvc.org@af0389b

--temp is just provided as a shortcut for --queue; --run-all (with a single experiment in the queue).

TBH I'm still not seeing how --temp can be useful in its current form. Maybe it should be hidden until you can run multiple ones or parallelize them, etc.?

@pmrowla (Contributor) commented Apr 11, 2021:

I don't really see a reason why it should be hidden? Being able to do a run in the background while still being able to continue modifying whatever I want in my workspace while that job is running seems useful to me.

@jorgeorpinel (Author) commented Apr 12, 2021:

I see. For very long experiments that makes sense indeed! OK I should document the use then... ⌛

Meanwhile, please notice I added a (new) third item to the OP (should I open a separate issue instead? This ticket is loong); Titled 3. Not all changes to "code" can be queued.

UPDATE: Mentioned about --temp in iterative/dvc.org@12ae65d. In fact I introduced the entire concept of "background runs" with this (see iterative/dvc.org/pull/2368).


@pmrowla (Contributor) commented Apr 12, 2021:

The issue with creating a new dvc.yaml file is the same as creating any new code file for experiments. In order for any new (completely untracked) files to be included in the queued experiment, you must at least stage them with git add (but they do not need to be committed).

It looks like this is not currently documented, see: #5029

  • Untracked files are now ignored when populating experiment executor. Only workspace changes to git-tracked files will be passed into the executor.
  • To explicitly pass untracked files (such as a new code classes/files) into the experiment, the file should be staged as a new file via git add <file> or git add --intent-to-add <file> before running the experiment.

For workspace runs, since the files are present in the workspace itself, this issue doesn't show up; but for temp directory runs, we only populate the directory with files which are tracked in either git or DVC, and untracked files are ignored.

The reason for this is that the alternative would be for us to explicitly include all of the untracked files in our experiment git commits. Doing so will likely result in us git-committing objects which do not belong in git (like large binary files, or things like venv dirs or __pycache__ dirs that the user forgot to git ignore), or files which the user was explicitly not tracking in DVC or git for a reason (like authentication credentials).
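
The git add requirement above can be checked with plain Git in a throwaway repo (the file name here is made up for illustration): an untracked file is invisible to the index, so it would not be copied into a temp directory run, while an intent-to-add entry is visible without committing any content.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "print('hi')" > new_stage.py      # completely untracked file
git ls-files                            # prints nothing: invisible to tempdir runs
git add --intent-to-add new_stage.py    # stage it without committing its content
git ls-files                            # now lists new_stage.py
```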


@jorgeorpinel (Author) commented Apr 12, 2021:

you must at least stage them with git add (but they do not need to be committed... looks like this is not currently documented

OK let me look into that ⌛ but so will queued experiments sometimes be different from regular experiments after all?

@dberenbaum (Collaborator) commented:

the issue w/modified dvc.yaml files is the same commit/checkout issue

But I thought we were only talking about dvc commit earlier, not git commit (see #5593 (comment)). The error in #3 is 'dvc.yaml' does not exist, which probably won't be helped by dvc commit.

Agree that this seems like a slightly different issue.

For 2 (data dependencies), it seems that the focus is on dependencies already tracked by dvc, where their hashes and other metadata are already tracked by Git in .dvc files. When a dependency is modified, the .dvc file won't be updated (nor will the dvc cache be populated with the modified data) until dvc commit is run. The behavior is the same regardless of whether the experiment is in the workspace or temp. In this case, I think users will always want to dvc commit dependency changes so that the experiment reflects the modified data, although maybe I'm missing some scenario.

For 3 (code/yaml changes), it seems that the focus is on untracked files. Files not tracked by Git will only be reflected in workspace experiments and not in temp experiments. Workspace experiments will use whatever is in the workspace, but temp experiments copy files to the temp dir by checking them out with Git. In this case, I think it's unclear what the expected behavior is, and @pmrowla noted some of the challenges:

Doing so will likely result in us git-committing objects which do not belong in git (like large binary files, or things like venv dirs or __pycache__ dirs that the user forgot to git ignore), or files which the user was explicitly not tracking in DVC or git for a reason (like authentication credentials).

Just a few days ago, someone on Discord was running into this problem when trying an experiment that relied on credentials in ignored files.

@jorgeorpinel (Author) commented Apr 12, 2021:

please notice I added a (new) third item to the OP (should I open a separate issue instead? This ticket is loong); Titled 3. Not all changes to "code" can be queued.

OK, actually I'm going to move this new conversation to a separate issue: #5801

@dberenbaum (Collaborator) commented:

What's the status on the remaining part of this issue (checking out data dependencies)? @jorgeorpinel do you still have questions, and are you okay with the dvc commit proposal?

@jorgeorpinel (Author) commented Apr 14, 2021:

No more Qs and yes! I agree with the proposed solution 👍
