exp run: checks out data dependencies (destructive) [QA] #5593
This is a side effect of experiments being identified by an internal hash of the pipeline. We don't store these hashes for the parent commits (`HEAD`), so the first run will be treated as a unique experiment.
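For reference, a minimal sequence that shows the effect (a sketch, assuming a committed DVC pipeline and a clean workspace):

```
git status        # clean: workspace matches HEAD
dvc exp run       # still recorded as a new experiment (issue 1)
dvc exp show      # the new exp row shows the same params/metrics as HEAD
```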
edit: as noted in #5595, for …
This is needed for temp directory runs, since we have to …
The duplicate empty experiment issue (1) will be fixed in #5600.
On 2.
OK cool, but it can still be destructive. Should we at least warn users when there are changes in the workspace? Cc @dberenbaum
Good point, agreed. An alternative to keep both kinds of runs consistent would be to copy the workspace data to the tmp dirs... But having exps based on `HEAD` …
This seems reasonable, or suggesting users either …
I don't think data dependencies need to be based on `HEAD`.
The decision to not directly copy data here was for supporting CI use cases and eventual remote (machine) execution. The expected workflow there will be that you need to `dvc pull` the data. We also currently avoid duplicating data for the local tempdir runs (assuming the user is using any non-`copy` cache link type).
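For example, a rough sketch of the CI-style workflow this enables (assuming a configured DVC remote), plus the link-type config that avoids copies for local tempdir runs:

```
dvc pull       # in CI, materialize tracked data from the DVC remote first
dvc exp run    # then execute the experiment

# local tempdir runs avoid duplicating data when the cache uses links:
dvc config cache.type "reflink,symlink,hardlink,copy"
```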
Hey guys, just FYI I don't have any more points here. Can we decide what's the ideal solution, if any, to 2. and 3. above? (3 is new, but connected to 2.)
Basically the only possible problem of keeping this behavior, even with the warning, is that …
Thanks @jorgeorpinel. It's been a while since I thought about this issue. Coming back to it now, it's unclear to me if …
I see what you mean, @dberenbaum, but …
@pmrowla did explain how we got here ☝️ but why is that needed? It probably makes sense for queued runs …
I agree with Peter that behavior shouldn't differ between workspace vs temp runs. If we `dvc commit` whatever is in the workspace, it can be checked out in both workspace and temp runs without changing anything in the workspace.
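A minimal sketch of that idea:

```
dvc commit     # snapshot whatever is currently in the workspace into the cache
dvc checkout   # restoring from cache is now a no-op in the workspace,
               # and reproduces the same data in a temp dir
```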
Hmmm workspace runs vs queued, and esp. temp runs are already inherently different anyway. Is there a reason to seek that exact consistency? Because it seems we need to pick between that OR consistency with `repro`. Idk about …
It sounds like …
We would only be …
I think from the user's perspective, workspace or temp runs should not be inherently different. This can even be extended to remote execution - if I run …
@jorgeorpinel I'm not seeing the inconsistency with `repro` …
Oh OK, I see.
@pmrowla What I mean is that it's evident that queued runs may need to change the data in the workspace first, to reflect what was present when they were queued (which we agree is an intuitive user expectation). What I don't see is a need for regular runs to do the same (I find that unintuitive). And yes, a …
@dberenbaum …
p.s. AFAIK committing data is meant as part of the tracking mechanism (at the end of `repro`) …
The important thing here is that we are committing two different states though.
The "temp data" here is really "modified dependencies", and will end up being committed into cache anyways in the normal repro use case. Also, by definition, for experiments everything has to be either git-committed or saved to cache in order for experiments to actually work. If we allowed the user to force behavior where we do not cache or git-commit certain objects within an experiment, the user would not be able to actually restore/apply that entire experiment state later on. I think we are really just debating semantics here, and it seems like everyone agrees that for queued/temp dir runs, we should be doing:
For workspace runs, the issue is currently that we only do …
From an architecture perspective, IMO the actual execution (the steps taken to actually execute the user's pipeline) should be identical in all cases, whether it's in the user's workspace, on a completely different remote machine, or in a temporary directory. The only differences should be in setting up the environment for that execution, meaning that in this case the only extra step should be creating the temp directory.

Basically, what it comes down to is that the way experiments were designed (with remote executors in mind) is this: at the time we queue an experiment, we don't actually care whether we are going to run the experiment in the workspace or in a temp directory (or on some other remote machine). All we do is shove the requested experiment into our queue, to be executed at some point in the future (on any arbitrary machine/environment). Even for workspace runs, we do actually "queue" the experiment, meaning that this step is intended to be consistent regardless of how (or more specifically where) we end up running that experiment. So in order to keep this consistency, we should just do …

At actual execution time, our executor does the following: …
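In shell terms, a simplified sketch of that queue-then-execute flow (a paraphrase, not the literal step list):

```
dvc exp run --queue     # queue time: snapshot the requested experiment state
                        # (identical for workspace, temp, and remote runs)
dvc exp run --run-all   # execution time: each executor checks out the queued
                        # state, runs the pipeline, and commits the results
```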
This behavior is intended to be consistent everywhere, and it's important to remember that the …
True.
I didn't necessarily agree that we should commit data in either case. Originally this was about checking out deps to match some lockfile (needed due to the implementation for queued runs), which could be unexpected/destructive on regular exp runs. But thinking about it, we do say that exps are based on the last Git commit, so exp runs are different to `repro` …

My only other idea is to consider running all exps as …

Up to you guys, this is probably getting overcomplicated. But thanks for all the feedback!
If your concern is that the user's `dvc.lock` will be modified: this is the same kind of change that is made when you use `dvc commit`.

For queued runs, the user will never see any modified `dvc.lock` … For workspace runs, the only thing they will see is the same modified `dvc.lock` …

Again, I think we are confusing what will be an internal implementation detail with behavior that would affect the user. Doing this internal commit …
One thing that should be documented as a difference between workspace and temp runs: … This also needs to be reflected in the CLI; currently supplying the …
@jorgeorpinel Do you agree that dvc should keep whatever changes are in the working tree at the time of `exp run`? This is the intended behavior for both workspace and temp runs. You pointed out that it's not the current behavior because dvc-tracked deps are checked out to match the lockfile in the last commit, and I think we all agree that this is unexpected/destructive. The suggested implementation by @pmrowla is a way to achieve keeping all changes in the working tree and avoiding the unexpected behavior you noted.
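In other words, the intended behavior would look like this (file name hypothetical):

```
echo "tweak" >> data/train.csv   # hand-modify a dvc-tracked dependency
dvc exp run                      # intended: the run uses and keeps this change,
                                 # rather than checking the dep back out from cache
```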
Makes sense. I keep getting confused about that 🙂 Also means (internally) that the exp commit will have …
TBH I'm still not seeing how `--temp` …
I don't really see a reason why it should be hidden? Being able to do a run in the background while still being able to continue modifying whatever I want in my workspace while that job is running seems useful to me.
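For example (one possible way to run it detached; adjust to taste):

```
nohup dvc exp run --temp > exp.log 2>&1 &   # run in a temp dir, backgrounded
# keep editing the workspace freely; the temp-dir run is unaffected
```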
I see. For very long experiments that makes sense indeed! OK, I should document the use case then... ⌛
UPDATE: Mentioned the `--temp` use case in the docs:
explain --temp use case per iterative/dvc#5593 (comment)
The issue with creating a new `dvc.yaml` … It looks like this is not currently documented, see: #5029
For workspace runs, since the files are present in the workspace itself, this issue doesn't show up; but for temp directory runs, we only populate the directory with files which are tracked in either git or DVC, and untracked files are ignored. The reason for this is that the alternative would be for us to explicitly include all of the untracked files in our experiment git commits. Doing so would likely result in us git-committing objects which do not belong in git (like large binary files, or things like venv dirs or …)
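One possible workaround sketch for a file an experiment needs but git doesn't track, assuming the file is small and safe to put under git control (file name hypothetical):

```
git add params_local.yaml   # put the untracked file under git's control
dvc exp run --temp          # now it is included in the temp-dir checkout
```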
OK let me look into that ⌛
Agree that this seems like a slightly different issue.

For 2 (data dependencies), it seems that the focus is on dependencies already tracked by dvc, where their hashes and other metadata are already tracked by Git in `dvc.lock` …

For 3 (code/yaml changes), it seems that the focus is on untracked files. Files not tracked by Git will only be reflected in workspace experiments and not in temp experiments. Workspace experiments will use whatever is in the workspace, but temp experiments copy files to the temp dir by checking them out with Git. In this case, I think it's unclear what the expected behavior is, and @pmrowla noted some of the challenges: …
Just a few days ago, someone on Discord was running into this problem when trying an experiment that relied on credentials in ignored files.
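A quick way to check what would (and would not) make it into a temp-dir run:

```
git ls-files            # git-tracked files: these reach temp-dir runs
git status --short      # untracked files ("??") will not
git status --ignored    # ignored files (e.g. credentials) will not either
```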
OK, actually I'm going to move this new conversation to a separate issue: #5801
What's the status on the remaining part of this issue (checking out data dependencies)? @jorgeorpinel do you still have questions, and are you okay with the proposed solution?
No more Qs and yes! I agree with the proposed solution 👍
1. It produces a first unchanged exp

UPDATE: Addressed in #5600

This works even when there are no changes to the committed project version (`HEAD`). Below we can see there are no differences in metrics or params: …

Is there a use case for this? Otherwise I'd vote to block it.
2. It checks out data dependencies (destructive)
Even when I changed a dependency, which could be the basis for my experiment, all the pipeline data was checked out again (undoing my manual changes), so this `exp` is the same as the previous one (1).

BTW if this was the first time I `exp run`, it would be easy to miss that the data I changed was restored silently in the process. I'd just see that the exp results are the same as in `HEAD`, which would be misleading.
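A sketch of that sequence (file name hypothetical):

```
echo "manual tweak" >> data/train.csv   # hand-edit a dvc-tracked dependency
dvc exp run                             # observed: the dep is checked out from
                                        # cache first, silently undoing the tweak
dvc exp show                            # exp looks identical to HEAD (misleading)
```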
2.2 Does it really behave exactly like `repro`?

https://dvc.org/doc/command-reference/exp/run reads: …
We also say: …
But when I change an `add`ed data file and try `exp run`, it undoes those changes and reports that no stage has changed. `repro` re-adds it instead, and then runs any stage downstream.

3. Not all changes to "code" can be queued

In the docs we say "Before running an experiment, you'll probably want to make modifications such as data and code updates, or hyperparameter tuning." Is this the intended behavior?
Because, if we see dvc.yaml as code — and I think we do, as one of DVC's principles is pipeline codification — then this statement isn't completely true when it comes to queueing experiments. It works with regular experiments though, which use the workspace files (not a tmp copy), which makes me think this may be unintended (as we want all kinds of experiments to be consistent in behavior AFAIK).
Specifically, if you create (or modify) dvc.yaml between queued runs and then try to `--run-all`, you get errors. Example: …
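Roughly the sequence that triggers it (stage details omitted):

```
dvc exp run --queue     # queue a run against the current dvc.yaml
# ...now create or modify dvc.yaml (e.g. add a new stage)...
dvc exp run --queue     # queue another run that expects the new stage
dvc exp run --run-all   # errors out, per this report
```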