Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repro -R, support dirs & stage files when running pull/push/repro… #2030

Merged
merged 2 commits into from
May 27, 2019

Conversation

prihoda
Copy link
Contributor

@prihoda prihoda commented May 21, 2019

Added support for dvc repro --recursive.
Added support for recursive mode to accept not just directories but also dvc files:
dvc pull/push/fetch/checkout/repro --recursive someFile.dvc someDir/
for convenience (following unix cp -r mv -r conventions that support both dirs and files).

DVC issue #1989
New dvc.org docs issue: iterative/dvc.org#367

pared
pared previously requested changes May 21, 2019
Copy link
Contributor

@pared pared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff @prihoda!
Two things thought:

  1. Would you mind writing test that repro --recursive actually gathers only stages inside targeted dir for recursive = True and isdir(target) = True?
  2. I think change of checkout behaviour could be handled in separate PR. It is just a observation to be considered in future requests.

@prihoda
Copy link
Contributor Author

prihoda commented May 22, 2019

I realized that there's actually an edge case with -R --single-item that will not behave the expected way (at least not deterministically).

Imagine you have the following DAG: dir/c -> dir/b -> anotherdir/a (c depends on b depends on a).
When anotherdir/a changes and we run dvc repro -R --single-item dir/, we would expect b to be reproduced and therefore c to be reproduced. However, the stages b and c are collected in order as provided by the filesystem, so it can happen that c is checked first and no changes are detected and only then b is reproduced on its own.

I will try to fix this by sorting the found stages in dfs_postorder_nodes order (this is only important in case of --single-item, otherwise the stages will be ordered properly anyway in _reproduce_stages).

@prihoda
Copy link
Contributor Author

prihoda commented May 23, 2019

@pared @MrOutis We are ready for review again ;)

@efiop efiop requested a review from pared May 23, 2019 13:56
@pared pared dismissed their stale review May 23, 2019 19:57

No longer applies

Copy link
Contributor

@pared pared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thorough testing!
Sorry for mentioning it only now, but recently we started to switch to pytest. Would you be willing to refactor it accordingly?

EDIT:
here is example of test refactoring:
#1953

@prihoda prihoda force-pushed the recursive-repro branch from 6cac8aa to e11c3cb Compare May 27, 2019 10:49
Copy link
Contributor

@efiop efiop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outstanding! 🥇 Thank you so much @prihoda !

@@ -35,7 +35,7 @@ def _reproduce_stage(stages, node, force, dry, interactive, no_commit):
def reproduce(
self,
target=None,
recursive=True,
single_item=False,
Copy link
Contributor

@efiop efiop May 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is breaking the backward compatibility, but it is fine since our API is not officially released yet and this change is indeed worthy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is just the internal naming, recursive argument was originally used to specify the default behaviour (recursive = not self.args.single_item), so the behavior stays the same. I just renamed the recursive method argument to single_item (and removed the negation) and added a new recursive argument that corresponds to the --recursive option.

@efiop efiop requested a review from pared May 27, 2019 15:26
Copy link
Contributor

@pared pared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great change!

@efiop efiop merged commit 29fd8e2 into iterative:master May 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants