repro -P seems to check status of dependencies too often #3644

Closed
charlesbaynham opened this issue Apr 17, 2020 · 2 comments
Labels
bug Did we break something?

Comments

@charlesbaynham
Contributor

DVC 0.91.0, Windows 10, pip

Description

Running dvc repro -P on a large graph seems to check all the early dependencies once for each output, even if they have already been checked.

In the following repo:

                        step1.dvc
                          |
         ----------------------------------------
         |            |            |            |           
 step2_A.dvc  step2_B.dvc  step2_C.dvc  step2_D.dvc

Each of the step 2 stages depends on step 1, and I think that calling dvc repro -P results in 4x checks of the outputs of step1.

I conclude this because, in my repo, I have several hundred stages in "step2" which all depend on a "step1" which is an import. When I run dvc repro -P, I get the

WARNING: DVC-file 'xxx.dvc' is locked. Its dependencies are not going to be reproduced.

message several hundred times.
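
To make the suspected behaviour concrete, here is a minimal toy model in Python. It is not DVC code: the `DEPS` dictionary and the `walk` and `naive_check_counts` helpers are hypothetical names made up for illustration, and the sketch only assumes that each final stage is walked independently down to its dependencies.

```python
from collections import Counter

# Toy dependency graph: each stage maps to the stages it depends on.
# Stage names mirror the diagram above; this is not DVC's data model.
DEPS = {
    "step1.dvc": [],
    "step2_A.dvc": ["step1.dvc"],
    "step2_B.dvc": ["step1.dvc"],
    "step2_C.dvc": ["step1.dvc"],
    "step2_D.dvc": ["step1.dvc"],
}


def walk(stage, deps):
    """Yield the dependencies of `stage` first, then `stage` itself."""
    for dep in deps[stage]:
        yield from walk(dep, deps)
    yield stage


def naive_check_counts(deps):
    """Count how often each stage is checked when every final stage
    (one that nothing else depends on) is walked independently."""
    depended_on = {d for ds in deps.values() for d in ds}
    targets = [s for s in deps if s not in depended_on]
    counts = Counter()
    for target in targets:
        counts.update(walk(target, deps))
    return counts


counts = naive_check_counts(DEPS)
print(counts["step1.dvc"])  # 4: once per step2 stage
```

With several hundred step2 stages, the same model would check step1 several hundred times, which would match the number of repeated warnings.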

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Apr 17, 2020
@charlesbaynham
Contributor Author

In Discord, @efiop suggested the problem might be in

    if pipeline or all_pipelines:
        if all_pipelines:
            pipelines = active_pipelines
        else:
            stage = Dvcfile(self, target).load()
            pipelines = [get_pipeline(active_pipelines, stage)]
        targets = []
        for pipeline in pipelines:
            for stage in pipeline:
                if pipeline.in_degree(stage) == 0:
                    targets.append(stage)
    else:

@efiop efiop added bug Did we break something? and removed triage Needs to be triaged labels Apr 20, 2020
efiop pushed a commit that referenced this issue Apr 21, 2020
* Only reproduce steps once

* Add test

* Fix linter nasties

* Got the order wrong

* Check for previous run

* Formatting

* Do the de-duplication in _reproduce_stages

Co-authored-by: Charles Baynham <[email protected]>
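
The commit message above mentions de-duplication in _reproduce_stages. The actual change is not shown in this thread, so the following is only a sketch of the general idea, assuming a simple "seen" set is enough. The `DEPS` graph and the `walk` and `reproduce_once` helpers are hypothetical and do not correspond to DVC's API.

```python
# Toy dependency graph: each stage maps to the stages it depends on.
# Hypothetical data for illustration; not DVC's data model.
DEPS = {
    "step1.dvc": [],
    "step2_A.dvc": ["step1.dvc"],
    "step2_B.dvc": ["step1.dvc"],
    "step2_C.dvc": ["step1.dvc"],
    "step2_D.dvc": ["step1.dvc"],
}


def walk(stage, deps):
    """Yield the dependencies of `stage` first, then `stage` itself."""
    for dep in deps[stage]:
        yield from walk(dep, deps)
    yield stage


def reproduce_once(targets, deps):
    """Return the stages to reproduce, dependencies first,
    with every stage appearing at most once."""
    seen = set()
    order = []
    for target in targets:
        for stage in walk(target, deps):
            if stage not in seen:
                seen.add(stage)
                order.append(stage)
    return order


targets = ["step2_A.dvc", "step2_B.dvc", "step2_C.dvc", "step2_D.dvc"]
order = reproduce_once(targets, DEPS)
print(order.count("step1.dvc"))  # 1
```

Keeping the first occurrence of each stage preserves the dependencies-first ordering, so step1 is still handled before any step2 stage.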
@efiop
Contributor

efiop commented Apr 21, 2020

Fixed by #3645. Kudos @charlesbaynham 🥇
