-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
repro: Move pull logic to inside Stage. #9434
Conversation
365648d
to
9c960d0
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #9434 +/- ##
=======================================
Coverage 91.60% 91.60%
=======================================
Files 489 489
Lines 38166 38167 +1
Branches 5469 5469
=======================================
+ Hits 34962 34963 +1
Misses 2639 2639
Partials 565 565
☔ View full report in Codecov by Sentry. |
Sorry, this is more about QA for #9395. I tried:
I think it shouldn't make the whole command fail if run cache isn't supported or fails to be pulled. |
I guess so. I think it is a problem of |
ping @skshetry does this makes sense for your comment in #9395 (comment) ? |
IMHO since |
dvc/stage/__init__.py
Outdated
for objs in self.get_used_objs().values(): | ||
self.repo.cloud.pull(objs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to pass odb here.
for objs in self.get_used_objs().values(): | |
self.repo.cloud.pull(objs) | |
for odb, objs in self.get_used_objs().items(): | |
self.repo.cloud.pull(objs, odb=odb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talk to Ruslan and Peter please about this on how to pull from cloud versioned remotes. Most likely this involves index, but make sure to discuss. With this change, cloud versioning won't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the reasoning for putting it inside here because of the rwlocks, but still missing the point of not using the high level pull API, as it would solve that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iterative/dvc could you reason what would be the downside of using self.repo.pull
inside a stage vs trying to use some low-level API?
In addition to making this work for cloud versioning here, I also didn´t find a good way of pulling dependencies as in #9437 (comment) without using stage.repo.pull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a large dependency that we are adding that is not meant for internal consumption. Repo.pull()
is equivalent to dvc pull
, so it's heavy and complex. And this one does that in a loop, for multiple stages.
Repo.pull(stage.addressing)
reloads dvc.yaml
/dvc.lock
file, finds proper objects/index, and downloads them.
Also, improving UI for dvc repro
/dvc exp run
may be difficult as it's now tied with dvc pull
. What we really need here is a way to pull for one stage.
I don't have much issue with this as temporary solution, but I could see it getting permanent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
What we really need here is a way to pull for one stage.
I don't have much issue with this as temporary solution, but I could see it getting permanent.
Ok. I will open an issue to revisit this. Probably worth waiting for the new fetch
self._sync_import(dry, force, kwargs.get("jobs", None), no_download) | ||
elif not self.frozen and self.cmd: | ||
self._run_stage(dry, force, **kwargs) | ||
else: | ||
elif kwargs.get("pull"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not this come before you run the stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for pulling outputs of unchanged stages.
The follow-up P.R. adds a new flag that enables pulling dependencies see #9437 (comment)
9c960d0
to
3577947
Compare
3577947
to
8ecd5ab
Compare
@daavoo Is it supposed to be fixed? Do you want me to create a follow-up issue? I still see the same error when trying to pull the run cache for an http remote. |
Sorry, no, It is not fixed. Please open issue |
Per #9395 (comment)