This issue was moved to a discussion.

run: deprecate in favor of stage add #5784

Closed
jorgeorpinel opened this issue Apr 8, 2021 · 13 comments
Labels
discussion requires active participation to reach a conclusion ui user interface / interaction

Comments

@jorgeorpinel
Contributor

jorgeorpinel commented Apr 8, 2021

What's the roadmap for phasing out dvc run? Is there another issue tracking this? Thanks

On the docs side, beyond just removing https://dvc.org/doc/command-reference/run, we'd need to review all the example code blocks using run... (iterative/dvc.org/issues/2076)

@jorgeorpinel jorgeorpinel added the question I have a question? label Apr 8, 2021
@dberenbaum
Collaborator

Are we even certain we want to do this? It seems like it's still useful as a shortcut for dvc stage add && dvc repro (or exp run instead of repro).

@shcheklein
Member

My 2cs on this: for me, run was a bit confusing because of its double semantics (create + actually run). We had a few discussions about changing this a long, long time ago. My personal preference would be to have dvc stage add and add an option, for example dvc stage add --run, if we need to actually execute it.

Btw, I think it's a shortcut for dvc stage add && dvc repro dvc.yaml:stage-name?
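To make the proposal above concrete, here is a minimal sketch of how a `stage add --run` flag could behave, using Python's standard `argparse`. Everything here is an assumption for illustration: the `toy-dvc` program name, the `plan` helper, and the `--run` flag itself are hypothetical, and this is not how DVC's CLI is implemented.

```python
import argparse


def build_parser():
    """Build a toy parser for `toy-dvc stage add` (hypothetical, not DVC's CLI)."""
    parser = argparse.ArgumentParser(prog="toy-dvc")
    commands = parser.add_subparsers(dest="command", required=True)
    stage = commands.add_parser("stage")
    stage_commands = stage.add_subparsers(dest="stage_command", required=True)
    add = stage_commands.add_parser("add")
    add.add_argument("-n", "--name", required=True, help="stage name")
    # Hypothetical flag from the discussion: execute the stage right after creating it.
    add.add_argument("--run", action="store_true")
    add.add_argument("cmd", nargs="+", help="command the stage executes")
    return parser


def plan(argv):
    """Return the list of steps the toy CLI would perform for the given arguments."""
    args = build_parser().parse_args(argv)
    steps = [f"write stage '{args.name}' to dvc.yaml"]
    if args.run:
        # Mirrors the "stage add && repro dvc.yaml:stage-name" reading of `dvc run`.
        steps.append(f"repro dvc.yaml:{args.name}")
    return steps


print(plan(["stage", "add", "-n", "train", "python", "train.py"]))
print(plan(["stage", "add", "-n", "train", "--run", "python", "train.py"]))
```

With this shape, creating a stage stays the default, and executing it becomes an explicit opt-in rather than a second command with overlapping semantics.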

@jorgeorpinel
Contributor Author

it's a shortcut for dvc stage add && dvc repro dvc.yaml:stage-name
add an option dvc stage add --run

Agree with Ivan on the UI if we're keeping this shortcut 🙂

@jorgeorpinel jorgeorpinel added discussion requires active participation to reach a conclusion ui user interface / interaction and removed question I have a question? labels Apr 10, 2021
@jorgeorpinel jorgeorpinel changed the title run: deprecate in favor of stage add? run: deprecate in favor of stage add Apr 10, 2021
@karajan1001
Contributor

dvc run is equivalent to dvc stage add plus dvc repro. But in:

dvc/dvc/stage/__init__.py

Lines 389 to 407 in 353e4cf

def reproduce(self, interactive=False, **kwargs):
    if not (kwargs.get("force", False) or self.changed()):
        if not isinstance(self, PipelineStage) and self.is_data_source:
            logger.info("'%s' didn't change, skipping", self.addressing)
        else:
            logger.info(
                "Stage '%s' didn't change, skipping", self.addressing
            )
        return None

    msg = (
        "Going to reproduce {stage}. "
        "Are you sure you want to continue?".format(stage=self)
    )

    if interactive and not prompt.confirm(msg):
        raise DvcException("reproduction aborted by the user")

    self.run(**kwargs)

stage.reproduce calls stage.run. Maybe we should rename them?
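The relationship between the two methods can be reduced to a small stand-alone sketch. `ToyStage` below is a hypothetical stand-in, not the real class in dvc/stage/__init__.py, and its return values are assumptions; it only mirrors the control flow quoted above: `reproduce` is the conditional wrapper and `run` is the unconditional execution.

```python
class ToyStage:
    """Hypothetical stand-in for illustration; not the real DVC Stage class."""

    def __init__(self, name, changed):
        self.name = name
        self._changed = changed
        self.run_count = 0

    def changed(self):
        return self._changed

    def run(self, **kwargs):
        # Unconditional: always executes the stage.
        self.run_count += 1
        self._changed = False

    def reproduce(self, **kwargs):
        # Conditional: skip unless the stage changed or force=True was passed.
        if not (kwargs.get("force", False) or self.changed()):
            return None
        self.run(**kwargs)
        return self


stage = ToyStage("train", changed=False)
print(stage.reproduce())            # skipped, so nothing ran
print(stage.run_count)
stage.reproduce(force=True)         # forced, so run() is called once
print(stage.run_count)
```

Seen this way, the naming question is whether "run" should mean "always execute" at the method level while `dvc run` means "create and execute" at the command level.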

@efiop
Contributor

efiop commented Apr 12, 2021

@karajan1001 More precisely dvc stage add is dvc run --no-exec. dvc repro will also walk the dependencies and reproduce them, so it is not quite the same.
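The distinction can be illustrated with a toy model. The `ToyPipeline` class and its method names below are hypothetical and deliberately simplified (for instance, the sketch ignores the "skip unchanged stages" check); it is not DVC's implementation. It only shows why executing one stage differs from reproducing it together with its upstream dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    """Toy model only; hypothetical names, not DVC's implementation."""

    deps: dict = field(default_factory=dict)      # stage name -> upstream stage names
    executed: list = field(default_factory=list)  # execution order, for inspection

    def stage_add(self, name, depends_on=()):
        # Record the definition without executing it (the `run --no-exec` reading).
        self.deps[name] = list(depends_on)

    def execute(self, name):
        # Execute exactly one stage and ignore everything upstream of it.
        self.executed.append(name)

    def repro(self, name, _seen=None):
        # Depth-first walk: execute upstream stages first, then the target.
        seen = set() if _seen is None else _seen
        if name in seen:
            return
        seen.add(name)
        for upstream in self.deps[name]:
            self.repro(upstream, seen)
        self.execute(name)


def build():
    pipeline = ToyPipeline()
    pipeline.stage_add("prepare")
    pipeline.stage_add("train", depends_on=["prepare"])
    pipeline.stage_add("evaluate", depends_on=["train"])
    return pipeline


only_target = build()
only_target.execute("evaluate")
print(only_target.executed)    # ['evaluate']

with_upstream = build()
with_upstream.repro("evaluate")
print(with_upstream.executed)  # ['prepare', 'train', 'evaluate']
```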

@jorgeorpinel Regarding the roadmap. We've introduced stage add for experiments, as run was odd for that and intended to keep dvc run/repro for now, to see if we still need them in particular scenarios or if experiments will actually replace them both together. For now it is not clear, but we will get back to it in the future.

Closing for now, since there are no action points here. Feel free to transfer into discussions if there is something left to discuss.

@efiop efiop closed this as completed Apr 12, 2021
@shcheklein
Member

So, what do we recommend by default? run or stage add?

@efiop
Contributor

efiop commented Apr 12, 2021

@shcheklein For old scenarios without experiments - run, for experiments or when you would otherwise use --no-exec - stage add.

@shcheklein
Member

Hmm, so what should we use in the Get Started, for example? When we build the pipeline we don't have experiments yet, but we'll have them in the very last section. Keep using dvc run? But then, to be honest, it's not clear where we want to use dvc stage add at all. I would prefer to have a single default in the docs and mention the second command only if we have a very clear case for it (and I don't see one for now). It will be very confusing to explain two different (but very similar) commands in the docs when we have different things related to pipelines.

@jorgeorpinel I think in docs we can discuss with @dberenbaum and default to one of those and mention in the second that this is a short cut pretty much. If we decide to keep both in DVC, let's use dvc run then for now (and mention --no-exec option when it makes sense) and in the dvc stage add just redirect to dvc run. My take - It'll be the least confusing thing for users.

WDYT?

@karajan1001
Contributor

karajan1001 commented Apr 13, 2021

dvc run is a shortcut for dvc stage add && dvc repro, just like git pull is for git fetch && git merge.
But git pull is a highly frequent command, while dvc run is only used once at the beginning.

@shcheklein
Member

thanks @karajan1001

while dvc run is only used once at the beginning.

So, does that mean we should get rid of it, in your opinion (no need for a shortcut for a rarely used combination)? Also, I think we can easily add a flag (dvc stage add --run) if we see that it's needed.

@shcheklein
Member

To add more to this:

@casperdcl mentioned during the meeting that we have this in the "official" blog post:

Note, we use dvc stage add command instead of dvc run. Starting from DVC 2.0 we begin extracting all stage specific functionality under dvc stage umbrella. dvc run is still working, but will be deprecated in the following major DVC version (most likely in 3.0).

It means that (at least from the docs perspective) it makes sense to get rid of dvc run everywhere, replace with dvc stage add - it'll be simpler and consistent.

(The same, btw, applies to dvc repro: in many cases it might make sense to start replacing it with dvc exp run.)

@casperdcl
Contributor

yes and as also mentioned... is dvc run && dvc repro identical to dvc stage add && dvc exp run && dvc exp gc?

@dberenbaum
Collaborator

I think it's fine to move towards this, but it seems like there are a few dependencies that need to be addressed first:

@efiop efiop reopened this Apr 15, 2021
@efiop efiop closed this as completed Apr 15, 2021
@iterative iterative locked and limited conversation to collaborators Apr 15, 2021


6 participants