-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: extract stage's relationship with Dvcfile out from Stage #3619
Conversation
e1d25d6
to
4ffb5cf
Compare
Travis is offline for maintenance at the moment. /cc @efiop |
dvc/repo/run.py
Outdated
|
||
stage = Stage.create(self, **kwargs) | ||
stage = Dvcfile.create_stage(self, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this feels weird (esp because of Dvcfile(self, stage.path).dump(stage)
down below). The multistage PR still looks strange https://github.com/iterative/dvc/pull/3584/files#r407061889 too . We could definitely do better.
dvc/dvcfile.py
Outdated
raise StageFileFormatError(fname, exc) | ||
|
||
@classmethod | ||
def create_stage(cls, repo, accompany_outs=False, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much the whole logic doesn't belong to dvcfile itself, but belongs to the Stage class. Seems like the only thing that Dvcfile
should care about here is the filename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. it should care about if os.path.exist
at the bottom. So, for example, it could be Dvcfile.create()
method, that would in turn call Stage.create()
with all that meat and then would check for already existing dvcfile with a path like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be somewhere in between of those two. Dvcfile
should care about filename, if it's a multistage dvc, if stage already exists with the given name, etc.
Rest can be taken care by Stage
, but it's not really linear at the time. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skshetry Yes, that was my point. Right now create_stage
does too much with the Stage. It should pass to Stage.create()
and deal with dvcfile file/existing dvcfile/etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a closer look. Looks like Dvcfile.create
should handle things like filename/wdir validation. E.g. things like _stage_name
and _check_stage_path
don't belong to Stage
anymore, they should be handled in Dvcfile
.
EDIT: though not quite :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a bit of refactor. Most of the things stays on Stage::create()
. Only overwriting stuffs have been pulled to Dvcfile
.
1ce9045
to
ff12d71
Compare
@@ -47,7 +47,6 @@ def run(self): | |||
no_exec=self.args.no_exec, | |||
overwrite=overwrite, | |||
ignore_build_cache=self.args.ignore_build_cache, | |||
remove_outs=self.args.remove_outs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this deprecated flag.
) | ||
|
||
@classmethod | ||
def is_stage_file(cls, path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of naming will be changing on upcoming PRs. For now, stage_file and dvcfile are essentially the same thing and should work.
@classmethod | ||
def check_dvc_filename(cls, path): | ||
if not cls.is_valid_filename(path): | ||
raise StageFileBadNameError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, exceptions will probably change wherever this does not make sense in future PRs.
We might've tried to bite more than we could chew here. Even though the arch is far from ideal yet, this PR does a lot of good refactorings, so let's merge it and move on, we'll introduce better interfaces this week. |
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.
❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
This is just a refactor, which extracts
Stage
's relationship with actual Dvcfile (with regards to parsing/dumping, validating schema, etc) to a separate abstraction (Dvcfile
).Now,
Stage
is just an in-memory representation that can work without in-relation to Dvcfile.Related to #1871