-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exp run: add experiment name check. #6848
Conversation
We should not support slashes in experiment names. Yes, the full ref can contain slashes, and we already use slashes in exp refs to organize them, but the user-readable (short) name should not contain slashes - the same as git branch or tag (short) names. |
Ah ok, that's my misunderstanding then, I didn't think cgit allowed you to nest anything in |
I will ban it in this PR, and then clarify in doc. |
Is it harder to handle nested refs in |
1. Add experiment name check (https://git-scm.com/docs/git-check-ref-format) 2. Add duplicate exp name check. 3. Add some unit test for it.
e75b4eb
to
b812d61
Compare
The |
There's nothing preventing us from allowing slashes in the exp refs, but once we start allowing them it means we cannot extend the current naming convention of Regarding user naming conventions, dots |
Co-authored-by: Peter Rowlands (변기호) <[email protected]>
Co-authored-by: Peter Rowlands (변기호) <[email protected]>
dvc/repo/experiments/__init__.py
Outdated
name = kwargs.pop("name", None) | ||
self._validate_ref_name(name, **kwargs) | ||
|
||
return self._stash_exp(*args, name=name, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks a bit strange to pop
a kwarg and then re-use it again in the following call (to _stash_exp
name = kwargs.pop("name", None) | |
self._validate_ref_name(name, **kwargs) | |
return self._stash_exp(*args, name=name, **kwargs) | |
self._validate_ref_name(kwargs.get("name"), **kwargs) | |
return self._stash_exp(*args, **kwargs) |
Alternatively, you could also just make name
default to None
in _validate_ref_name
and call it with **kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that kwargs
still contains name
argument and this would cause a duplication. If we want to use this method, we need to get name
, force
, reset
and pass them separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the simplest fix is to just do
def _validate_ref_name(self, name: Optional[str] = None, **kwargs):
and call it from here as
self._validate_ref_name(**kwargs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also change this function to be
def new(self, *args, checkpoint_resume: Optional[str] = None, name: Optional[str] = None, **kwargs):
and then keep the call you have as
self._validate_ref_name(name, **kwargs)
return self._stash_exp(*args, name=name, **kwargs)
The point is just that doing
value = kwargs.pop(key)
func(key=value, **kwargs)
isn't particularly clean code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the simplest fix is to just do
def _validate_ref_name(self, name: Optional[str] = None, **kwargs):
and call it from here as
self._validate_ref_name(**kwargs)
Yes, I also considered it, but I think it is more clear to call it explicitly.
@@ -473,6 +473,17 @@ def _log_reproduced(self, revs: Iterable[str], tmp_dir: bool = False): | |||
"\tdvc exp branch <exp> <branch>\n" | |||
) | |||
|
|||
def _validate_new_ref(self, exp_ref: ExpRefInfo): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used for validating a new reference, not only the name. Because we include duplication detect in it. And also the duplication detection is not to the name but to the whole reference.
Sounds good. I don't think there's a hard requirement for now at least, but helpful to have more context about the rationale. |
Looks good! One small request: can we remove "Reproduced" from dvc/dvc/repo/experiments/base.py Line 39 in c075921
|
|
This PR forbids the duplicate name and name with some illegal character (space, asterisk, etc), in my opinion, better to make an error msg if someone wants to use an illegal name. |
OK I created iterative/dvc.org#3065 then. |
* ref: state exp names shoold be unique rel. iterative/dvc#6848 * Restyled by prettier (#3066) Co-authored-by: Restyled.io <[email protected]> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <[email protected]>
* ref: state exp names shoold be unique rel. iterative/dvc#6848 * Restyled by prettier (#3066) Co-authored-by: Restyled.io <[email protected]> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <[email protected]>
fix #6674 and fix #6652
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
@pmrowla , in #6819, exp name with '/' inside are request. And these types of exp names are stored in a subdirectory. For example in #6843, it is stored in
.git/refs/heads/dependabot/pip/pylint-2.11.1
. The subdirectory here is used for grouping branches. Currently, ourdvc exp
did not support this. It involves not only name checking here, maybe we should reopen #6819 and create a new PR for it?