Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simpler param updates with python-benedict #4780

Merged
merged 1 commit into from
Nov 12, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 3 additions & 16 deletions dvc/repo/experiments/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -312,31 +312,18 @@ def _unpack_args(self, tree=None):

def _update_params(self, params: dict):
"""Update experiment params files with the specified values."""
from benedict import benedict
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you prefer local or global imports. Since this is only needed for the beta experiments feature, I figured local was appropriate.


from dvc.utils.serialize import MODIFIERS

logger.debug("Using experiment params '%s'", params)

# recursive dict update
def _update(dict_, other):
for key, value in other.items():
if isinstance(dict_, list) and key.isdigit():
key = int(key)
if isinstance(value, Mapping):
if isinstance(dict_, list):
fallback_value = dict_[key]
else:
fallback_value = dict_.get(key, {})
dict_[key] = _update(fallback_value, value)
else:
dict_[key] = value
return dict_

for params_fname in params:
path = PathInfo(self.exp_dvc.root_dir) / params_fname
suffix = path.suffix.lower()
modify_data = MODIFIERS[suffix]
with modify_data(path, tree=self.exp_dvc.tree) as data:
_update(data, params[params_fname])
benedict(data).merge(params[params_fname], overwrite=True)

# Force params file changes to be staged in git
# Otherwise in certain situations the changes to params file may be
Expand Down
3 changes: 1 addition & 2 deletions dvc/repo/experiments/run.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@ def _parse_params(path_params: Iterable):
from ruamel.yaml import YAMLError

from dvc.dependency.param import ParamsDependency
from dvc.utils.flatten import unflatten
from dvc.utils.serialize import loads_yaml

ret = {}
Expand All @@ -31,7 +30,7 @@ def _parse_params(path_params: Iterable):
)
if not path:
path = ParamsDependency.DEFAULT_PARAMS_FILE
ret[path] = unflatten(params)
ret[path] = params
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, this now returns a dict like:

{'path.to.key': yamlValue1, 'path[2]key': yamlValue2}

return ret


Expand Down
1 change: 1 addition & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@ def run(self):
"shtab>=1.3.2,<2",
"rich>=3.0.5",
"dictdiffer>=0.8.1",
"python-benedict>=0.21.1",
]


Expand Down
34 changes: 26 additions & 8 deletions tests/func/experiments/test_experiments.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,27 +66,45 @@ def test_update_with_pull(tmp_dir, scm, dvc, exp_stage, mocker):


@pytest.mark.parametrize(
"change, expected",
"changes, expected",
[
["foo.1.baz=3", "foo: [bar: 1, baz: 3]"],
["foo.0=bar", "foo: [bar, baz: 2]"],
["foo.1=- baz\n- goo", "foo: [bar: 1, [baz, goo]]"],
[["foo=baz"], "{foo: baz, goo: {bag: 3}, lorem: false}"],
[["foo=baz,goo=bar"], "{foo: baz, goo: bar, lorem: false}"],
[
["goo.bag=4"],
"{foo: [bar: 1, baz: 2], goo: {bag: 4}, lorem: false}",
],
[["foo[0]=bar"], "{foo: [bar, baz: 2], goo: {bag: 3}, lorem: false}"],
[
["foo[1].baz=3"],
"{foo: [bar: 1, baz: 3], goo: {bag: 3}, lorem: false}",
],
[
["foo[1]=- baz\n- goo"],
Copy link
Member

@skshetry skshetry Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This syntax is not easy to understand/use. At a first glance, I associated those =- together and assumed it was removing items from the list.

Ideally, I'd like to have the following syntax:

$ dvc exp run --params 'foo[1]=[baz, goo]' 'bar={a:2, b: 3}'

I understand that we split it up by comma, so we cannot achieve that. We should consider using a different delimiter, maybe even space or semicolon and make it look JSON-like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a natural progression to support appending and removing similar to Hydra's syntax (they use ~foo to remove and +foo=value to append).

Or, maybe even use Hydra later directly.

Copy link
Contributor Author

@sjawhar sjawhar Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I didn't change anything regarding the syntax. This is the syntax already accepted by the yaml parser being used, which is separate from this change. I'm simply documenting what currently works.

That said, I agree that it it would be nice to be able to specify arguments that include commas, and I think that your suggestion of simply changing --params to read a list instead of a single string would be the most straight-forward way of doing it.

"{foo: [bar: 1, [baz, goo]], goo: {bag: 3}, lorem: false}",
],
[
["lorem.ipsum=3"],
"{foo: [bar: 1, baz: 2], goo: {bag: 3}, lorem: {ipsum: 3}}",
],
],
)
def test_modify_list_param(tmp_dir, scm, dvc, mocker, change, expected):
def test_modify_params(tmp_dir, scm, dvc, mocker, changes, expected):
tmp_dir.gen("copy.py", COPY_SCRIPT)
tmp_dir.gen("params.yaml", "foo: [bar: 1, baz: 2]")
tmp_dir.gen(
"params.yaml", "{foo: [bar: 1, baz: 2], goo: {bag: 3}, lorem: false}"
)
stage = dvc.run(
cmd="python copy.py params.yaml metrics.yaml",
metrics_no_cache=["metrics.yaml"],
params=["foo"],
params=["foo", "goo", "lorem"],
name="copy-file",
)
scm.add(["dvc.yaml", "dvc.lock", "copy.py", "params.yaml", "metrics.yaml"])
scm.commit("init")

new_mock = mocker.spy(dvc.experiments, "new")
dvc.experiments.run(stage.addressing, params=[change])
dvc.experiments.run(stage.addressing, params=changes)

new_mock.assert_called_once()
assert (
Expand Down