Merging/overriding configs with same format in monai bundle run #5899

surajpaib · 2023-01-26T09:32:19Z

Is your feature request related to a problem? Please describe.
Currently, overriding config can be done through an args_file argument pointing to a json/yaml provided to the run script. However, it only merges configs at the base level, and to override nested structures, the # format must be used.

Example,

monai.bundle.run("", **{
    "args_file": "./config2.json",
    "config_file": "./config1.json", 
    
})

```
config1.json
```
{
    "trainer":
    {
        "max_epochs": 5

    }
}

```
config2.json
```
{
    "trainer#devices": 1
}

This gives the output,

[{'_meta_': {}, 'trainer': {'max_epochs': 5, 'devices': 1}}]

However, if the config provided to args_file is in the same format (shown below) as the config_file, it does not do the merge,

```
config2.json
```
{
    "trainer":
    {
        "devices": 1

    }
}

Describe the solution you'd like
It would be beneficial to have the ability to merge these two configs when they are in the same format to allow for use cases where a default config is set to have good parameter definitions, and the user only needs to define/override specific fields.

Describe alternatives you've considered
I tried specifying two of these configs as a list in this form as well,

monai.bundle.run("", **{
    "config_file": ["./config1.json", "./config2.json"], 
    
})

But it doesn't seem to handle overriding keys, although it merges non-conflicting keys.

My temporary fix
To achieve the functionality I needed, I've changed the following,

MONAI/monai/bundle/config_parser.py

Line 223 in 86c7ecd

def update(self, pairs: dict[str, Any]):

to

    def update(self, pairs: dict[str, Any]):
        """
        Set the ``id`` and the corresponding config content in pairs, see also :py:meth:`__setitem__`.
        For example, ``parser.update({"train#epoch": 100, "train#lr": 0.02})``

        Args:
            pairs: dictionary of `id` and config pairs.

        """
        for k, v in pairs.items():
            if isinstance(v, dict):
                self.update({f'{k}#{_k}': _v for _k, _v in v.items()})
                return 
                
            self[k] = v

Additional context
Ideally, it would be nice to have the ability to merge configs in the config_file list. Is there a way to do this, and am potentially missing something?

The text was updated successfully, but these errors were encountered:

Nic-Ma · 2023-01-27T01:40:41Z

Hi @surajpaib ,

Thanks for the feature request, we already have some recursive overriding for the args:
https://github.com/Project-MONAI/MONAI/blob/dev/monai/bundle/scripts.py#L52
Do you mean to also apply it in the config overriding?

Thanks.

surajpaib · 2023-01-27T01:45:31Z

@Nic-Ma Yes exactly.

Among one use case, we might want to have a base config defining a run for a particular dataset and have a separate config for a new dataset. This separate config could have several changes such as a larger batch size, different patch size, more workers etc. It would be inconvenient to define all of these as args to the command and becomes less readable when using the # syntax to define it in the config

Nic-Ma · 2023-01-27T14:40:56Z

Hi @surajpaib ,

If I understand correctly, for your use case, you can pass both the "initial config" and the "override config" to the "config_file" arg, it will automatically override it, refer to the example command in this README about how we run multi-gpu training: https://github.com/Project-MONAI/model-zoo/tree/dev/models/spleen_ct_segmentation.
It's also based on the "#" path for nested items, source code is: https://github.com/Project-MONAI/MONAI/blob/dev/monai/bundle/config_parser.py#L406.
The override arg in the "monai.bundle.run" was designed to quickly change config parameter in CLI command at runtime, not to support complicated (and in JSON file) overriding in the config.

Thanks.

surajpaib · 2023-01-30T02:17:20Z

Hi @Nic-Ma,

Yes, having both the configs in config_file is precisely how I want to override it, but it seems like I have to change my config structure to include # format.

Consider the following basic config,

{
    "model1": {
        "network": {
            "_target_": "UNet",
            "spatial_dims": 3,
            "in_channels": 1,
            "out_channels": 2,
            "channels": [
                16,
                32,
                64,
                128,
                256
            ],
            "strides": [
                2,
                2,
                2,
                2             
            ],
            "num_res_units": 2,
            "norm": "batch"
        },
        "loss": {
            "_target_": "DiceCELoss",
            "to_onehot_y": true,
            "softmax": true,
            "squared_pred": true,
            "batch": true
        },
        "optimizer": {
            "_target_": "torch.optim.Adam",
            "params": "[email protected]()",
            "lr": 0.0001
        }
    },
    "model2": {
        "network": {
            "_target_": "UNet",
            "spatial_dims": 3,
            "in_channels": 2,
            "out_channels": 1,
            "channels": [
                16,
                32,
                64,
                128,
                256
            ],
            "strides": [
                2,
                2,
                2,
                2             
            ],
            "num_res_units": 2,
            "norm": "batch"
        },
        "loss": {
            "_target_": "BCELoss",
        },
        "optimizer": {
            "_target_": "torch.optim.SGD",
            "params": "[email protected]()",
            "lr": 0.001
        }
    }
}

Overriding it now would be done something like this,

{
    "model1#network#channels": [
                16,
                32,
                64,
                128,
                512
            ],
    "model2#network#channels":  [
                16,
                32,
                64,
                128,
                512
            ],
    "model1#optimizer": {
        "_target_": "torch.optim.SGD",
        "params": "[email protected]()",
        "lr": 0.01
    } ,
    "model2#optimizer": {
        "_target_": "torch.optim.Adam",
        "params": "[email protected]()",
        "lr": 0.0001
    }
}

I think this makes it more difficult for the user to read through the configs and make changes, especially when the configs get large. model1 and model2 attributes can be defined separately here, one in the first line and one in the last, making less readable. If we allow configs in the standard json structure without the # tags, it would look like this,

{
    "model1": {
        "network": {
            "channels": [
                16,
                32,
                64,
                128,
                512
            ]
        },
        "optimizer": {
            "_target_": "torch.optim.SGD",
            "params": "[email protected]()",
            "lr": 0.001
        }
    },
    "model2": {
        "network": {
            "channels": [
                16,
                32,
                64,
                128,
                512
            ]
        },
        "optimizer": {
            "_target_": "torch.optim.Adam",
            "params": "[email protected]()",
            "lr": 0.0001
        }
    }
}

Very easy to go through and edit.
Another important use-case is I want to copy over an optimizer or training config from another monai bundle configuration and override it. I have to manually change every key to ensure it follows the # structure.

Do you think it makes sense to add this capability to allow for these use-cases? Thanks in advance!

Nic-Ma · 2023-01-30T02:38:23Z

Hi @surajpaib ,

Thanks for the detailed description of the fearure request.
I have 1 question about your proposal:
For example, in the current design of MONAI bundle config, if we define a model in the config A:

"network": {
    "_target_": "UNet",
    "spatial_dims": 3,
    "in_channels": 1,
    "out_channels": 2,
    "channels": [16, 32, 64, 128, 256],
    "strides": [2, 2, 2, 2],
    "num_res_units": 2,
    "norm": "batch"
}

There can be 2 kinds of cases to override it:
(1) Only override 1 parameter, and use other parameters as the original config, like:

"network#in_channels": 3

(2) Override the whole model definition with a new set of parameters, like:

"network": {
    "_target_": "UNet",
    "spatial_dims": 2,
    "in_channels": 3,
    "out_channels": 1,
    "channels": [16, 32, 64, 128, 256]
}

Then all the "network" config item will be overridden, no need to care about the previous parameters anymore.
So in summary, the "#" path can help to identify the "start position" to override.
How can we distinguish these 2 cases in your proposal?

Thanks in advance.

surajpaib · 2023-02-01T19:19:25Z

Hi @Nic-Ma ,

For cases (1) and (2), the format would be very similar indeed.
So for 1,

"network": {
        "in_channels": 3
}

However, only parameters that are defined should be overridden, both for cases (1) and (2). Having (1) and (2) consistent for the user with the logic that what is defined is overridden makes it easier to understand, in my opinion. Functionally, they both operate very similarly, so I wonder if there is a need to define a different syntax for them?

Nic-Ma · 2023-02-02T00:58:11Z

Hi @surajpaib ,

I still feel it's necessary to have "#" path to help define the "start point" of overriding in the nested structure, let me give you another example:
If we have the below transform chain list in the config:

{
  "train": {
      "transforms": [
          {
              "_target_": "LoadImaged",
              "keys": [
                  "image",
                  "label"
              ]
          },
          {
              "_target_": "EnsureChannelFirstd",
              "keys": [
                  "image",
                  "label"
              ]
          },
          {
              "_target_": "Orientationd",
              "keys": [
                  "image",
                  "label"
              ],
              "axcodes": "RAS"
          },
          {
              "_target_": "Spacingd",
              "keys": [
                  "image",
                  "label"
              ],
              "pixdim": [
                  1.5,
                  1.5,
                  2.0
              ],
              "mode": [
                  "bilinear",
                  "nearest"
              ]
          },
          {
              "_target_": "ScaleIntensityRanged",
              "keys": "image",
              "a_min": -57,
              "a_max": 164,
              "b_min": 0,
              "b_max": 1,
              "clip": true
          },
          {
              "_target_": "EnsureTyped",
              "keys": [
                  "image",
                  "label"
              ]
          },
          {
              "_target_": "RandCropByPosNegLabeld",
              "keys": [
                  "image",
                  "label"
              ],
              "label_key": "label",
              "spatial_size": [
                  96,
                  96,
                  96
              ],
              "pos": 1,
              "neg": 1,
              "num_samples": 4,
              "image_key": "image",
              "image_threshold": 0
          }
      ]
   }
}

If I only want to override the pixdim of Spacingd transform from "[1.5, 1.5, 2.0]" to "[1.5, 1.5, 1.5]", in the current design, we only need 1 line: "train#transforms#3#pixdim#2": 1.5
How to achieve it in your proposal?

Thanks in advance.

surajpaib · 2023-02-02T20:42:35Z

Hi @Nic-Ma,

What you pointed out makes perfect sense. There is no need to remove the current behavior, which is very useful for precisely the case you outlined above. But adding support to have it in a standard config format identical to the base format can also be very useful too.

I think the # format is far superior when it comes to overriding a few elements but the standard format becomes useful when overriding a large number of elements. Also very useful when we want to copy elements from different base configs and put together a new config.

Is there a possibility of supporting both?

Thanks!

Nic-Ma · 2023-02-03T02:34:30Z

Hi @wyli @ericspod ,

Based on the above discussion, do you think it's necessary to support both current "#" based overriding and this nested overriding requirement? I personally feel it would be better to keep the user experience simple only in 1 way.
Anyway, open for discussion.

Thanks.

ericspod · 2023-02-06T14:54:59Z

Would I summarise the issue as that overriding values in a second config file does work as needed if using the # operator and some nesting where needed, but that for large or deep replacements this can be unwieldy? The solution here is to allow a sparse structure to be defined which is merged with the original, replacing or adding values in places. My issue with that is this sparse structure looks a lot like any other that represents a full definition rather than an aspect of one.

Signed-off-by: Wenqi Li <[email protected]>

wyli · 2023-10-10T16:21:59Z

this feature could be easily added, so I'm including it f2ba40c

Fixes #6387 Fixes #5899 ### Description - add api for update_kwargs - add support of merging multiple configs files and dictionaries - remove warning message of directory in `runtests.sh` ### Types of changes  - [x] Non-breaking change (fix or new feature that would not break existing functionality). - [ ] Breaking change (fix or new feature that would cause existing functionality to change). - [ ] New tests added to cover the changes. - [ ] Integration tests passed locally by running `./runtests.sh -f -u --net --coverage`. - [ ] Quick tests passed locally by running `./runtests.sh --quick --unittests --disttests`. - [ ] In-line docstrings updated. - [ ] Documentation updated, tested `make html` command in the `docs/` folder. --------- Signed-off-by: Wenqi Li <[email protected]>

surajpaib · 2023-10-11T16:41:08Z

This is great! Thanks @wyli

wyli · 2023-10-11T17:19:19Z

for the record it's implemented at the args_file input via monai.bundle.update_kwargs:

given config2.json

{
    "trainer":
    {
        "devices": 1

    }
}

config1.json

{
    "trainer":
    {
        "max_epochs": 5

    }
}

python -m monai.bundle run trainer --args_file="['config1.json','config2.json']" --config_file="config1.json"
2023-10-11 18:04:08,097 - INFO - --- input summary of monai.bundle.scripts.run ---
2023-10-11 18:04:08,097 - INFO - > trainer: {'devices': 1, 'max_epochs': 5}
2023-10-11 18:04:08,097 - INFO - > config_file: 'config1.json'
2023-10-11 18:04:08,097 - INFO - > run_id: 'trainer'
2023-10-11 18:04:08,097 - INFO - ---

the merged config will be trainer: {'devices': 1, 'max_epochs': 5}

surajpaib changed the title ~~Merging/overriding config in monai bundle run~~ Merging/overriding configs with same format in monai bundle run Jan 26, 2023

surajpaib mentioned this issue Jan 26, 2023

Improved packaging and CI pipeline for lighter project-lighter/lighter#17

Merged

wyli added the Feature request label Jan 27, 2023

wyli added a commit to wyli/MONAI that referenced this issue Oct 10, 2023

update based on comments; fixes Project-MONAI#5899

f2ba40c

Signed-off-by: Wenqi Li <[email protected]>

wyli mentioned this issue Oct 10, 2023

6387 update_kwargs for merging multiple configs #7109

Merged

7 tasks

wyli closed this as completed in #7109 Oct 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merging/overriding configs with same format in monai bundle run #5899

Merging/overriding configs with same format in monai bundle run #5899

surajpaib commented Jan 26, 2023

Nic-Ma commented Jan 27, 2023

surajpaib commented Jan 27, 2023

Nic-Ma commented Jan 27, 2023

surajpaib commented Jan 30, 2023

Nic-Ma commented Jan 30, 2023

surajpaib commented Feb 1, 2023

Nic-Ma commented Feb 2, 2023

surajpaib commented Feb 2, 2023

Nic-Ma commented Feb 3, 2023

ericspod commented Feb 6, 2023

wyli commented Oct 10, 2023

surajpaib commented Oct 11, 2023

wyli commented Oct 11, 2023

Merging/overriding configs with same format in monai bundle run #5899

Merging/overriding configs with same format in monai bundle run #5899

Comments

surajpaib commented Jan 26, 2023

Nic-Ma commented Jan 27, 2023

surajpaib commented Jan 27, 2023

Nic-Ma commented Jan 27, 2023

surajpaib commented Jan 30, 2023

Nic-Ma commented Jan 30, 2023

surajpaib commented Feb 1, 2023

Nic-Ma commented Feb 2, 2023

surajpaib commented Feb 2, 2023

Nic-Ma commented Feb 3, 2023

ericspod commented Feb 6, 2023

wyli commented Oct 10, 2023

surajpaib commented Oct 11, 2023

wyli commented Oct 11, 2023