Configuration check failed :: No action for destination key to check its value #10859

Closed

quancs opened this issue Dec 1, 2021 · 11 comments

quancs (Member) commented Dec 1, 2021

🐛 Bug

Hi, I found a bug: when the default class set in the CLI is not the same as the class specified in the config file, an error is reported for keys that belong to the default class but not to the specified class.

[Screenshots in the original issue show the class specified in the config file and its __init__, and the default class specified in the CLI and its __init__; a minimal reproduction is given in a comment below.]

The command and error:

python -m models.NBSS --config configs/NBSS/NBSS-fit_FC2SA1.yaml fit --trainer.gpus=0,

usage: NBSS.py [-h] [--config CONFIG] [--print_config [={comments,skip_null}+]] {fit,validate,test,predict,tune} ...
NBSS.py: error: Parser key "model.arch": Problem with given class_path "models.arch.fc2_sa1.FC2_SA1":
  - 'Configuration check failed :: No action for destination key "dropout" to check its value.'

The reported key "dropout" is not a parameter of the class given in the config file; it is a parameter of the default class. That's not right.

Environment

  • PyTorch Lightning Version: 1.5.2 & 1.5.3
  • PyTorch Version: 1.8.1
  • Python version: 3.8
  • OS: Linux

cc @carmocca @mauvilsa

@quancs quancs added the bug label Dec 1, 2021
@awaelchli awaelchli added the lightningcli label Dec 2, 2021
carmocca (Contributor) commented Dec 3, 2021

For us to debug this, it would be best if you could create a minimal script or repository that reproduces the problem.

quancs (Member, Author) commented Dec 6, 2021

archs.py

class Arch:

    def __init__(self, a: int = 10) -> None:
        pass


class ArchA(Arch):

    def __init__(self, a: int = 10, c: int = 20) -> None:
        super().__init__(a)


class ArchB(Arch):

    def __init__(self, a: int = 10, b: int = 20) -> None:
        super().__init__(a)

boring.py

import torch
from torch.utils.data import DataLoader, Dataset

from pytorch_lightning import LightningDataModule, LightningModule, Trainer
from pytorch_lightning.utilities.cli import LightningArgumentParser, LightningCLI

from archs import *

class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

class BoringModel(LightningModule):

    def __init__(self, arch: Arch):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


class MyCLI(LightningCLI):

    def add_arguments_to_parser(self, parser: LightningArgumentParser) -> None:
        default_arch = {
            "class_path": "archs.ArchA",
            "init_args": {
                "a": "10",
                "c": "10"
            },
        }

        parser.set_defaults({
            "model.arch": default_arch,
        })

        return super().add_arguments_to_parser(parser)


class MyDataModule(LightningDataModule):

    def __init__(self, train_transforms=None, val_transforms=None, test_transforms=None, dims=None):
        super().__init__(train_transforms=train_transforms, val_transforms=val_transforms, test_transforms=test_transforms, dims=dims)

    def train_dataloader(self) -> DataLoader:
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def val_dataloader(self) -> DataLoader:
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def test_dataloader(self) -> DataLoader:
        return DataLoader(RandomDataset(32, 64), batch_size=2)


if __name__ == '__main__':
    cli = MyCLI(BoringModel, MyDataModule, seed_everything_default=None, save_config_overwrite=True)
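
As an aside, here is a lazy_instance-based way to set the class-type default (a sketch, assuming a jsonargparse version that provides lazy_instance, as referenced in the Lightning CLI docs; not verified to avoid this bug):

from jsonargparse import lazy_instance

from archs import ArchA


class MyCLI(LightningCLI):

    def add_arguments_to_parser(self, parser: LightningArgumentParser) -> None:
        # lazy_instance records the class and its init args without instantiating,
        # giving the parser a structured class_path/init_args default.
        parser.set_defaults({"model.arch": lazy_instance(ArchA, a=10, c=10)})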

a.yaml

fit:
  seed_everything: null
  trainer:
    logger: true
    checkpoint_callback: null
    enable_checkpointing: true
    callbacks: null
    default_root_dir: null
    gradient_clip_val: null
    gradient_clip_algorithm: null
    process_position: 0
    num_nodes: 1
    num_processes: 1
    devices: null
    gpus: null
    auto_select_gpus: false
    tpu_cores: null
    ipus: null
    log_gpu_memory: null
    progress_bar_refresh_rate: null
    enable_progress_bar: true
    overfit_batches: 0.0
    track_grad_norm: -1
    check_val_every_n_epoch: 1
    fast_dev_run: false
    accumulate_grad_batches: null
    max_epochs: null
    min_epochs: null
    max_steps: -1
    min_steps: null
    max_time: null
    limit_train_batches: 1.0
    limit_val_batches: 1.0
    limit_test_batches: 1.0
    limit_predict_batches: 1.0
    val_check_interval: 1.0
    flush_logs_every_n_steps: null
    log_every_n_steps: 50
    accelerator: null
    strategy: null
    sync_batchnorm: false
    precision: 32
    enable_model_summary: true
    weights_summary: top
    weights_save_path: null
    num_sanity_val_steps: 2
    resume_from_checkpoint: null
    profiler: null
    benchmark: false
    deterministic: false
    reload_dataloaders_every_n_epochs: 0
    reload_dataloaders_every_epoch: false
    auto_lr_find: false
    replace_sampler_ddp: true
    detect_anomaly: false
    auto_scale_batch_size: false
    prepare_data_per_node: null
    plugins: null
    amp_backend: native
    amp_level: null
    move_metrics_to_cpu: false
    multiple_trainloader_mode: max_size_cycle
    stochastic_weight_avg: false
    terminate_on_nan: null
  model:
    arch:
      class_path: archs.ArchB
      init_args:
        a: 10
        b: 10
  data:
    train_transforms: null
    val_transforms: null
    test_transforms: null
    dims: null
  ckpt_path: null

Train command:

python -m boring --config a.yaml fit

Error reported:

(base) ➜  NBSS_pmt git:(master) ✗ python -m boring --config a.yaml fit        
usage: boring.py [-h] [--config CONFIG] [--print_config [={comments,skip_null}+]] {fit,validate,test,predict,tune} ...
boring.py: error: Parser key "model.arch": Problem with given class_path "archs.ArchB":
  - 'Configuration check failed :: No action for destination key "c" to check its value.'

Here the key "c" in the error comes from archs.ArchA, which is the default arch. There should be no error, because I explicitly specified the class_path of model.arch as archs.ArchB.

quancs (Member, Author) commented Dec 6, 2021

Hi @carmocca, I've provided the boring model, the train command, and the error above. Please check it.

FeryET commented Dec 8, 2021

I am facing the same issue in my project too. I have something like this:

class Simple1DCNNConfig(ModelConfig):
    """The configuration for Simple1DCNN."""

    cls = Simple1DCNN
    input_shape: List[int]
    p_dropout: float = 0.5
    num_classes: int = 2
# config.yaml
model:
  model_config:
    class_path: src.model.pytorch_models.Simple1DCNNConfig
    init_args:
      input_shape: [64, 255]
      p_dropout: 0.5

And the error I'm getting is this:

 Parser key "model.model_config": Problem with given class_path "src.model.pytorch_models.Simple1DCNNConfig":
  - 'Configuration check failed :: No action for destination key "input_shape" to check its value.'

FeryET commented Dec 11, 2021

> I am facing the same issue in my project too. I have something like this: [...]

I found the solution to my problem. I had to decorate my config class with @dataclass, but somehow I forgot to. Without the decorator, no __init__ accepting those arguments is generated. With this change, the config file is parsed correctly.

from dataclasses import dataclass
from typing import List


@dataclass
class Simple1DCNNConfig(ModelConfig):
    """The configuration for Simple1DCNN."""

    cls = Simple1DCNN
    input_shape: List[int]
    p_dropout: float = 0.5
    num_classes: int = 2
# config.yaml
model:
  model_config:
    class_path: src.model.pytorch_models.Simple1DCNNConfig
    init_args:
      input_shape: [64, 255]
      p_dropout: 0.5
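
For context, a minimal sketch (hypothetical class names) of why the decorator matters: @dataclass generates an __init__ from the annotated fields, and that signature is what the parser introspects:

from dataclasses import dataclass


class NoInit:
    x: int = 1  # plain class attribute with an annotation; no __init__ parameter


@dataclass
class WithInit:
    x: int = 1  # @dataclass generates __init__(self, x: int = 1)


WithInit(x=2)  # ok
NoInit(x=2)    # TypeError: NoInit() takes no arguments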

carmocca (Contributor) commented:

cc @mauvilsa as I think it's a bug upstream

Repro without the CLI:

from unittest import mock

from jsonargparse import ArgumentParser


class Arch:
    def __init__(self, a: int = 1) -> None:
        pass


class ArchB(Arch):
    def __init__(self, a: int = 2, b: int = 3) -> None:
        print(f"ArchB, {a=}, {b=}")


class ArchC(Arch):
    def __init__(self, a: int = 4, c: int = 5) -> None:
        print(f"ArchC, {a=}, {c=}")


parser = ArgumentParser()
parser_subcommands = parser.add_subcommands()
subparser = ArgumentParser()
subparser.add_argument("--arch", type=Arch)
parser_subcommands.add_subcommand("fit", subparser)

default = {"class_path": "__main__.ArchB"}
subparser.set_defaults({"arch": default})


value = {"class_path": "__main__.ArchC", "init_args": {"a": "10", "c": "11"}}
with mock.patch("sys.argv", ["any.py", "fit", "--arch", str(value)]):
    args = parser.parse_args()  # 'Configuration check failed :: No action for destination key "b" to check its value.'

If you comment out the set_defaults call, remove the custom arch value from argv, or don't use subcommands, then it works as expected.

Version without subcommands:
from unittest import mock

from jsonargparse import ArgumentParser


class Arch:
    def __init__(self, a: int = 1) -> None:
        pass


class ArchB(Arch):
    def __init__(self, a: int = 2, b: int = 3) -> None:
        print(f"ArchB, {a=}, {b=}")


class ArchC(Arch):
    def __init__(self, a: int = 4, c: int = 5) -> None:
        print(f"ArchC, {a=}, {c=}")


parser = ArgumentParser()
parser.add_argument("--arch", type=Arch)

default = {"class_path": "__main__.ArchB"}
parser.set_defaults({"arch": default})

value = {"class_path": "__main__.ArchC", "init_args": {"a": "10", "c": "11"}}
with mock.patch("sys.argv", ["any.py", "--arch", str(value)]):
    args = parser.parse_args()
print(args)  # Namespace(arch=Namespace(class_path='__main__.ArchC', init_args=Namespace(a=10, c=11)))

@carmocca carmocca added the 3rd party label Dec 14, 2021
mauvilsa (Contributor) commented:

> I think it's a bug upstream

Yes, looks like a bug upstream. Will look at it.

quancs (Member, Author) commented Dec 28, 2021

And I found another problem similar to this one: specifying arguments of the profiler that are not parameters of BaseProfiler.

trainer:
  profiler:
    class_path: pytorch_lightning.profiler.PyTorchProfiler
    init_args:
      with_flops: true # PyTorchProfiler: **profiler_kwargs
  ...

The error reported:

boring.py: error: Parser key "trainer.profiler": Value "Namespace(class_path='pytorch_lightning.profiler.PyTorchProfiler', init_args=Namespace(dirpath=None, emit_nvtx=False, export_to_chrome=True, filename=None, group_by_input_shapes=False, record_functions=None, record_module_names=True, row_limit=20, sort_by_key=None, with_flops=True))" does not validate against any of the types in typing.Union[pytorch_lightning.profiler.base.BaseProfiler, str, NoneType]:
  - Problem with given class_path "pytorch_lightning.profiler.PyTorchProfiler":
    - 'Configuration check failed :: No action for destination key "with_flops" to check its value.'
  - Expected a <class 'str'> but got "Namespace(class_path='pytorch_lightning.profiler.PyTorchProfiler', init_args=Namespace(dirpath=None, emit_nvtx=False, export_to_chrome=True, filename=None, group_by_input_shapes=False, record_functions=None, record_module_names=True, row_limit=20, sort_by_key=None, with_flops=True))"
  - Expected a <class 'NoneType'> but got "Namespace(class_path='pytorch_lightning.profiler.PyTorchProfiler', init_args=Namespace(dirpath=None, emit_nvtx=False, export_to_chrome=True, filename=None, group_by_input_shapes=False, record_functions=None, record_module_names=True, row_limit=20, sort_by_key=None, with_flops=True))"

mauvilsa (Contributor) commented:

> And I found another problem similar to this one: specifying arguments of the profiler that are not parameters of BaseProfiler.
> ...

The cause for this is different and not a bug. This is explained in #8561 (comment).
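
To illustrate the distinction (a sketch based on my reading of that explanation, assuming PyTorch Lightning 1.5): the parser builds its actions from the explicit parameters of PyTorchProfiler.__init__, so keywords absorbed by **profiler_kwargs have no action to validate against:

import inspect

from pytorch_lightning.profiler import PyTorchProfiler

sig = inspect.signature(PyTorchProfiler.__init__)
# "export_to_chrome" is an explicit parameter, so the parser has an action for it.
print("export_to_chrome" in sig.parameters)  # True
# "with_flops" is only accepted through **profiler_kwargs, so no action exists.
print("with_flops" in sig.parameters)        # False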

mauvilsa (Contributor) commented:

I just released jsonargparse v4.1.1 which fixes the bug.
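
For anyone affected, upgrading should pick up the fix (the signatures extra follows what the Lightning CLI docs recommend installing):

pip install -U "jsonargparse[signatures]>=4.1.1"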

quancs (Member, Author) commented Jan 13, 2022

> I just released jsonargparse v4.1.1 which fixes the bug.

Great! ^^
