If I define a custom base LightningDataModule class and set subclass_mode_data=True in the LightningCLI module, following the instructions here and then provide the data configuration in a config file, I get an exit code 2 without any error message when the custom data module is defined in the same file as the CLI module.
Note that this only seems to happen if the config is in a subfolder.
Regarding the missing error message, there was a regression in recent versions of jsonargparse, see #12303. It has been fixed, but I haven't had the time to release it. I will try to do so today.
The bug here is in your main.py script and not in pytorch-lightning or in jsonargparse. The problem is that LightningCLI should be inside an if __name__ == '__main__': block, otherwise main.py is not an importable module. In the config you also need to change the class path to __main__.DummyDataModule.
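To illustrate why the guard matters, here is a minimal stdlib-only sketch. It does not use pytorch-lightning or jsonargparse: the `print` call is a hypothetical stand-in for `LightningCLI(...)`, and `import main` is a stand-in for the class-path import that happens during config parsing. The helper name `count_cli_starts` is made up for this example.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-ins for main.py: print() plays the role of LightningCLI(...),
# and "import main" plays the role of resolving a class path such as main.DummyDataModule.
UNGUARDED = textwrap.dedent('''
    print("cli started")
    import main
''')

GUARDED = textwrap.dedent('''
    def cli():
        print("cli started")
        import main

    if __name__ == "__main__":
        cli()
''')


def count_cli_starts(source):
    """Run source as main.py in a temporary directory and count CLI starts."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "main.py"
        script.write_text(source)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.count("cli started")


if __name__ == "__main__":
    print("unguarded:", count_cli_starts(UNGUARDED))
    print("guarded:", count_cli_starts(GUARDED))
```

Without the guard, importing `main` re-executes the top-level code, so the stand-in CLI starts a second time. With the guard, the re-import only defines `cli` and returns.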
What is happening is that you run main.py and LightningCLI starts parsing the config. Since config/base.yaml is in a different directory, the working directory temporarily changes to config/ so that relative paths in base.yaml, if any, work as expected. At some point there is an attempt to import the class path main.DummyDataModule. In Python the main script is imported as __main__, and Python is a bit dumb and does not consider main.* to be the same module. So main.py is imported once again, which leads to LightningCLI being executed again. At this point it fails because the path config/base.yaml is no longer valid due to the working directory being different. I am not sure what could be done to fix this, since it is just a weird behavior that Python has. The class path also needs to change because Python, being dumb, considers __main__.BaseDummyDataModule and main.BaseDummyDataModule to be different classes.
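The class identity problem described above can be reproduced with the standard library alone. The sketch below is only an illustration: `Base` and `Child` are hypothetical stand-ins for BaseDummyDataModule and DummyDataModule, and `run_demo` is a made-up helper that writes a small main.py to a temporary directory and runs it as a script.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical main.py: Base/Child stand in for BaseDummyDataModule/DummyDataModule.
SCRIPT = textwrap.dedent('''
    class Base:
        pass

    class Child(Base):
        pass

    if __name__ == "__main__":
        import main  # executes this file a second time, registered as "main"
        print(main.Child is Child)           # same class object?
        print(issubclass(main.Child, Base))  # subclass of __main__.Base?
''')


def run_demo():
    """Run SCRIPT as main.py and return the printed values as strings."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "main.py"
        script.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


if __name__ == "__main__":
    print(run_demo())
```

Both checks print `False`: `main.Child` derives from `main.Base`, which is a different object from `__main__.Base`, so a subclass check against the class defined in the running script fails.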
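To observe the behavior I explained, keep the config/base.yaml file and run a main.py script with:

import os
from jsonargparse.typing import Path_fr
from jsonargparse.util import change_to_path_dir

class MyClass:
    pass

# Printed once per execution of this file, showing the module name and working directory.
print(f'main.py imported with __name__={__name__} and cwd={os.getcwd()}')

# Temporarily switch to config/ and import this file again under the name "main".
with change_to_path_dir(Path_fr('config/base.yaml')):
    __import__('main', fromlist=['MyClass'])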
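The working directory part of the explanation can also be checked in isolation. This is a rough stdlib-only sketch: `change_dir` is a hypothetical stand-in for jsonargparse's `change_to_path_dir`, written here only to show why the relative path config/base.yaml stops resolving once the process is inside config/.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def change_dir(path):
    """Hypothetical stand-in: temporarily switch cwd to the directory containing path."""
    previous = os.getcwd()
    os.chdir(Path(path).resolve().parent)
    try:
        yield
    finally:
        os.chdir(previous)


def check_relative_path():
    """Return whether the config path resolves before, inside, and after the switch."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        os.chdir(root)
        try:
            Path("config").mkdir()
            Path("config/base.yaml").write_text("data: {}\n")
            before = Path("config/base.yaml").exists()
            with change_dir("config/base.yaml"):
                # cwd is now <root>/config, so the original relative path is stale.
                inside = Path("config/base.yaml").exists()
                sibling = Path("base.yaml").exists()
            after = Path("config/base.yaml").exists()
        finally:
            os.chdir(start)
    return before, inside, sibling, after


if __name__ == "__main__":
    print(check_relative_path())
```

Inside the context, only `base.yaml` resolves. A second CLI run started at that moment with the argument config/base.yaml would therefore not find its config file.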
> Regarding the no error message, there was a regression in recent versions of jsonargparse, see #12303. It has been fixed but haven't had the time to release it. Will try to do so today.
Makes sense. Thanks for the explanation.
> The bug here is in your main.py script and not in pytorch-lightning or in jsonargparse. The problem is that LightningCLI should be inside an if __name__ == '__main__': block, otherwise main.py is not an importable module. In the config you also need to change the class path to __main__.DummyDataModule.

That was my fault. I noticed this bug some time ago in a larger project where the CLI object was not exposed globally, but I forgot to do the same when I tried to replicate the behaviour with minimal code.
When not exposing the CLI object globally, I indeed get the correct and original error message:
main.py: error: Configuration check failed :: Parser key "data": "main.DummyDataModule" is not a subclass of BaseDummyDataModule
Following your advice, this can indeed be fixed by replacing main.DummyDataModule with __main__.DummyDataModule in the config file.
I guess this is just some non-intuitive behaviour that needs to be taken care of. From my side, this issue can be closed.
🐛 Bug
If I define a custom base LightningDataModule class and set subclass_mode_data=True in the LightningCLI module, following the instructions here and then provide the data configuration in a config file, I get an exit code 2 without any error message when the custom data module is defined in the same file as the CLI module.

Note that this only seems to happen if the config is in a subfolder.
To Reproduce
main.py:
config/base.yaml:
The training is executed with:
This crash can be resolved in the following ways:
- base.yaml in the root folder
- data.py
Expected behavior
Training runs without an error.
Environment
- GPU:
- available: False
- version: None
- numpy: 1.20.0
- pyTorch_debug: False
- pyTorch_version: 1.10.1+cpu
- pytorch-lightning: 1.5.10
- tqdm: 4.62.3
- OS: Linux
- architecture:
  - 64bit
  - ELF
- processor: x86_64
- python: 3.8.0b4
cc @carmocca @mauvilsa