diff --git a/docs/source-pytorch/cli/lightning_cli.rst b/docs/source-pytorch/cli/lightning_cli.rst index a0933b447ad31..5b0bf73754db3 100644 --- a/docs/source-pytorch/cli/lightning_cli.rst +++ b/docs/source-pytorch/cli/lightning_cli.rst @@ -2,9 +2,25 @@ .. _lightning-cli: -############################ -Eliminate config boilerplate -############################ +###################################### +Configure hyperparameters from the CLI +###################################### + +************* +Why use a CLI +************* + +When running deep learning experiments, there are a couple of good practices that are recommended to follow: + +- Separate configuration from source code +- Guarantee reproducibility of experiments + +Implementing a command line interface (CLI) makes it possible to execute an experiment from a shell terminal. By having +a CLI, there is a clear separation between the Python source code and what hyperparameters are used for a particular +experiment. If the CLI corresponds to a stable version of the code, reproducing an experiment can be achieved by +installing the same version of the code plus dependencies and running with the same configuration (CLI arguments). + +---- ********* Basic use @@ -26,7 +42,7 @@ Basic use :tag: intermediate .. displayitem:: - :header: 2: Mix models and datasets + :header: 2: Mix models, datasets and optimizers :description: Support multiple models, datasets, optimizers and learning rate schedulers :col_css: col-md-4 :button_link: lightning_cli_intermediate_2.html @@ -60,27 +76,38 @@ Advanced use .. displayitem:: :header: YAML for production :description: Use the Lightning CLI with YAMLs for production environments - :col_css: col-md-6 + :col_css: col-md-4 :button_link: lightning_cli_advanced_2.html :height: 150 :tag: advanced .. displayitem:: :header: Customize for complex projects - :description: Learn how to implement CLIs for complex projects. - :col_css: col-md-6 + :description: Learn how to implement CLIs for complex projects + :col_css: col-md-4 :button_link: lightning_cli_advanced_3.html :height: 150 - :tag: expert + :tag: advanced .. displayitem:: :header: Extend the Lightning CLI :description: Customize the Lightning CLI - :col_css: col-md-6 + :col_css: col-md-4 :button_link: lightning_cli_expert.html :height: 150 :tag: expert +---- + +************* +Miscellaneous +************* + +.. raw:: html + +
+
+ .. displayitem:: :header: FAQ :description: Frequently asked questions about working with the Lightning CLI and YAML files @@ -88,6 +115,13 @@ Advanced use :button_link: lightning_cli_faq.html :height: 150 +.. displayitem:: + :header: Legacy CLIs + :description: Documentation for the legacy argparse-based CLIs + :col_css: col-md-6 + :button_link: ../common/hyperparameters.html + :height: 150 + .. raw:: html
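For orientation, a minimal sketch of the pattern that the pages above document: a ``main.py`` that instantiates
:class:`~pytorch_lightning.cli.LightningCLI`. ``DemoModel`` and ``BoringDataModule`` are the demo classes used
throughout these pages; in a real project they would be your own ``LightningModule`` and ``LightningDataModule``.

.. code:: python

    # main.py
    from pytorch_lightning.cli import LightningCLI
    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


    def cli_main():
        # Exposes the Trainer, model and datamodule __init__ parameters as CLI
        # options and adds subcommands such as fit, validate, test and predict
        LightningCLI(DemoModel, BoringDataModule)


    if __name__ == "__main__":
        cli_main()

With such a script, an experiment is configured and run entirely from the shell, for example
``python main.py fit --model.learning_rate 0.1 --trainer.max_epochs 5``.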
diff --git a/docs/source-pytorch/cli/lightning_cli_advanced.rst b/docs/source-pytorch/cli/lightning_cli_advanced.rst index 2d4f3307e7f18..68458c351e798 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced.rst @@ -1,113 +1,168 @@ :orphan: -####################################### -Eliminate config boilerplate (Advanced) -####################################### +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# **Audience:** Users looking to modularize their code for a professional project. -**Pre-reqs:** You must have read :doc:`(Control it all from the CLI) `. +**Pre-reqs:** You must have read :doc:`(Mix models and datasets) `. + +As a project becomes more complex, the number of configurable options becomes very large, making it inconvenient to +control through individual command line arguments. To address this, CLIs implemented using +:class:`~pytorch_lightning.cli.LightningCLI` always support receiving input from configuration files. The default format +used for config files is YAML. + +.. tip:: + + If you are unfamiliar with YAML, it is recommended that you first read :ref:`what-is-a-yaml-config-file`. + ---- -*************************** -What is a yaml config file? -*************************** -A yaml is a standard configuration file that describes parameters for sections of a program. It is a common tool in engineering, and it has recently started to gain popularity in machine learning. +*********************** +Run using a config file +*********************** +To run the CLI using a yaml config, do: -.. code:: yaml +.. code:: bash - # file.yaml - car: - max_speed:100 - max_passengers:2 - plane: - fuel_capacity: 50 - class_3: - option_1: 'x' - option_2: 'y' + python main.py fit --config config.yaml + +Individual arguments can be given to override options in the config file: + +.. code:: bash + + python main.py fit --config config.yaml --trainer.max_epochs 100 ---- +************************ +Automatic save of config +************************ -********************* -Print the config used -********************* -Before or after you run a training routine, you can print the full training spec in yaml format using ``--print_config``: +To ease experiment reporting and reproducibility, by default ``LightningCLI`` automatically saves the full YAML +configuration in the log directory. After multiple fit runs with different hyperparameters, each one will have in its +respective log directory a ``config.yaml`` file. These files can be used to trivially reproduce an experiment, e.g.: + +.. code:: bash + + python main.py fit --config lightning_logs/version_7/config.yaml + +The automatic saving of the config is done by the special callback :class:`~pytorch_lightning.cli.SaveConfigCallback`. +This callback is automatically added to the ``Trainer``. To disable the save of the config, instantiate ``LightningCLI`` +with ``save_config_callback=None``. + +---- + +********************************* +Prepare a config file for the CLI +********************************* +The ``--help`` option of the CLIs can be used to learn which configuration options are available and how to use them. +However, writing a config from scratch can be time-consuming and error-prone. To alleviate this, the CLIs have the +``--print_config`` argument, which prints to stdout the configuration without running the command. 
+ +For a CLI implemented as ``LightningCLI(DemoModel, BoringDataModule)``, executing: .. code:: bash python main.py fit --print_config -which generates the following config: +generates a config with all default values like the following: .. code:: bash seed_everything: null trainer: - logger: true - ... - terminate_on_nan: null + logger: true + ... model: - out_dim: 10 - learning_rate: 0.02 + out_dim: 10 + learning_rate: 0.02 data: - data_dir: ./ + data_dir: ./ ckpt_path: null ----- - -******************************** -Write a config yaml from the CLI -******************************** -To have a copy of the configuration that produced this model, save a *yaml* file from the *--print_config* outputs: +Other command line arguments can be given and considered in the printed configuration. A use case for this is CLIs that +accept multiple models. By default, no model is selected, meaning the printed config will not include model settings. To +get a config with the default values of a particular model would be: .. code:: bash - python main.py fit --model.learning_rate 0.001 --print_config > config.yaml + python main.py fit --model DemoModel --print_config ----- - -********************** -Run from a single yaml -********************** -To run from a yaml, pass a yaml produced with ``--print_config`` to the ``--config`` argument: +which generates a config like: .. code:: bash - python main.py fit --config config.yaml + seed_everything: null + trainer: + ... + model: + class_path: pytorch_lightning.demos.boring_classes.DemoModel + init_args: + out_dim: 10 + learning_rate: 0.02 + ckpt_path: null -when using a yaml to run, you can still pass in inline arguments +.. tip:: -.. code:: bash + A standard procedure to run experiments can be: - python main.py fit --config config.yaml --trainer.max_epochs 100 + .. code:: bash + + # Print a configuration to have as reference + python main.py fit --print_config > config.yaml + # Modify the config to your liking - you can remove all default arguments + nano config.yaml + # Fit your model using the edited configuration + python main.py fit --config config.yaml ---- -****************** -Compose yaml files -****************** -For production or complex research projects it's advisable to have each object in its own config file. To compose all the configs, pass them all inline: +******************** +Compose config files +******************** +Multiple config files can be provided, and they will be parsed sequentially. Let's say we have two configs with common +settings: + +.. code:: yaml + + # config_1.yaml + trainer: + num_epochs: 10 + ... + + # config_2.yaml + trainer: + num_epochs: 20 + ... + +The value from the last config will be used, ``num_epochs = 20`` in this case: .. code-block:: bash - $ python trainer.py fit --config trainer.yaml --config datamodules.yaml --config models.yaml ... + $ python main.py fit --config config_1.yaml --config config_2.yaml -The configs will be parsed sequentially. Let's say we have two configs with the same args: +---- + +********************* +Use groups of options +********************* +Groups of options can also be given as independent config files. For configs like: .. code:: yaml # trainer.yaml - trainer: - num_epochs: 10 + num_epochs: 10 + # model.yaml + out_dim: 7 - # trainer_2.yaml - trainer: - num_epochs: 20 + # data.yaml + data_dir: ./data -the ones from the last config will be used (num_epochs = 20) in this case: +a fit command can be run as: .. 
code-block:: bash - $ python trainer.py fit --config trainer.yaml --config trainer_2.yaml + $ python main.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...] diff --git a/docs/source-pytorch/cli/lightning_cli_advanced_2.rst b/docs/source-pytorch/cli/lightning_cli_advanced_2.rst index e54f82edaa1b5..e5b3cd6a08929 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced_2.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced_2.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -42,13 +42,13 @@ mock_argv.stop() -####################################### -Eliminate config boilerplate (Advanced) -####################################### +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# -****************************** -Customize arguments by command -****************************** +********************************* +Customize arguments by subcommand +********************************* To customize arguments by subcommand, pass the config *before* the subcommand: .. code-block:: bash @@ -56,11 +56,12 @@ To customize arguments by subcommand, pass the config *before* the subcommand: $ python main.py [before] [subcommand] [after] $ python main.py ... fit ... -For example, here we set the Trainer argument [max_steps = 100] for the full training routine and [max_steps = 10] for testing: +For example, here we set the Trainer argument [max_steps = 100] for the full training routine and [max_steps = 10] for +testing: .. code-block:: bash - # config1.yaml + # config.yaml fit: trainer: max_steps: 100 @@ -73,21 +74,10 @@ now you can toggle this behavior by subcommand: .. code-block:: bash # full routine with max_steps = 100 - $ python main.py --config config1.yaml fit + $ python main.py --config config.yaml fit # test only with max_epochs = 10 - $ python main.py --config config1.yaml test - ----- - -********************* -Use groups of options -********************* -Groups of options can also be given as independent config files: - -.. code-block:: bash - - $ python trainer.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...] + $ python main.py --config config.yaml test ---- @@ -98,7 +88,7 @@ For certain enterprise workloads, Lightning CLI supports running from hosted con .. code-block:: bash - $ python trainer.py [subcommand] --config s3://bucket/config.yaml + $ python main.py [subcommand] --config s3://bucket/config.yaml For more options, refer to :doc:`Remote filesystems <../common/remote_fs>`. @@ -107,22 +97,23 @@ For more options, refer to :doc:`Remote filesystems <../common/remote_fs>`. ************************************** Use a config via environment variables ************************************** -For certain CI/CD systems, it's useful to pass in config files as environment variables: +For certain CI/CD systems, it's useful to pass in raw yaml config as environment variables: .. code-block:: bash - $ python trainer.py fit --trainer "$TRAINER_CONFIG" --model "$MODEL_CONFIG" [...] 
+ $ python main.py fit --trainer "$TRAINER_CONFIG" --model "$MODEL_CONFIG" [...] ---- *************************************** Run from environment variables directly *************************************** -The Lightning CLI can convert every possible CLI flag into an environment variable. To enable this, set the *env_parse* argument: +The Lightning CLI can convert every possible CLI flag into an environment variable. To enable this, set the *env_parse* +argument: .. code:: python - LightningCLI(env_parse=True) + cli = LightningCLI(DemoModel, env_parse=True) now use the ``--help`` CLI flag with any subcommand: @@ -130,22 +121,20 @@ now use the ``--help`` CLI flag with any subcommand: $ python main.py fit --help -which will show you ALL possible environment variables you can now set: +which will show you ALL possible environment variables that can be set: .. code:: bash usage: main.py [options] fit [-h] [-c CONFIG] - [--trainer.max_epochs MAX_EPOCHS] [--trainer.min_epochs MIN_EPOCHS] - [--trainer.max_steps MAX_STEPS] [--trainer.min_steps MIN_STEPS] ... - [--ckpt_path CKPT_PATH] optional arguments: ... - --model CONFIG Path to a configuration file. - --model.out_dim OUT_DIM + ARG: --model.out_dim OUT_DIM + ENV: PL_FIT__MODEL__OUT_DIM (type: int, default: 10) - --model.learning_rate LEARNING_RATE + ARG: --model.learning_rate LEARNING_RATE + ENV: PL_FIT__MODEL__LEARNING_RATE (type: float, default: 0.02) now you can customize the behavior via environment variables: @@ -153,8 +142,8 @@ now you can customize the behavior via environment variables: .. code:: bash # set the options via env vars - $ export LEARNING_RATE=0.01 - $ export OUT_DIM=5 + $ export PL_FIT__MODEL__LEARNING_RATE=0.01 + $ export PL_FIT__MODEL__OUT_DIM=5 $ python main.py fit @@ -175,16 +164,12 @@ or if you want defaults per subcommand: cli = LightningCLI(MyModel, MyDataModule, parser_kwargs={"fit": {"default_config_files": ["my_fit_defaults.yaml"]}}) -For more configuration options, refer to the `ArgumentParser API -`_ documentation. - ---- ***************************** Enable variable interpolation ***************************** -In certain cases where multiple configs need to share variables, consider using variable interpolation. Variable interpolation -allows you to add variables to your yaml configs like so: +In certain cases where multiple settings need to share a value, consider using variable interpolation. For instance: .. code-block:: yaml @@ -211,3 +196,14 @@ After this, the CLI will automatically perform interpolation in yaml files: .. code:: bash python main.py --model.encoder_layers=12 + +For more details about the interpolation support and its limitations, have a look at the `jsonargparse +`__ and the `omegaconf +`__ documentations. + +.. note:: + + There are many use cases in which variable interpolation is not the correct approach. When a parameter **must + always** be derived from other settings, it shouldn't be up to the CLI user to do this in a config file. For + example, if the data and model both require ``batch_size`` and must be the same value, then + :ref:`cli_link_arguments` should be used instead of interpolation. 
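Variable interpolation is not enabled by default. A minimal sketch of how it can be switched on, assuming the
``omegaconf`` parser mode supported by ``jsonargparse`` (which requires ``omegaconf`` to be installed); ``MyModel`` is
a placeholder model like the one used in the examples above:

.. code:: python

    from typing import List

    from pytorch_lightning import LightningModule
    from pytorch_lightning.cli import LightningCLI


    class MyModel(LightningModule):
        # placeholder model; only the typed __init__ parameters matter for the CLI
        def __init__(self, encoder_layers: int = 12, decoder_layers: List[int] = [2, 4]):
            super().__init__()


    # parser_mode="omegaconf" makes the underlying jsonargparse parser resolve
    # ${...} references inside yaml config files
    cli = LightningCLI(MyModel, parser_kwargs={"parser_mode": "omegaconf"})

With this, a config value such as ``decoder_layers: [${model.encoder_layers}, 4]`` is resolved from
``model.encoder_layers`` at parse time, as described in the interpolation section above.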
diff --git a/docs/source-pytorch/cli/lightning_cli_advanced_3.rst b/docs/source-pytorch/cli/lightning_cli_advanced_3.rst index 7f5ed143869e3..38fa0662a012e 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced_3.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced_3.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -45,12 +45,16 @@ mock_argv.stop() +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# + Instantiation only mode ^^^^^^^^^^^^^^^^^^^^^^^ -The CLI is designed to start fitting with minimal code changes. On class instantiation, the CLI will automatically -call the trainer function associated to the subcommand provided so you don't have to do it. -To avoid this, you can set the following argument: +The CLI is designed to start fitting with minimal code changes. On class instantiation, the CLI will automatically call +the trainer function associated with the subcommand provided, so you don't have to do it. To avoid this, you can set the +following argument: .. testcode:: @@ -58,20 +62,19 @@ To avoid this, you can set the following argument: # you'll have to call fit yourself: cli.trainer.fit(cli.model) -In this mode, there are subcommands added to the parser. -This can be useful to implement custom logic without having to subclass the CLI, but still using the CLI's instantiation -and argument parsing capabilities. +In this mode, subcommands are **not** added to the parser. This can be useful to implement custom logic without having +to subclass the CLI, but still, use the CLI's instantiation and argument parsing capabilities. Trainer Callbacks and arguments with class type ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -A very important argument of the :class:`~pytorch_lightning.trainer.trainer.Trainer` class is the :code:`callbacks`. In -contrast to other more simple arguments which just require numbers or strings, :code:`callbacks` expects a list of -instances of subclasses of :class:`~pytorch_lightning.callbacks.Callback`. To specify this kind of argument in a config -file, each callback must be given as a dictionary including a :code:`class_path` entry with an import path of the class, -and optionally an :code:`init_args` entry with arguments required to instantiate it. Therefore, a simple configuration -file example that defines a couple of callbacks is the following: +A very important argument of the :class:`~pytorch_lightning.trainer.trainer.Trainer` class is the ``callbacks``. In +contrast to simpler arguments that take numbers or strings, ``callbacks`` expects a list of instances of subclasses of +:class:`~pytorch_lightning.callbacks.Callback`. To specify this kind of argument in a config file, each callback must be +given as a dictionary, including a ``class_path`` entry with an import path of the class and optionally an ``init_args`` +entry with arguments to use to instantiate. Therefore, a simple configuration file that defines two callbacks is the +following: .. 
code-block:: yaml @@ -87,9 +90,9 @@ file example that defines a couple of callbacks is the following: Similar to the callbacks, any parameter in :class:`~pytorch_lightning.trainer.trainer.Trainer` and user extended :class:`~pytorch_lightning.core.module.LightningModule` and :class:`~pytorch_lightning.core.datamodule.LightningDataModule` classes that have as type hint a class, can be -configured the same way using :code:`class_path` and :code:`init_args`. If the package that defines a subclass is -imported before the :class:`~pytorch_lightning.cli.LightningCLI` class is run, the name can be used instead of -the full import path. +configured the same way using ``class_path`` and ``init_args``. If the package that defines a subclass is imported +before the :class:`~pytorch_lightning.cli.LightningCLI` class is run, the name can be used instead of the full import +path. From command line the syntax is the following: @@ -117,16 +120,16 @@ callback appended. Here is an example: .. note:: - Serialized config files (e.g. ``--print_config`` or :class:`~pytorch_lightning.cli.SaveConfigCallback`) - always have the full ``class_path``'s, even when class name shorthand notation is used in command line or in input - config files. + Serialized config files (e.g. ``--print_config`` or :class:`~pytorch_lightning.cli.SaveConfigCallback`) always have + the full ``class_path``, even when class name shorthand notation is used in the command line or in input config + files. Multiple models and/or datasets ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Additionally, the tool can be configured such that a model and/or a datamodule is -specified by an import path and init arguments. For example, with a tool implemented as: +A CLI can be written such that a model and/or a datamodule is specified by an import path and init arguments. For +example, with a tool implemented as: .. code-block:: python @@ -154,18 +157,17 @@ A possible config file could be as follows: patience: 5 ... -Only model classes that are a subclass of :code:`MyModelBaseClass` would be allowed, and similarly only subclasses of -:code:`MyDataModuleBaseClass`. If as base classes :class:`~pytorch_lightning.core.module.LightningModule` and -:class:`~pytorch_lightning.core.datamodule.LightningDataModule` are given, then the tool would allow any lightning -module and data module. +Only model classes that are a subclass of ``MyModelBaseClass`` would be allowed, and similarly, only subclasses of +``MyDataModuleBaseClass``. If as base classes :class:`~pytorch_lightning.core.module.LightningModule` and +:class:`~pytorch_lightning.core.datamodule.LightningDataModule` is given, then the CLI would allow any lightning module +and data module. .. tip:: - Note that with the subclass modes the :code:`--help` option does not show information for a specific subclass. To - get help for a subclass the options :code:`--model.help` and :code:`--data.help` can be used, followed by the - desired class path. Similarly :code:`--print_config` does not include the settings for a particular subclass. To - include them the class path should be given before the :code:`--print_config` option. Examples for both help and - print config are: + Note that with the subclass modes, the ``--help`` option does not show information for a specific subclass. To get + help for a subclass, the options ``--model.help`` and ``--data.help`` can be used, followed by the desired class + path. Similarly, ``--print_config`` does not include the settings for a particular subclass. 
To include them, the + class path should be given before the ``--print_config`` option. Examples for both help and print config are: .. code-block:: bash @@ -176,10 +178,13 @@ module and data module. Models with multiple submodules ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Many use cases require to have several modules each with its own configurable options. One possible way to handle this -with LightningCLI is to implement a single module having as init parameters each of the submodules. Since the init -parameters have as type a class, then in the configuration these would be specified with :code:`class_path` and -:code:`init_args` entries. For instance a model could be implemented as: +Many use cases require to have several modules, each with its own configurable options. One possible way to handle this +with ``LightningCLI`` is to implement a single module having as init parameters each of the submodules. This is known as +`dependency injection `__ which is a good approach to improve +decoupling in your code base. + +Since the init parameters of the model have as a type hint a class, in the configuration, these would be specified with +``class_path`` and ``init_args`` entries. For instance, a model could be implemented as: .. testcode:: @@ -195,7 +200,7 @@ parameters have as type a class, then in the configuration these would be specif self.encoder = encoder self.decoder = decoder -If the CLI is implemented as :code:`LightningCLI(MyMainModel)` the configuration would be as follows: +If the CLI is implemented as ``LightningCLI(MyMainModel)`` the configuration would be as follows: .. code-block:: yaml @@ -209,69 +214,15 @@ If the CLI is implemented as :code:`LightningCLI(MyMainModel)` the configuration init_args: ... -It is also possible to combine :code:`subclass_mode_model=True` and submodules, thereby having two levels of -:code:`class_path`. - - -Class type defaults -^^^^^^^^^^^^^^^^^^^ - -The support for classes as type hints allows to try many possibilities with the same CLI. This is a useful feature, but -it can make it tempting to use an instance of a class as a default. For example: - -.. code-block:: - - class MyMainModel(LightningModule): - def __init__( - self, - backbone: torch.nn.Module = MyModel(encoder_layers=24), # BAD PRACTICE! - ): - super().__init__() - self.backbone = backbone - -Normally classes are mutable as it is in this case. The instance of :code:`MyModel` would be created the moment that the -module that defines :code:`MyMainModel` is first imported. This means that the default of :code:`backbone` will be -initialized before the CLI class runs :code:`seed_everything` making it non-reproducible. Furthermore, if -:code:`MyMainModel` is used more than once in the same Python process and the :code:`backbone` parameter is not -overridden, the same instance would be used in multiple places which very likely is not what the developer intended. -Having an instance as default also makes it impossible to generate the complete config file since for arbitrary classes -it is not known which arguments were used to instantiate it. - -A good solution to these problems is to not have a default or set the default to a special value (e.g. a -string) which would be checked in the init and instantiated accordingly. If a class parameter has no default and the CLI -is subclassed then a default can be set as follows: - -.. 
testcode:: - - default_backbone = { - "class_path": "import.path.of.MyModel", - "init_args": { - "encoder_layers": 24, - }, - } - - - class MyLightningCLI(LightningCLI): - def add_arguments_to_parser(self, parser): - parser.set_defaults({"model.backbone": default_backbone}) +It is also possible to combine ``subclass_mode_model=True`` and submodules, thereby having two levels of ``class_path``. -A more compact version that avoids writing a dictionary would be: - -.. testcode:: - - from jsonargparse import lazy_instance - - - class MyLightningCLI(LightningCLI): - def add_arguments_to_parser(self, parser): - parser.set_defaults({"model.backbone": lazy_instance(MyModel, encoder_layers=24)}) Optimizers ^^^^^^^^^^ -If you will not be changing the class, you can manually add the arguments for specific optimizers and/or -learning rate schedulers by subclassing the CLI. This has the advantage of providing the proper help message for those -classes. The following code snippet shows how to implement it: +In some cases, fixing the optimizer and/or learning scheduler might be desired instead of allowing multiple. For this, +you can manually add the arguments for specific classes by subclassing the CLI. The following code snippet shows how to +implement it: .. testcode:: @@ -280,9 +231,8 @@ classes. The following code snippet shows how to implement it: parser.add_optimizer_args(torch.optim.Adam) parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR) -With this, in the config the :code:`optimizer` and :code:`lr_scheduler` groups would accept all of the options for the -given classes, in this example :code:`Adam` and :code:`ExponentialLR`. -Therefore, the config file would be structured like: +With this, in the config, the ``optimizer`` and ``lr_scheduler`` groups would accept all of the options for the given +classes, in this example, ``Adam`` and ``ExponentialLR``. Therefore, the config file would be structured like: .. code-block:: yaml @@ -295,14 +245,14 @@ Therefore, the config file would be structured like: trainer: ... -Where the arguments can be passed directly through command line without specifying the class. For example: +where the arguments can be passed directly through the command line without specifying the class. For example: .. code-block:: bash $ python trainer.py fit --optimizer.lr=0.01 --lr_scheduler.gamma=0.2 -The automatic implementation of :code:`configure_optimizers` can be disabled by linking the configuration group. An -example can be when one wants to add support for multiple optimizers: +The automatic implementation of ``configure_optimizers`` can be disabled by linking the configuration group. An example +can be when someone wants to add support for multiple optimizers: .. code-block:: python @@ -329,11 +279,10 @@ example can be when one wants to add support for multiple optimizers: cli = MyLightningCLI(MyModel) -The value given to :code:`optimizer*_init` will always be a dictionary including :code:`class_path` and -:code:`init_args` entries. The function :func:`~pytorch_lightning.cli.instantiate_class` -takes care of importing the class defined in :code:`class_path` and instantiating it using some positional arguments, -in this case :code:`self.parameters()`, and the :code:`init_args`. -Any number of optimizers and learning rate schedulers can be added when using :code:`link_to`. +The value given to ``optimizer*_init`` will always be a dictionary including ``class_path`` and ``init_args`` entries. 
+The function :func:`~pytorch_lightning.cli.instantiate_class` takes care of importing the class defined in +``class_path`` and instantiating it using some positional arguments, in this case ``self.parameters()``, and the +``init_args``. Any number of optimizers and learning rate schedulers can be added when using ``link_to``. With shorthand notation: @@ -378,7 +327,7 @@ the main function as follows: cli_main() Then it is possible to import the ``cli_main`` function to run it. Executing in a shell ``my_cli.py ---trainer.max_epochs=100", "--model.encoder_layers=24`` would be equivalent to: +--trainer.max_epochs=100 --model.encoder_layers=24`` would be equivalent to: .. code:: python diff --git a/docs/source-pytorch/cli/lightning_cli_expert.rst b/docs/source-pytorch/cli/lightning_cli_expert.rst index 60454f5e9bd82..94c28242d380c 100644 --- a/docs/source-pytorch/cli/lightning_cli_expert.rst +++ b/docs/source-pytorch/cli/lightning_cli_expert.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -51,9 +51,9 @@ mock_argv.stop() -####################################### -Eliminate config boilerplate (Advanced) -####################################### +############################################### +Configure hyperparameters from the CLI (Expert) +############################################### **Audience:** Users who already understand the LightningCLI and want to customize it. ---- @@ -62,26 +62,24 @@ Eliminate config boilerplate (Advanced) Customize the LightningCLI ************************** -The init parameters of the :class:`~pytorch_lightning.cli.LightningCLI` class can be used to customize some -things, namely: the description of the tool, enabling parsing of environment variables and additional arguments to -instantiate the trainer and configuration parser. - -Nevertheless the init arguments are not enough for many use cases. For this reason the class is designed so that can be -extended to customize different parts of the command line tool. The argument parser class used by -:class:`~pytorch_lightning.cli.LightningCLI` is -:class:`~pytorch_lightning.cli.LightningArgumentParser` which is an extension of python's argparse, thus -adding arguments can be done using the :func:`add_argument` method. In contrast to argparse it has additional methods to -add arguments, for example :func:`add_class_arguments` adds all arguments from the init of a class, though requiring -parameters to have type hints. For more details about this please refer to the `respective documentation +The init parameters of the :class:`~pytorch_lightning.cli.LightningCLI` class can be used to customize some things, +e.g., the description of the tool, enabling parsing of environment variables, and additional arguments to instantiate +the trainer and configuration parser. + +Nevertheless, the init arguments are not enough for many use cases. For this reason, the class is designed so that it +can be extended to customize different parts of the command line tool. 
The argument parser class used by +:class:`~pytorch_lightning.cli.LightningCLI` is :class:`~pytorch_lightning.cli.LightningArgumentParser`, which is an +extension of python's argparse, thus adding arguments can be done using the :func:`add_argument` method. In contrast to +argparse, it has additional methods to add arguments. For example :func:`add_class_arguments` add all arguments from the +init of a class. For more details, see the `respective documentation `_. The :class:`~pytorch_lightning.cli.LightningCLI` class has the -:meth:`~pytorch_lightning.cli.LightningCLI.add_arguments_to_parser` method which can be implemented to include -more arguments. After parsing, the configuration is stored in the :code:`config` attribute of the class instance. The -:class:`~pytorch_lightning.cli.LightningCLI` class also has two methods that can be used to run code before -and after the trainer runs: :code:`before_` and :code:`after_`. -A realistic example for these would be to send an email before and after the execution. -The code for the :code:`fit` subcommand would be something like: +:meth:`~pytorch_lightning.cli.LightningCLI.add_arguments_to_parser` method can be implemented to include more arguments. +After parsing, the configuration is stored in the ``config`` attribute of the class instance. The +:class:`~pytorch_lightning.cli.LightningCLI` class also has two methods that can be used to run code before and after +the trainer runs: ``before_`` and ``after_``. A realistic example of this would be to send an +email before and after the execution. The code for the ``fit`` subcommand would be something like this: .. testcode:: @@ -98,25 +96,25 @@ The code for the :code:`fit` subcommand would be something like: cli = MyLightningCLI(MyModel) -Note that the config object :code:`self.config` is a dictionary whose keys are global options or groups of options. It -has the same structure as the yaml format described previously. This means for instance that the parameters used for -instantiating the trainer class can be found in :code:`self.config['fit']['trainer']`. +Note that the config object ``self.config`` is a namespace whose keys are global options or groups of options. It has +the same structure as the YAML format described previously. This means that the parameters used for instantiating the +trainer class can be found in ``self.config['fit']['trainer']``. .. tip:: - Have a look at the :class:`~pytorch_lightning.cli.LightningCLI` class API reference to learn about other - methods that can be extended to customize a CLI. + Have a look at the :class:`~pytorch_lightning.cli.LightningCLI` class API reference to learn about other methods + that can be extended to customize a CLI. ---- ************************** Configure forced callbacks ************************** -As explained previously, any Lightning callback can be added by passing it through command line or -including it in the config via :code:`class_path` and :code:`init_args` entries. +As explained previously, any Lightning callback can be added by passing it through the command line or including it in +the config via ``class_path`` and ``init_args`` entries. -However, certain callbacks MUST be coupled with a model so they are always present and configurable. -This can be implemented as follows: +However, certain callbacks **must** be coupled with a model so they are always present and configurable. This can be +implemented as follows: .. 
testcode:: @@ -131,7 +129,7 @@ This can be implemented as follows: cli = MyLightningCLI(MyModel) -To change the configuration of the :code:`EarlyStopping` in the config it would be: +To change the parameters for ``EarlyStopping`` in the config it would be: .. code-block:: yaml @@ -144,11 +142,11 @@ To change the configuration of the :code:`EarlyStopping` in the config it would .. note:: - The example above overrides a default in :code:`add_arguments_to_parser`. This is included to show that defaults can - be changed if needed. However, note that overriding of defaults in the source code is not intended to be used to - store the best hyperparameters for a task after experimentation. To ease reproducibility the source code should be - stable. It is better practice to store the best hyperparameters for a task in a configuration file independent from - the source code. + The example above overrides a default in ``add_arguments_to_parser``. This is included to show that defaults can be + changed if needed. However, note that overriding defaults in the source code is not intended to be used to store the + best hyperparameters for a task after experimentation. To guarantee reproducibility, the source code should be + stable. It is better to practice storing the best hyperparameters for a task in a configuration file independent + from the source code. ---- @@ -157,7 +155,7 @@ Class type defaults ******************* The support for classes as type hints allows to try many possibilities with the same CLI. This is a useful feature, but -it can make it tempting to use an instance of a class as a default. For example: +it is tempting to use an instance of a class as a default. For example: .. testcode:: @@ -169,17 +167,17 @@ it can make it tempting to use an instance of a class as a default. For example: super().__init__() self.backbone = backbone -Normally classes are mutable as it is in this case. The instance of :code:`MyModel` would be created the moment that the -module that defines :code:`MyMainModel` is first imported. This means that the default of :code:`backbone` will be -initialized before the CLI class runs :code:`seed_everything` making it non-reproducible. Furthermore, if -:code:`MyMainModel` is used more than once in the same Python process and the :code:`backbone` parameter is not -overridden, the same instance would be used in multiple places which very likely is not what the developer intended. -Having an instance as default also makes it impossible to generate the complete config file since for arbitrary classes -it is not known which arguments were used to instantiate it. +Normally classes are mutable, as in this case. The instance of ``MyModel`` would be created the moment that the module +that defines ``MyMainModel`` is first imported. This means that the default of ``backbone`` will be initialized before +the CLI class runs ``seed_everything``, making it non-reproducible. Furthermore, if ``MyMainModel`` is used more than +once in the same Python process and the ``backbone`` parameter is not overridden, the same instance would be used in +multiple places. Most likely, this is not what the developer intended. Having an instance as default also makes it +impossible to generate the complete config file since it is not known which arguments were used to instantiate it for +arbitrary classes. -A good solution to these problems is to not have a default or set the default to a special value (e.g. a -string) which would be checked in the init and instantiated accordingly. 
If a class parameter has no default and the CLI -is subclassed then a default can be set as follows: +An excellent solution to these problems is not to have a default or set the default to a unique value (e.g., a string). +Then check this value and instantiate it in the ``__init__`` body. If a class parameter has no default and the CLI is +subclassed, then a default can be set as follows: .. testcode:: @@ -208,14 +206,16 @@ A more compact version that avoids writing a dictionary would be: ---- -************************ -Connect two config files -************************ -Another case in which it might be desired to extend :class:`~pytorch_lightning.cli.LightningCLI` is that the -model and data module depend on a common parameter. For example in some cases both classes require to know the -:code:`batch_size`. It is a burden and error prone giving the same value twice in a config file. To avoid this the -parser can be configured so that a value is only given once and then propagated accordingly. With a tool implemented -like shown below, the :code:`batch_size` only has to be provided in the :code:`data` section of the config. +.. _cli_link_arguments: + +**************** +Argument linking +**************** +Another case in which it might be desired to extend :class:`~pytorch_lightning.cli.LightningCLI` is that the model and +data module depends on a common parameter. For example, in some cases, both classes require to know the ``batch_size``. +It is a burden and error-prone to give the same value twice in a config file. To avoid this, the parser can be +configured so that a value is only given once and then propagated accordingly. With a tool implemented like the one +shown below, the ``batch_size`` only has to be provided in the ``data`` section of the config. .. testcode:: @@ -236,11 +236,11 @@ The linking of arguments is observed in the help of the tool, which for this exa Number of samples in a batch (type: int, default: 8) Linked arguments: - model.batch_size <-- data.batch_size + data.batch_size --> model.batch_size Number of samples in a batch (type: int) Sometimes a parameter value is only available after class instantiation. An example could be that your model requires -the number of classes to instantiate its fully connected layer (for a classification task) but the value is not +the number of classes to instantiate its fully connected layer (for a classification task). But the value is not available until the data module has been instantiated. The code below illustrates how to address this. .. testcode:: @@ -254,13 +254,14 @@ available until the data module has been instantiated. The code below illustrate Instantiation links are used to automatically determine the order of instantiation, in this case data first. +.. note:: + + The linking of arguments is intended for things that are meant to be non-configurable. This improves the CLI user + experience since it avoids the need to provide more parameters. A related concept is a variable interpolation that + keeps things configurable. + .. tip:: The linking of arguments can be used for more complex cases. For example to derive a value via a function that takes multiple settings as input. For more details have a look at the API of `link_arguments `_. - - -The linking of arguments is intended for things that are meant to be non-configurable. This improves the CLI user -experience since it avoids the need for providing more parameters. A related concept is -variable interpolation which in contrast keeps things being configurable. 
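The linking described above is done by subclassing the CLI and overriding ``add_arguments_to_parser``. A minimal
sketch, assuming a ``MyDataModule`` that takes ``batch_size`` as an ``__init__`` parameter and exposes a
``num_classes`` attribute that is only known after instantiation; ``MyModel`` and ``MyDataModule`` are placeholders
for your own classes:

.. code:: python

    from pytorch_lightning.cli import LightningCLI


    class MyLightningCLI(LightningCLI):
        def add_arguments_to_parser(self, parser):
            # batch_size is given once under `data` and propagated to the model
            parser.link_arguments("data.batch_size", "model.batch_size")
            # num_classes only exists after the datamodule is instantiated, so the
            # link is applied on instantiation and the datamodule is created first
            parser.link_arguments("data.num_classes", "model.num_classes", apply_on="instantiate")


    cli = MyLightningCLI(MyModel, MyDataModule)

With this, ``batch_size`` only has to be provided in the ``data`` section of the config, and ``model.num_classes``
is filled in automatically once the datamodule exists.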
diff --git a/docs/source-pytorch/cli/lightning_cli_faq.rst b/docs/source-pytorch/cli/lightning_cli_faq.rst index 672e27979f7c9..8830a58cc7664 100644 --- a/docs/source-pytorch/cli/lightning_cli_faq.rst +++ b/docs/source-pytorch/cli/lightning_cli_faq.rst @@ -1,96 +1,40 @@ :orphan: -.. testsetup:: * - :skipif: not _JSONARGPARSE_AVAILABLE +########################################### +Frequently asked questions for LightningCLI +########################################### - import torch - from unittest import mock - from typing import List - import pytorch_lightning as pl - from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback +************************ +What does CLI stand for? +************************ +CLI is short for command line interface. This means it is a tool intended to be run from a terminal, similar to commands +like ``git``. +---- - class NoFitTrainer(Trainer): - def fit(self, *_, **__): - pass - - - class LightningCLI(pl.cli.LightningCLI): - def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): - super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) - - - class MyModel(LightningModule): - def __init__( - self, - encoder_layers: int = 12, - decoder_layers: List[int] = [2, 4], - batch_size: int = 8, - ): - pass - - - mock_argv = mock.patch("sys.argv", ["any.py"]) - mock_argv.start() - -.. testcleanup:: * - - mock_argv.stop() - -##################################### -Eliminate config boilerplate (expert) -##################################### - -*************** -Troubleshooting -*************** -The standard behavior for CLIs, when they fail, is to terminate the process with a non-zero exit code and a short message -to hint the user about the cause. This is problematic while developing the CLI since there is no information to track -down the root of the problem. A simple change in the instantiation of the ``LightningCLI`` can be used such that when -there is a failure an exception is raised and the full stack trace printed. - -.. testcode:: - - cli = LightningCLI(MyModel, parser_kwargs={"error_handler": None}) +.. _what-is-a-yaml-config-file: -.. note:: +*************************** +What is a yaml config file? +*************************** +A YAML is a standard for configuration files used to describe parameters for sections of a program. It is a common tool +in engineering and has recently started to gain popularity in machine learning. An example of a YAML file is the +following: - When asking about problems and reporting issues please set the ``error_handler`` to ``None`` and include the stack - trace in your description. With this, it is more likely for people to help out identifying the cause without needing - to create a reproducible script. +.. code:: yaml ----- + # file.yaml + car: + max_speed:100 + max_passengers:2 + plane: + fuel_capacity: 50 + class_3: + option_1: 'x' + option_2: 'y' -************************************* -Reproducibility with the LightningCLI -************************************* -The topic of reproducibility is complex and it is impossible to guarantee reproducibility by just providing a class that -people can use in unexpected ways. Nevertheless, the :class:`~pytorch_lightning.cli.LightningCLI` tries to -give a framework and recommendations to make reproducibility simpler. - -When an experiment is run, it is good practice to use a stable version of the source code, either being a released -package or at least a commit of some version controlled repository. 
For each run of a CLI the config file is -automatically saved including all settings. This is useful to figure out what was done for a particular run without -requiring to look at the source code. If by mistake the exact version of the source code is lost or some defaults -changed, having the full config means that most of the information is preserved. - -The class is targeted at implementing CLIs because running a command from a shell provides a separation with the Python -source code. Ideally the CLI would be placed in your path as part of the installation of a stable package, instead of -running from a clone of a repository that could have uncommitted local modifications. Creating installable packages that -include CLIs is out of the scope of this document. This is mentioned only as a teaser for people who would strive for -the best practices possible. - - -For every CLI implemented, users are encouraged to learn how to run it by reading the documentation printed with the -:code:`--help` option and use the :code:`--print_config` option to guide the writing of config files. A few more details -that might not be clear by only reading the help are the following. - -:class:`~pytorch_lightning.cli.LightningCLI` is based on argparse and as such follows the same arguments style -as many POSIX command line tools. Long options are prefixed with two dashes and its corresponding values should be -provided with an empty space or an equal sign, as :code:`--option value` or :code:`--option=value`. Command line options -are parsed from left to right, therefore if a setting appears multiple times the value most to the right will override -the previous ones. If a class has an init parameter that is required (i.e. no default value), it is given as -:code:`--option` which makes it explicit and more readable instead of relying on positional arguments. +If you are unfamiliar with YAML, the short introduction at `realpython.com#yaml-syntax +`__ might be a good starting point. ---- @@ -130,7 +74,34 @@ use a subcommand as follows: ---- -**************** -What is the CLI? -**************** -CLI is short for commandline interface. Use your terminal to enter these commands. +******************************************************* +What is the relation between LightningCLI and argparse? +******************************************************* + +:class:`~pytorch_lightning.cli.LightningCLI` makes use of `jsonargparse `__ +which is an extension of `argparse `__. Due to this +:class:`~pytorch_lightning.cli.LightningCLI` follows the same arguments style as many POSIX command line tools. Long +options are prefixed with two dashes and its corresponding values are separated by space or an equal sign, as ``--option +value`` or ``--option=value``. Command line options are parsed from left to right, therefore if a setting appears +multiple times the value most to the right will override the previous ones. + +---- + +**************************** +How do I troubleshoot a CLI? +**************************** +The standard behavior for CLIs, when they fail, is to terminate the process with a non-zero exit code and a short +message to hint the user about the cause. This is problematic while developing the CLI since there is no information to +track down the root of the problem. To troubleshoot set the environment variable ``JSONARGPARSE_DEBUG`` to any value +before running the CLI: + +.. code:: bash + + export JSONARGPARSE_DEBUG=true + python main.py fit + +.. 
note:: + + When asking about problems and reporting issues, please set the ``JSONARGPARSE_DEBUG`` and include the stack trace + in your description. With this, users are more likely to help identify the cause without needing to create a + reproducible script. diff --git a/docs/source-pytorch/cli/lightning_cli_intermediate.rst b/docs/source-pytorch/cli/lightning_cli_intermediate.rst index db8b6cf4c77ec..d2586cbd84f9c 100644 --- a/docs/source-pytorch/cli/lightning_cli_intermediate.rst +++ b/docs/source-pytorch/cli/lightning_cli_intermediate.rst @@ -1,87 +1,43 @@ :orphan: -########################################### -Eliminate config boilerplate (Intermediate) -########################################### -**Audience:** Users who want advanced modularity via the commandline interface (CLI). +##################################################### +Configure hyperparameters from the CLI (Intermediate) +##################################################### +**Audience:** Users who want advanced modularity via a command line interface (CLI). -**Pre-reqs:** You must already understand how to use a commandline and :doc:`LightningDataModule <../data/datamodule>`. +**Pre-reqs:** You must already understand how to use the command line and :doc:`LightningDataModule <../data/datamodule>`. ---- -*************************** -What is config boilerplate? -*************************** -As Lightning projects grow in complexity it becomes desirable to enable full customizability from the commandline (CLI) so you can -change any hyperparameters without changing your code: +************************* +LightningCLI requirements +************************* -.. code:: bash - - # Mix and match anything - $ python main.py fit --model.learning_rate 0.02 - $ python main.py fit --model.learning_rate 0.01 --trainer.fast_dev_run True - -This is what the Lightning CLI enables. Without the Lightning CLI, you usually end up with a TON of boilerplate that looks like this: - -.. code:: python - - from argparse import ArgumentParser - - if __name__ == "__main__": - parser = ArgumentParser() - parser.add_argument("--learning_rate_1", default=0.02) - parser.add_argument("--learning_rate_2", default=0.03) - parser.add_argument("--model", default="cnn") - parser.add_argument("--command", default="fit") - parser.add_argument("--run_fast", default=True) - ... - # add 100 more of these - ... - - args = parser.parse_args() - - if args.model == "cnn": - model = ConvNet(learning_rate=args.learning_rate_1) - elif args.model == "transformer": - model = Transformer(learning_rate=args.learning_rate_2) - trainer = Trainer(fast_dev_run=args.run_fast) - ... - - if args.command == "fit": - trainer.fit() - elif args.command == "test": - ... - -This kind of boilerplate is unsustainable as projects grow in complexity. - ----- - -************************ -Enable the Lightning CLI -************************ -To enable the Lightning CLI install the extras: +The :class:`~pytorch_lightning.cli.LightningCLI` class is designed to significantly ease the implementation of CLIs. To +use this class, an additional Python requirement is necessary than the minimal installation of Lightning provides. To +enable, either install all extras: .. code:: bash - pip install pytorch-lightning[extra] + pip install "pytorch-lightning[extra]" -if the above fails, only install jsonargparse: +or if only interested in ``LightningCLI``, just install jsonargparse: .. 
code:: bash - pip install -U jsonargparse[signatures] + pip install "jsonargparse[signatures]" ---- -************************** -Connect a model to the CLI -************************** -The simplest way to control a model with the CLI is to wrap it in the LightningCLI object: +****************** +Implementing a CLI +****************** +Implementing a CLI is as simple as instantiating a :class:`~pytorch_lightning.cli.LightningCLI` object giving as +arguments classes for a ``LightningModule`` and optionally a ``LightningDataModule``: .. code:: python # main.py - import torch from pytorch_lightning.cli import LightningCLI # simple demo classes for your convenience @@ -103,7 +59,7 @@ Now your model can be managed via the CLI. To see the available commands type: $ python main.py --help -Which prints out: +which prints out: .. code:: bash @@ -130,7 +86,7 @@ Which prints out: tune Runs routines to tune hyperparameters before training. -the message tells us that we have a few available subcommands: +The message tells us that we have a few available subcommands: .. code:: bash @@ -151,16 +107,18 @@ which you can use depending on your use case: ************************** Train a model with the CLI ************************** -To run the full training routine (train, val, test), use the subcommand ``fit``: +To train a model, use the ``fit`` subcommand: .. code:: bash python main.py fit -View all available options with the ``--help`` command: +View all available options with the ``--help`` argument given after the subcommand: .. code:: bash + $ python main.py fit --help + usage: main.py [options] fit [-h] [-c CONFIG] [--seed_everything SEED_EVERYTHING] [--trainer CONFIG] ... @@ -183,10 +141,18 @@ With the Lightning CLI enabled, you can now change the parameters without touchi .. code:: bash # change the learning_rate - python main.py fit --model.out_dim 30 + python main.py fit --model.learning_rate 0.1 - # change the out dimensions also + # change the output dimensions also python main.py fit --model.out_dim 10 --model.learning_rate 0.1 # change trainer and data arguments too python main.py fit --model.out_dim 2 --model.learning_rate 0.1 --data.data_dir '~/' --trainer.logger False + +.. tip:: + + The options that become available in the CLI are the ``__init__`` parameters of the ``LightningModule`` and + ``LightningDataModule`` classes. Thus, to make hyperparameters configurable, just add them to your class's + ``__init__``. It is highly recommended that these parameters are described in the docstring so that the CLI shows + them in the help. Also, the parameters should have accurate type hints so that the CLI can fail early and give + understandable error messages when incorrect values are given. diff --git a/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst b/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst index 8e312b7233a6d..04a2795840f50 100644 --- a/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst +++ b/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst @@ -1,20 +1,20 @@ :orphan: -########################################### -Eliminate config boilerplate (intermediate) -########################################### +##################################################### +Configure hyperparameters from the CLI (Intermediate) +##################################################### **Audience:** Users who have multiple models and datasets per project. **Pre-reqs:** You must have read :doc:`(Control it all from the CLI) `. 
 ----

-****************************************
-Why do I want to mix models and datasets
-****************************************
-Lightning projects usually begin with one model and one dataset. As the project grows in complexity and you introduce more models and more datasets, it becomes desirable
-to mix any model with any dataset directly from the commandline without changing your code.
-
+***************************
+Why mix models and datasets
+***************************
+Lightning projects usually begin with one model and one dataset. As the project grows in complexity and you introduce
+more models and more datasets, it becomes desirable to mix any model with any dataset directly from the command line
+without changing your code.

 .. code:: bash

@@ -22,7 +22,8 @@ to mix any model with any dataset directly from the commandline without changing
     $ python main.py fit --model=GAN --data=MNIST
     $ python main.py fit --model=Transformer --data=MNIST

-This is what the Lightning CLI enables. Otherwise, this kind of configuration requires a significant amount of boilerplate that often looks like this:
+``LightningCLI`` makes this very simple. Otherwise, this kind of configuration requires a significant amount of
+boilerplate that often looks like this:

 .. code:: python

@@ -43,6 +44,8 @@ This is what the Lightning CLI enables. Otherwise, this kind of configuration re
     # mix them!
     trainer.fit(model, datamodule)

+It is highly recommended that you avoid writing this kind of boilerplate and use ``LightningCLI`` instead.
+
 ----

 *************************
@@ -53,9 +56,8 @@ To support multiple models, when instantiating ``LightningCLI`` omit the ``model
 .. code:: python

     # main.py
-
-    from pytorch_lightning import demos
-    from pytorch_lightning.utilities import cli as pl_cli
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


     class Model1(DemoModel):
@@ -70,7 +72,7 @@ To support multiple models, when instantiating ``LightningCLI`` omit the ``model
             return super().configure_optimizers()


-    cli = pl_cli.LightningCLI(datamodule_class=BoringDataModule)
+    cli = LightningCLI(datamodule_class=BoringDataModule)

 Now you can choose between any model from the CLI:

@@ -82,19 +84,24 @@ Now you can choose between any model from the CLI:
     # use Model2
     python main.py fit --model Model2

+.. tip::
+
+    Instead of omitting the ``model_class`` parameter, you can give a base class and ``subclass_mode_model=True``. This
+    will make the CLI only accept models which are a subclass of the given base class.
+
 ----

-********************
-Multiple DataModules
-********************
+*****************************
+Multiple LightningDataModules
+*****************************
 To support multiple data modules, when instantiating ``LightningCLI`` omit the ``datamodule_class`` parameter:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


     class FakeDataset1(BoringDataModule):
@@ -109,7 +116,7 @@ To support multiple data modules, when instantiating ``LightningCLI`` omit the `
             return torch.utils.data.DataLoader(self.random_train)


-    cli = pl_cli.LightningCLI(DemoModel)
+    cli = LightningCLI(DemoModel)

 Now you can choose between any dataset at runtime:

@@ -121,19 +128,36 @@ Now you can choose between any dataset at runtime:
     # use Model2
     python main.py fit --data FakeDataset2

+.. tip::
+
+    Instead of omitting the ``datamodule_class`` parameter, you can give a base class and ``subclass_mode_data=True``.
+    This will make the CLI only accept data modules that are a subclass of the given base class.
+
 ----

-*****************
-Custom optimizers
-*****************
-Any subclass of ``torch.optim.Optimizer`` can be used as an optimizer:
+*******************
+Multiple optimizers
+*******************
+Standard optimizers from ``torch.optim`` work out of the box:
+
+.. code:: bash
+
+    python main.py fit --optimizer AdamW
+
+If the optimizer you want needs other arguments, add them via the CLI (no need to change your code)!
+
+.. code:: bash
+
+    python main.py fit --optimizer SGD --optimizer.lr=0.01
+
+Furthermore, any custom subclass of :class:`torch.optim.Optimizer` can be used as an optimizer:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import DemoModel, BoringDataModule


     class LitAdam(torch.optim.Adam):
@@ -148,7 +172,7 @@ Any subclass of ``torch.optim.Optimizer`` can be used as an optimizer:
             super().step(closure)


-    cli = pl_cli.LightningCLI(DemoModel, BoringDataModule)
+    cli = LightningCLI(DemoModel, BoringDataModule)

 Now you can choose between any optimizer at runtime:

@@ -160,32 +184,31 @@ Now you can choose between any optimizer at runtime:
     # use FancyAdam
     python main.py fit --optimizer FancyAdam

-Bonus: If you need only 1 optimizer, the Lightning CLI already works out of the box with any Optimizer from
-``torch.optim``:
+----
+
+*******************
+Multiple schedulers
+*******************
+Standard learning rate schedulers from ``torch.optim.lr_scheduler`` work out of the box:

 .. code:: bash

-    python main.py fit --optimizer AdamW
+    python main.py fit --lr_scheduler CosineAnnealingLR

-If the optimizer you want needs other arguments, add them via the CLI (no need to change your code)!
+If the scheduler you want needs other arguments, add them via the CLI (no need to change your code)!

 .. code:: bash

-    python main.py fit --optimizer SGD --optimizer.lr=0.01
-
-----
+    python main.py fit --lr_scheduler=ReduceLROnPlateau --lr_scheduler.monitor=epoch

-********************
-Custom LR schedulers
-********************
-Any subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as learning rate scheduler:
+Furthermore, any custom subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as a learning rate scheduler:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import DemoModel, BoringDataModule


     class LitLRScheduler(torch.optim.lr_scheduler.CosineAnnealingLR):
@@ -194,7 +217,7 @@ Any subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as learnin
             super().step()


-    cli = pl_cli.LightningCLI(DemoModel, BoringDataModule)
+    cli = LightningCLI(DemoModel, BoringDataModule)

 Now you can choose between any learning rate scheduler at runtime:

@@ -204,38 +227,22 @@

     python main.py fit --lr_scheduler LitLRScheduler

-Bonus: If you need only 1 LRScheduler, the Lightning CLI already works out of the box with any LRScheduler from
-``torch.optim``:
-
-.. code:: bash
-
-    python main.py fit --lr_scheduler CosineAnnealingLR
-    python main.py fit --lr_scheduler LinearLR
-    ...
-
-If the scheduler you want needs other arguments, add them via the CLI (no need to change your code)!
-
-.. code:: bash
-
-    python main.py fit --lr_scheduler=ReduceLROnPlateau --lr_scheduler.monitor=epoch
-
 ----

 ************************
 Classes from any package
 ************************
-In the previous sections the classes to select were defined in the same python file where the ``LightningCLI`` class is
-run. To select classes from any package by using only the class name, import the respective package:
+In the previous sections, custom classes to select were defined in the same Python file where the ``LightningCLI`` class
+is run. To select classes from any package by using only the class name, import the respective package:

 .. code:: python

-    import torch
-    from pytorch_lightning.utilities import cli as pl_cli
+    from pytorch_lightning.cli import LightningCLI
     import my_code.models  # noqa: F401
     import my_code.data_modules  # noqa: F401
     import my_code.optimizers  # noqa: F401


-    cli = pl_cli.LightningCLI()
+    cli = LightningCLI()

 Now use any of the classes:

@@ -243,9 +250,25 @@ Now use any of the classes:

     python main.py fit --model Model1 --data FakeDataset1 --optimizer LitAdam --lr_scheduler LitLRScheduler

-The ``# noqa: F401`` comment avoids a linter warning that the import is unused. It is also possible to select subclasses
-that have not been imported by giving the full import path:
+The ``# noqa: F401`` comment avoids a linter warning that the import is unused.
+
+It is also possible to select subclasses that have not been imported by giving the full import path:

 .. code:: bash

     python main.py fit --model my_code.models.Model1
+
+----
+
+*************************
+Help for specific classes
+*************************
+When multiple models or datasets are accepted, the main help of the CLI does not include their specific parameters. To
+show this specific help, there are additional help arguments that expect the class name or its import path. For example:
+
+.. code:: bash
+
+    python main.py fit --model.help Model1
+    python main.py fit --data.help FakeDataset2
+    python main.py fit --optimizer.help Adagrad
+    python main.py fit --lr_scheduler.help StepLR
diff --git a/docs/source-pytorch/common/hyperparameters.rst b/docs/source-pytorch/common/hyperparameters.rst
index 5813109fe2fab..79af37c74345e 100644
--- a/docs/source-pytorch/common/hyperparameters.rst
+++ b/docs/source-pytorch/common/hyperparameters.rst
@@ -1,11 +1,19 @@
+:orphan:
+
 .. testsetup:: *

     from argparse import ArgumentParser, Namespace

     sys.argv = ["foo"]

-Configure hyperparameters from the CLI
---------------------------------------
+Configure hyperparameters from the CLI (legacy)
+-----------------------------------------------
+
+.. warning::
+
+    This is the documentation for the use of Python's ``argparse`` to implement a CLI. This approach is no longer
+    recommended, and people are encouraged to use the new `LightningCLI `_ class instead.
+
 Lightning has utilities to interact seamlessly with the command line ``ArgumentParser``
 and plays well with the hyperparameter optimization framework of your choice.

@@ -105,84 +113,6 @@ Finally, make sure to start the training like so:

 ----------

-LightningModule hyperparameters
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Often times we train many versions of a model. You might share that model or come back to it a few months later
-at which point it is very useful to know how that model was trained (i.e.: what learning rate, neural network, etc...).
-
-Lightning has a standardized way of saving the information for you in checkpoints and YAML files. The goal here is to
-improve readability and reproducibility.
-
-save_hyperparameters
-""""""""""""""""""""
-
-Use :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` within your
-:class:`~pytorch_lightning.core.module.LightningModule`'s ``__init__`` method.
-It will enable Lightning to store all the provided arguments under the ``self.hparams`` attribute.
-These hyperparameters will also be stored within the model checkpoint, which simplifies model re-instantiation after training.
-
-.. code-block:: python
-
-    class LitMNIST(LightningModule):
-        def __init__(self, layer_1_dim=128, learning_rate=1e-2):
-            super().__init__()
-            # call this to save (layer_1_dim=128, learning_rate=1e-4) to the checkpoint
-            self.save_hyperparameters()
-
-            # equivalent
-            self.save_hyperparameters("layer_1_dim", "learning_rate")
-
-            # Now possible to access layer_1_dim from hparams
-            self.hparams.layer_1_dim
-
-
-In addition, loggers that support it will automatically log the contents of ``self.hparams``.
-
-Excluding hyperparameters
-"""""""""""""""""""""""""
-
-By default, every parameter of the ``__init__`` method will be considered a hyperparameter to the LightningModule.
-However, sometimes some parameters need to be excluded from saving, for example when they are not serializable.
-Those parameters should be provided back when reloading the LightningModule.
-In this case, exclude them explicitly:
-
-.. code-block:: python
-
-    class LitMNIST(LightningModule):
-        def __init__(self, loss_fx, generator_network, layer_1_dim=128):
-            super().__init__()
-            self.layer_1_dim = layer_1_dim
-            self.loss_fx = loss_fx
-
-            # call this to save only (layer_1_dim=128) to the checkpoint
-            self.save_hyperparameters("layer_1_dim")
-
-            # equivalent
-            self.save_hyperparameters(ignore=["loss_fx", "generator_network"])
-
-
-load_from_checkpoint
-""""""""""""""""""""
-
-LightningModules that have hyperparameters automatically saved with :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters`
-can conveniently be loaded and instantiated directly from a checkpoint with :meth:`~pytorch_lightning.core.module.LightningModule.load_from_checkpoint`:
-
-.. code-block:: python
-
-    # to load specify the other args
-    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
-
-
-If parameters were excluded, they need to be provided at the time of loading:
-
-.. code-block:: python
-
-    # the excluded parameters were `loss_fx` and `generator_network`
-    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
-
-
-----------
-
 Trainer args
 ^^^^^^^^^^^^
 To recap, add ALL possible trainer flags to the argparser and init the ``Trainer`` this way
diff --git a/docs/source-pytorch/common/lightning_module.rst b/docs/source-pytorch/common/lightning_module.rst
index cb6432448675e..cd64307ad953d 100644
--- a/docs/source-pytorch/common/lightning_module.rst
+++ b/docs/source-pytorch/common/lightning_module.rst
@@ -729,6 +729,87 @@ Check out :ref:`Inference in Production ` guide to learn a

 -----------

+********************
+Save Hyperparameters
+********************
+
+Often times we train many versions of a model. You might share that model or come back to it a few months later at which
+point it is very useful to know how that model was trained (i.e.: what learning rate, neural network, etc...).
+
+Lightning has a standardized way of saving the information for you in checkpoints and YAML files. The goal here is to
+improve readability and reproducibility.
+
+save_hyperparameters
+====================
+
+Use :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` within your
+:class:`~pytorch_lightning.core.module.LightningModule`'s ``__init__`` method. It will enable Lightning to store all the
+provided arguments under the ``self.hparams`` attribute. These hyperparameters will also be stored within the model
+checkpoint, which simplifies model re-instantiation after training.
+
+.. code-block:: python
+
+    class LitMNIST(LightningModule):
+        def __init__(self, layer_1_dim=128, learning_rate=1e-2):
+            super().__init__()
+            # call this to save (layer_1_dim=128, learning_rate=1e-2) to the checkpoint
+            self.save_hyperparameters()
+
+            # equivalent
+            self.save_hyperparameters("layer_1_dim", "learning_rate")
+
+            # Now possible to access layer_1_dim from hparams
+            self.hparams.layer_1_dim
+
+
+In addition, loggers that support it will automatically log the contents of ``self.hparams``.
+
+Excluding hyperparameters
+=========================
+
+By default, every parameter of the ``__init__`` method will be considered a hyperparameter to the LightningModule.
+However, sometimes some parameters need to be excluded from saving, for example when they are not serializable. Those
+parameters should be provided back when reloading the LightningModule. In this case, exclude them explicitly:
+
+.. code-block:: python
+
+    class LitMNIST(LightningModule):
+        def __init__(self, loss_fx, generator_network, layer_1_dim=128):
+            super().__init__()
+            self.layer_1_dim = layer_1_dim
+            self.loss_fx = loss_fx
+
+            # call this to save only (layer_1_dim=128) to the checkpoint
+            self.save_hyperparameters("layer_1_dim")
+
+            # equivalent
+            self.save_hyperparameters(ignore=["loss_fx", "generator_network"])
+
+
+load_from_checkpoint
+====================
+
+LightningModules that have hyperparameters automatically saved with
+:meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` can conveniently be loaded and instantiated
+directly from a checkpoint with :meth:`~pytorch_lightning.core.module.LightningModule.load_from_checkpoint`:
+
+.. code-block:: python
+
+    # to load specify the other args
+    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
+
+
+If parameters were excluded, they need to be provided at the time of loading:
+
+.. code-block:: python
+
+    # the excluded parameters were `loss_fx` and `generator_network`
+    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
+
+
+-----------
+
+
 *************
 Child Modules
 *************
diff --git a/docs/source-pytorch/index.rst b/docs/source-pytorch/index.rst
index 1c867e1e345e9..77149ffbd6004 100644
--- a/docs/source-pytorch/index.rst
+++ b/docs/source-pytorch/index.rst
@@ -187,11 +187,10 @@ Current Lightning Users
     Avoid overfitting
     model/build_model.rst
-    common/hyperparameters
+    cli/lightning_cli
     common/progress_bar
     deploy/production
     advanced/training_tricks
-    cli/lightning_cli
     tuning/profiler
     Manage experiments
     Organize existing PyTorch into Lightning
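To tie the documented pieces together, the following is a minimal, hypothetical sketch (it is not part of the diff
above; the ``LitClassifier`` name and its parameters are made up for illustration) of a ``LightningModule`` whose typed
and documented ``__init__`` arguments are both stored via ``save_hyperparameters()`` and exposed as ``--model.*``
options by ``LightningCLI``. A ``LightningDataModule`` or dataloaders would still be needed to actually run ``fit``:

.. code-block:: python

    # sketch.py (illustrative only)
    import torch
    from pytorch_lightning import LightningModule
    from pytorch_lightning.cli import LightningCLI


    class LitClassifier(LightningModule):
        def __init__(self, hidden_dim: int = 64, learning_rate: float = 1e-3):
            """A toy model.

            Args:
                hidden_dim: Width of the hidden layer.
                learning_rate: Learning rate passed to the optimizer.
            """
            super().__init__()
            # stores hidden_dim and learning_rate under self.hparams and in checkpoints
            self.save_hyperparameters()
            self.layer = torch.nn.Linear(32, self.hparams.hidden_dim)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=self.hparams.learning_rate)


    if __name__ == "__main__":
        # both __init__ parameters become CLI options, e.g. --model.hidden_dim 128
        cli = LightningCLI(LitClassifier)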