Fix LightningCLI docs after overhaul of the documentation #14976

Merged
merged 12 commits into from Nov 9, 2022
52 changes: 43 additions & 9 deletions docs/source-pytorch/cli/lightning_cli.rst
@@ -2,9 +2,25 @@

.. _lightning-cli:

######################################
Configure hyperparameters from the CLI
######################################

*************
Why use a CLI
*************

When running deep learning experiments, there are a couple of good practices worth following:

- Separate configuration from source code
- Guarantee reproducibility of experiments

Implementing a command line interface (CLI) makes it possible to execute an experiment from a shell terminal. With a
CLI, there is a clear separation between the Python source code and the hyperparameters used for a particular
experiment. If the CLI corresponds to a stable version of the code, an experiment can be reproduced by installing the
same version of the code and its dependencies and running it with the same configuration (CLI arguments).

----

*********
Basic use
*********

@@ -26,7 +42,7 @@ Basic use
:tag: intermediate

.. displayitem::
:header: 2: Mix models, datasets and optimizers
:description: Support multiple models, datasets, optimizers and learning rate schedulers
:col_css: col-md-4
:button_link: lightning_cli_intermediate_2.html
@@ -60,34 +76,52 @@ Advanced use
.. displayitem::
:header: YAML for production
:description: Use the Lightning CLI with YAMLs for production environments
:col_css: col-md-4
:button_link: lightning_cli_advanced_2.html
:height: 150
:tag: advanced

.. displayitem::
:header: Customize for complex projects
:description: Learn how to implement CLIs for complex projects
:col_css: col-md-4
:button_link: lightning_cli_advanced_3.html
:height: 150
:tag: advanced

.. displayitem::
:header: Extend the Lightning CLI
:description: Customize the Lightning CLI
:col_css: col-md-4
:button_link: lightning_cli_expert.html
:height: 150
:tag: expert

----

*************
Miscellaneous
*************

.. raw:: html

<div class="display-card-container">
<div class="row">

.. displayitem::
:header: FAQ
:description: Frequently asked questions about working with the Lightning CLI and YAML files
:col_css: col-md-6
:button_link: lightning_cli_faq.html
:height: 150

.. displayitem::
:header: Legacy CLIs
:description: Documentation for the legacy argparse-based CLIs
:col_css: col-md-6
:button_link: ../common/hyperparameters.html
:height: 150

.. raw:: html

</div>
173 changes: 114 additions & 59 deletions docs/source-pytorch/cli/lightning_cli_advanced.rst
@@ -1,113 +1,168 @@
:orphan:

#################################################
Configure hyperparameters from the CLI (Advanced)
#################################################
**Audience:** Users looking to modularize their code for a professional project.

**Pre-reqs:** You must have read :doc:`(Mix models and datasets) <lightning_cli_intermediate_2>`.

As a project becomes more complex, the number of configurable options becomes very large, making it inconvenient to
control through individual command line arguments. To address this, CLIs implemented using
:class:`~pytorch_lightning.cli.LightningCLI` always support receiving input from configuration files. The default format
used for config files is yaml.

.. tip::

If you are unfamiliar with yaml, it is recommended that you first read :ref:`what-is-a-yaml-config-file`.


----

***********************
Run using a config file
***********************

To run the CLI using a yaml config, do:

.. code:: bash

    python main.py fit --config config.yaml

Individual arguments can be given to override options in the config file:

.. code:: bash

    python main.py fit --config config.yaml --trainer.max_epochs 100
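This "config file first, explicit CLI arguments win" precedence can be sketched with the standard library. This is a hypothetical stand-in, not LightningCLI's actual implementation: plain ``argparse`` and ``json`` replace the jsonargparse/yaml machinery, and the two options are invented for illustration.

```python
import argparse
import json

# Sketch of the precedence rule: values from the config file act as defaults,
# and any argument given explicitly on the command line overrides them.
def parse_with_config(argv, config_text):
    config = json.loads(config_text)  # stands in for loading the yaml config
    parser = argparse.ArgumentParser()
    parser.add_argument("--trainer.max_epochs", dest="max_epochs", type=int)
    parser.add_argument("--model.learning_rate", dest="learning_rate", type=float)
    parser.set_defaults(**config)     # config file provides the defaults...
    return parser.parse_args(argv)    # ...and explicit CLI arguments win

args = parse_with_config(
    ["--trainer.max_epochs", "100"],
    '{"max_epochs": 10, "learning_rate": 0.02}',
)
print(args.max_epochs, args.learning_rate)  # 100 0.02
```

Here ``max_epochs`` comes from the command line (100 beats the file's 10), while ``learning_rate`` keeps the value from the config file.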

----

************************
Automatic save of config
************************

To ease experiment reporting and reproducibility, by default ``LightningCLI`` automatically saves the full yaml
configuration in the log directory. After multiple fit runs with different hyperparameters, each run will have a
``config.yaml`` file in its respective log directory. These files make it trivial to reproduce an experiment, e.g.:

.. code:: bash

    python main.py fit --config lightning_logs/version_7/config.yaml

The automatic saving of the config is done by the special callback :class:`~pytorch_lightning.cli.SaveConfigCallback`.
This callback is automatically added to the ``Trainer``. To disable saving the config, instantiate ``LightningCLI``
with ``save_config_callback=None``.
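What the callback achieves can be sketched in a few stdlib lines. This is an illustration of the idea (persist the full run configuration into the run's log directory so it can be replayed later), not the ``SaveConfigCallback`` source; ``json`` stands in for yaml and the helper names are invented.

```python
import json
import pathlib
import tempfile

# Persist the resolved run configuration next to the run's logs.
def save_config(log_dir, config):
    path = pathlib.Path(log_dir) / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path

# Reproducing a run amounts to reading the saved config back.
def load_config(path):
    return json.loads(pathlib.Path(path).read_text())

with tempfile.TemporaryDirectory() as root:
    version_dir = pathlib.Path(root) / "lightning_logs" / "version_7"
    version_dir.mkdir(parents=True)
    saved = save_config(version_dir, {"trainer": {"max_epochs": 10}})
    assert load_config(saved) == {"trainer": {"max_epochs": 10}}
```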

----

*********************************
Prepare a config file for the CLI
*********************************
The ``--help`` option of the CLIs can be used to learn which configuration options are available and how to use them.
However, writing a config from scratch can be time-consuming and error-prone. To alleviate this, the CLIs have the
``--print_config`` argument, which prints the configuration to stdout without running the command.

For a CLI implemented as ``LightningCLI(DemoModel, BoringDataModule)``, executing:

.. code:: bash

    python main.py fit --print_config

generates a config with all default values like the following:

.. code:: yaml

    seed_everything: null
    trainer:
      logger: true
      ...
    model:
      out_dim: 10
      learning_rate: 0.02
    data:
      data_dir: ./
    ckpt_path: null

----

Other command line arguments can be given and will be considered in the printed configuration. A use case for this is
CLIs that accept multiple models. By default no model is selected, which means the printed config will not include
model settings. To get a config with the default values of a particular model, run:

.. code:: bash

    python main.py fit --model DemoModel --print_config

which generates a config like:

.. code:: yaml

    seed_everything: null
    trainer:
      ...
    model:
      class_path: pytorch_lightning.demos.boring_classes.DemoModel
      init_args:
        out_dim: 10
        learning_rate: 0.02
    ckpt_path: null

.. tip::

    A standard procedure to run experiments can be:

    .. code:: bash

        # Print a configuration to have as reference
        python main.py fit --print_config > config.yaml
        # Modify the config to your liking - you can remove all default arguments
        nano config.yaml
        # Fit your model using the edited configuration
        python main.py fit --config config.yaml

----

********************
Compose config files
********************

Multiple config files can be provided and they will be parsed sequentially. Let's say we have two configs with common
settings:

.. code:: yaml

    # config_1.yaml
    trainer:
      num_epochs: 10
      ...

    # config_2.yaml
    trainer:
      num_epochs: 20
      ...

The value from the last config will be used, ``num_epochs = 20`` in this case:

.. code-block:: bash

    $ python main.py fit --config config_1.yaml --config config_2.yaml
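The "parsed sequentially, last value wins" rule amounts to a recursive dictionary merge. A minimal sketch, using plain dicts in place of parsed yaml (the ``deep_merge`` helper is invented for illustration, not LightningCLI's code):

```python
# Merge a sequence of configs: later files override earlier ones, leaf by leaf.
def deep_merge(base, override):
    """Recursively merge override into base; override's leaves win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

config_1 = {"trainer": {"num_epochs": 10, "logger": True}}
config_2 = {"trainer": {"num_epochs": 20}}

final = deep_merge(config_1, config_2)
print(final)  # {'trainer': {'num_epochs': 20, 'logger': True}}
```

Note that only the overlapping leaf (``num_epochs``) is replaced; settings present only in the first file survive the merge.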

----

*********************
Use groups of options
*********************
Groups of options can also be given as independent config files. For configs like:

.. code:: yaml

    # trainer.yaml
    trainer:
      num_epochs: 10

    # model.yaml
    out_dim: 7

    # data.yaml
    data_dir: ./data

a fit command can be run as:

.. code-block:: bash

    $ python main.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...]
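The effect of a group file is that its options are nested under the group's key in the final config. A hypothetical sketch of that nesting (the ``apply_group_file`` helper and the dict values are invented for illustration):

```python
# A file passed to a group option (--trainer, --model, --data) holds only that
# group's settings; they are merged into the config under the group key.
def apply_group_file(config, group, group_options):
    new_config = dict(config)
    new_config[group] = {**config.get(group, {}), **group_options}
    return new_config

config = {"trainer": {"num_epochs": 10}}   # as if loaded from trainer.yaml
config = apply_group_file(config, "model", {"out_dim": 7})       # model.yaml
config = apply_group_file(config, "data", {"data_dir": "./data"})  # data.yaml
print(config)
# {'trainer': {'num_epochs': 10}, 'model': {'out_dim': 7}, 'data': {'data_dir': './data'}}
```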