diff --git a/docs/source-pytorch/cli/lightning_cli.rst b/docs/source-pytorch/cli/lightning_cli.rst index a0933b447ad31..5b0bf73754db3 100644 --- a/docs/source-pytorch/cli/lightning_cli.rst +++ b/docs/source-pytorch/cli/lightning_cli.rst @@ -2,9 +2,25 @@ .. _lightning-cli: -############################ -Eliminate config boilerplate -############################ +###################################### +Configure hyperparameters from the CLI +###################################### + +************* +Why use a CLI +************* + +When running deep learning experiments, there are a couple of good practices that are recommended to follow: + +- Separate configuration from source code +- Guarantee reproducibility of experiments + +Implementing a command line interface (CLI) makes it possible to execute an experiment from a shell terminal. By having +a CLI, there is a clear separation between the Python source code and what hyperparameters are used for a particular +experiment. If the CLI corresponds to a stable version of the code, reproducing an experiment can be achieved by +installing the same version of the code plus dependencies and running with the same configuration (CLI arguments). + +---- ********* Basic use @@ -26,7 +42,7 @@ Basic use :tag: intermediate .. displayitem:: - :header: 2: Mix models and datasets + :header: 2: Mix models, datasets and optimizers :description: Support multiple models, datasets, optimizers and learning rate schedulers :col_css: col-md-4 :button_link: lightning_cli_intermediate_2.html @@ -60,27 +76,38 @@ Advanced use .. displayitem:: :header: YAML for production :description: Use the Lightning CLI with YAMLs for production environments - :col_css: col-md-6 + :col_css: col-md-4 :button_link: lightning_cli_advanced_2.html :height: 150 :tag: advanced .. displayitem:: :header: Customize for complex projects - :description: Learn how to implement CLIs for complex projects. - :col_css: col-md-6 + :description: Learn how to implement CLIs for complex projects + :col_css: col-md-4 :button_link: lightning_cli_advanced_3.html :height: 150 - :tag: expert + :tag: advanced .. displayitem:: :header: Extend the Lightning CLI :description: Customize the Lightning CLI - :col_css: col-md-6 + :col_css: col-md-4 :button_link: lightning_cli_expert.html :height: 150 :tag: expert +---- + +************* +Miscellaneous +************* + +.. raw:: html + +
+
+ .. displayitem:: :header: FAQ :description: Frequently asked questions about working with the Lightning CLI and YAML files @@ -88,6 +115,13 @@ Advanced use :button_link: lightning_cli_faq.html :height: 150 +.. displayitem:: + :header: Legacy CLIs + :description: Documentation for the legacy argparse-based CLIs + :col_css: col-md-6 + :button_link: ../common/hyperparameters.html + :height: 150 + .. raw:: html
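For orientation, a minimal sketch of the pattern that the pages above document: a ``main.py`` that instantiates
:class:`~pytorch_lightning.cli.LightningCLI`. ``DemoModel`` and ``BoringDataModule`` are the demo classes used
throughout these pages; in a real project they would be your own ``LightningModule`` and ``LightningDataModule``.

.. code:: python

    # main.py
    from pytorch_lightning.cli import LightningCLI
    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


    def cli_main():
        # Exposes the Trainer, model and datamodule __init__ parameters as CLI
        # options and adds subcommands such as fit, validate, test and predict
        LightningCLI(DemoModel, BoringDataModule)


    if __name__ == "__main__":
        cli_main()

With such a script, an experiment is configured and run entirely from the shell, for example
``python main.py fit --model.learning_rate 0.1 --trainer.max_epochs 5``.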
diff --git a/docs/source-pytorch/cli/lightning_cli_advanced.rst b/docs/source-pytorch/cli/lightning_cli_advanced.rst index 2d4f3307e7f18..68458c351e798 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced.rst @@ -1,113 +1,168 @@ :orphan: -####################################### -Eliminate config boilerplate (Advanced) -####################################### +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# **Audience:** Users looking to modularize their code for a professional project. -**Pre-reqs:** You must have read :doc:`(Control it all from the CLI) `. +**Pre-reqs:** You must have read :doc:`(Mix models and datasets) `. + +As a project becomes more complex, the number of configurable options becomes very large, making it inconvenient to +control through individual command line arguments. To address this, CLIs implemented using +:class:`~pytorch_lightning.cli.LightningCLI` always support receiving input from configuration files. The default format +used for config files is YAML. + +.. tip:: + + If you are unfamiliar with YAML, it is recommended that you first read :ref:`what-is-a-yaml-config-file`. + ---- -*************************** -What is a yaml config file? -*************************** -A yaml is a standard configuration file that describes parameters for sections of a program. It is a common tool in engineering, and it has recently started to gain popularity in machine learning. +*********************** +Run using a config file +*********************** +To run the CLI using a yaml config, do: -.. code:: yaml +.. code:: bash - # file.yaml - car: - max_speed:100 - max_passengers:2 - plane: - fuel_capacity: 50 - class_3: - option_1: 'x' - option_2: 'y' + python main.py fit --config config.yaml + +Individual arguments can be given to override options in the config file: + +.. code:: bash + + python main.py fit --config config.yaml --trainer.max_epochs 100 ---- +************************ +Automatic save of config +************************ -********************* -Print the config used -********************* -Before or after you run a training routine, you can print the full training spec in yaml format using ``--print_config``: +To ease experiment reporting and reproducibility, by default ``LightningCLI`` automatically saves the full YAML +configuration in the log directory. After multiple fit runs with different hyperparameters, each one will have in its +respective log directory a ``config.yaml`` file. These files can be used to trivially reproduce an experiment, e.g.: + +.. code:: bash + + python main.py fit --config lightning_logs/version_7/config.yaml + +The automatic saving of the config is done by the special callback :class:`~pytorch_lightning.cli.SaveConfigCallback`. +This callback is automatically added to the ``Trainer``. To disable the save of the config, instantiate ``LightningCLI`` +with ``save_config_callback=None``. + +---- + +********************************* +Prepare a config file for the CLI +********************************* +The ``--help`` option of the CLIs can be used to learn which configuration options are available and how to use them. +However, writing a config from scratch can be time-consuming and error-prone. To alleviate this, the CLIs have the +``--print_config`` argument, which prints to stdout the configuration without running the command. 
+ +For a CLI implemented as ``LightningCLI(DemoModel, BoringDataModule)``, executing: .. code:: bash python main.py fit --print_config -which generates the following config: +generates a config with all default values like the following: .. code:: bash seed_everything: null trainer: - logger: true - ... - terminate_on_nan: null + logger: true + ... model: - out_dim: 10 - learning_rate: 0.02 + out_dim: 10 + learning_rate: 0.02 data: - data_dir: ./ + data_dir: ./ ckpt_path: null ----- - -******************************** -Write a config yaml from the CLI -******************************** -To have a copy of the configuration that produced this model, save a *yaml* file from the *--print_config* outputs: +Other command line arguments can be given and considered in the printed configuration. A use case for this is CLIs that +accept multiple models. By default, no model is selected, meaning the printed config will not include model settings. To +get a config with the default values of a particular model would be: .. code:: bash - python main.py fit --model.learning_rate 0.001 --print_config > config.yaml + python main.py fit --model DemoModel --print_config ----- - -********************** -Run from a single yaml -********************** -To run from a yaml, pass a yaml produced with ``--print_config`` to the ``--config`` argument: +which generates a config like: .. code:: bash - python main.py fit --config config.yaml + seed_everything: null + trainer: + ... + model: + class_path: pytorch_lightning.demos.boring_classes.DemoModel + init_args: + out_dim: 10 + learning_rate: 0.02 + ckpt_path: null -when using a yaml to run, you can still pass in inline arguments +.. tip:: -.. code:: bash + A standard procedure to run experiments can be: - python main.py fit --config config.yaml --trainer.max_epochs 100 + .. code:: bash + + # Print a configuration to have as reference + python main.py fit --print_config > config.yaml + # Modify the config to your liking - you can remove all default arguments + nano config.yaml + # Fit your model using the edited configuration + python main.py fit --config config.yaml ---- -****************** -Compose yaml files -****************** -For production or complex research projects it's advisable to have each object in its own config file. To compose all the configs, pass them all inline: +******************** +Compose config files +******************** +Multiple config files can be provided, and they will be parsed sequentially. Let's say we have two configs with common +settings: + +.. code:: yaml + + # config_1.yaml + trainer: + num_epochs: 10 + ... + + # config_2.yaml + trainer: + num_epochs: 20 + ... + +The value from the last config will be used, ``num_epochs = 20`` in this case: .. code-block:: bash - $ python trainer.py fit --config trainer.yaml --config datamodules.yaml --config models.yaml ... + $ python main.py fit --config config_1.yaml --config config_2.yaml -The configs will be parsed sequentially. Let's say we have two configs with the same args: +---- + +********************* +Use groups of options +********************* +Groups of options can also be given as independent config files. For configs like: .. code:: yaml # trainer.yaml - trainer: - num_epochs: 10 + num_epochs: 10 + # model.yaml + out_dim: 7 - # trainer_2.yaml - trainer: - num_epochs: 20 + # data.yaml + data_dir: ./data -the ones from the last config will be used (num_epochs = 20) in this case: +a fit command can be run as: .. 
code-block:: bash - $ python trainer.py fit --config trainer.yaml --config trainer_2.yaml + $ python main.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...] diff --git a/docs/source-pytorch/cli/lightning_cli_advanced_2.rst b/docs/source-pytorch/cli/lightning_cli_advanced_2.rst index e54f82edaa1b5..e5b3cd6a08929 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced_2.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced_2.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -42,13 +42,13 @@ mock_argv.stop() -####################################### -Eliminate config boilerplate (Advanced) -####################################### +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# -****************************** -Customize arguments by command -****************************** +********************************* +Customize arguments by subcommand +********************************* To customize arguments by subcommand, pass the config *before* the subcommand: .. code-block:: bash @@ -56,11 +56,12 @@ To customize arguments by subcommand, pass the config *before* the subcommand: $ python main.py [before] [subcommand] [after] $ python main.py ... fit ... -For example, here we set the Trainer argument [max_steps = 100] for the full training routine and [max_steps = 10] for testing: +For example, here we set the Trainer argument [max_steps = 100] for the full training routine and [max_steps = 10] for +testing: .. code-block:: bash - # config1.yaml + # config.yaml fit: trainer: max_steps: 100 @@ -73,21 +74,10 @@ now you can toggle this behavior by subcommand: .. code-block:: bash # full routine with max_steps = 100 - $ python main.py --config config1.yaml fit + $ python main.py --config config.yaml fit # test only with max_epochs = 10 - $ python main.py --config config1.yaml test - ----- - -********************* -Use groups of options -********************* -Groups of options can also be given as independent config files: - -.. code-block:: bash - - $ python trainer.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...] + $ python main.py --config config.yaml test ---- @@ -98,7 +88,7 @@ For certain enterprise workloads, Lightning CLI supports running from hosted con .. code-block:: bash - $ python trainer.py [subcommand] --config s3://bucket/config.yaml + $ python main.py [subcommand] --config s3://bucket/config.yaml For more options, refer to :doc:`Remote filesystems <../common/remote_fs>`. @@ -107,22 +97,23 @@ For more options, refer to :doc:`Remote filesystems <../common/remote_fs>`. ************************************** Use a config via environment variables ************************************** -For certain CI/CD systems, it's useful to pass in config files as environment variables: +For certain CI/CD systems, it's useful to pass in raw yaml config as environment variables: .. code-block:: bash - $ python trainer.py fit --trainer "$TRAINER_CONFIG" --model "$MODEL_CONFIG" [...] 
+ $ python main.py fit --trainer "$TRAINER_CONFIG" --model "$MODEL_CONFIG" [...] ---- *************************************** Run from environment variables directly *************************************** -The Lightning CLI can convert every possible CLI flag into an environment variable. To enable this, set the *env_parse* argument: +The Lightning CLI can convert every possible CLI flag into an environment variable. To enable this, set the *env_parse* +argument: .. code:: python - LightningCLI(env_parse=True) + cli = LightningCLI(DemoModel, env_parse=True) now use the ``--help`` CLI flag with any subcommand: @@ -130,22 +121,20 @@ now use the ``--help`` CLI flag with any subcommand: $ python main.py fit --help -which will show you ALL possible environment variables you can now set: +which will show you ALL possible environment variables that can be set: .. code:: bash usage: main.py [options] fit [-h] [-c CONFIG] - [--trainer.max_epochs MAX_EPOCHS] [--trainer.min_epochs MIN_EPOCHS] - [--trainer.max_steps MAX_STEPS] [--trainer.min_steps MIN_STEPS] ... - [--ckpt_path CKPT_PATH] optional arguments: ... - --model CONFIG Path to a configuration file. - --model.out_dim OUT_DIM + ARG: --model.out_dim OUT_DIM + ENV: PL_FIT__MODEL__OUT_DIM (type: int, default: 10) - --model.learning_rate LEARNING_RATE + ARG: --model.learning_rate LEARNING_RATE + ENV: PL_FIT__MODEL__LEARNING_RATE (type: float, default: 0.02) now you can customize the behavior via environment variables: @@ -153,8 +142,8 @@ now you can customize the behavior via environment variables: .. code:: bash # set the options via env vars - $ export LEARNING_RATE=0.01 - $ export OUT_DIM=5 + $ export PL_FIT__MODEL__LEARNING_RATE=0.01 + $ export PL_FIT__MODEL__OUT_DIM=5 $ python main.py fit @@ -175,16 +164,12 @@ or if you want defaults per subcommand: cli = LightningCLI(MyModel, MyDataModule, parser_kwargs={"fit": {"default_config_files": ["my_fit_defaults.yaml"]}}) -For more configuration options, refer to the `ArgumentParser API -`_ documentation. - ---- ***************************** Enable variable interpolation ***************************** -In certain cases where multiple configs need to share variables, consider using variable interpolation. Variable interpolation -allows you to add variables to your yaml configs like so: +In certain cases where multiple settings need to share a value, consider using variable interpolation. For instance: .. code-block:: yaml @@ -211,3 +196,14 @@ After this, the CLI will automatically perform interpolation in yaml files: .. code:: bash python main.py --model.encoder_layers=12 + +For more details about the interpolation support and its limitations, have a look at the `jsonargparse +`__ and the `omegaconf +`__ documentations. + +.. note:: + + There are many use cases in which variable interpolation is not the correct approach. When a parameter **must + always** be derived from other settings, it shouldn't be up to the CLI user to do this in a config file. For + example, if the data and model both require ``batch_size`` and must be the same value, then + :ref:`cli_link_arguments` should be used instead of interpolation. 
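Variable interpolation is not enabled by default. A minimal sketch of how it can be switched on, assuming the
``omegaconf`` parser mode supported by ``jsonargparse`` (which requires ``omegaconf`` to be installed); ``MyModel`` is
a placeholder model like the one used in the examples above:

.. code:: python

    from typing import List

    from pytorch_lightning import LightningModule
    from pytorch_lightning.cli import LightningCLI


    class MyModel(LightningModule):
        # placeholder model; only the typed __init__ parameters matter for the CLI
        def __init__(self, encoder_layers: int = 12, decoder_layers: List[int] = [2, 4]):
            super().__init__()


    # parser_mode="omegaconf" makes the underlying jsonargparse parser resolve
    # ${...} references inside yaml config files
    cli = LightningCLI(MyModel, parser_kwargs={"parser_mode": "omegaconf"})

With this, a config value such as ``decoder_layers: [${model.encoder_layers}, 4]`` is resolved from
``model.encoder_layers`` at parse time, as described in the interpolation section above.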
diff --git a/docs/source-pytorch/cli/lightning_cli_advanced_3.rst b/docs/source-pytorch/cli/lightning_cli_advanced_3.rst index 7f5ed143869e3..38fa0662a012e 100644 --- a/docs/source-pytorch/cli/lightning_cli_advanced_3.rst +++ b/docs/source-pytorch/cli/lightning_cli_advanced_3.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -45,12 +45,16 @@ mock_argv.stop() +################################################# +Configure hyperparameters from the CLI (Advanced) +################################################# + Instantiation only mode ^^^^^^^^^^^^^^^^^^^^^^^ -The CLI is designed to start fitting with minimal code changes. On class instantiation, the CLI will automatically -call the trainer function associated to the subcommand provided so you don't have to do it. -To avoid this, you can set the following argument: +The CLI is designed to start fitting with minimal code changes. On class instantiation, the CLI will automatically call +the trainer function associated with the subcommand provided, so you don't have to do it. To avoid this, you can set the +following argument: .. testcode:: @@ -58,20 +62,19 @@ To avoid this, you can set the following argument: # you'll have to call fit yourself: cli.trainer.fit(cli.model) -In this mode, there are subcommands added to the parser. -This can be useful to implement custom logic without having to subclass the CLI, but still using the CLI's instantiation -and argument parsing capabilities. +In this mode, subcommands are **not** added to the parser. This can be useful to implement custom logic without having +to subclass the CLI, but still, use the CLI's instantiation and argument parsing capabilities. Trainer Callbacks and arguments with class type ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -A very important argument of the :class:`~pytorch_lightning.trainer.trainer.Trainer` class is the :code:`callbacks`. In -contrast to other more simple arguments which just require numbers or strings, :code:`callbacks` expects a list of -instances of subclasses of :class:`~pytorch_lightning.callbacks.Callback`. To specify this kind of argument in a config -file, each callback must be given as a dictionary including a :code:`class_path` entry with an import path of the class, -and optionally an :code:`init_args` entry with arguments required to instantiate it. Therefore, a simple configuration -file example that defines a couple of callbacks is the following: +A very important argument of the :class:`~pytorch_lightning.trainer.trainer.Trainer` class is the ``callbacks``. In +contrast to simpler arguments that take numbers or strings, ``callbacks`` expects a list of instances of subclasses of +:class:`~pytorch_lightning.callbacks.Callback`. To specify this kind of argument in a config file, each callback must be +given as a dictionary, including a ``class_path`` entry with an import path of the class and optionally an ``init_args`` +entry with arguments to use to instantiate. Therefore, a simple configuration file that defines two callbacks is the +following: .. 
code-block:: yaml @@ -87,9 +90,9 @@ file example that defines a couple of callbacks is the following: Similar to the callbacks, any parameter in :class:`~pytorch_lightning.trainer.trainer.Trainer` and user extended :class:`~pytorch_lightning.core.module.LightningModule` and :class:`~pytorch_lightning.core.datamodule.LightningDataModule` classes that have as type hint a class, can be -configured the same way using :code:`class_path` and :code:`init_args`. If the package that defines a subclass is -imported before the :class:`~pytorch_lightning.cli.LightningCLI` class is run, the name can be used instead of -the full import path. +configured the same way using ``class_path`` and ``init_args``. If the package that defines a subclass is imported +before the :class:`~pytorch_lightning.cli.LightningCLI` class is run, the name can be used instead of the full import +path. From command line the syntax is the following: @@ -117,16 +120,16 @@ callback appended. Here is an example: .. note:: - Serialized config files (e.g. ``--print_config`` or :class:`~pytorch_lightning.cli.SaveConfigCallback`) - always have the full ``class_path``'s, even when class name shorthand notation is used in command line or in input - config files. + Serialized config files (e.g. ``--print_config`` or :class:`~pytorch_lightning.cli.SaveConfigCallback`) always have + the full ``class_path``, even when class name shorthand notation is used in the command line or in input config + files. Multiple models and/or datasets ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Additionally, the tool can be configured such that a model and/or a datamodule is -specified by an import path and init arguments. For example, with a tool implemented as: +A CLI can be written such that a model and/or a datamodule is specified by an import path and init arguments. For +example, with a tool implemented as: .. code-block:: python @@ -154,18 +157,17 @@ A possible config file could be as follows: patience: 5 ... -Only model classes that are a subclass of :code:`MyModelBaseClass` would be allowed, and similarly only subclasses of -:code:`MyDataModuleBaseClass`. If as base classes :class:`~pytorch_lightning.core.module.LightningModule` and -:class:`~pytorch_lightning.core.datamodule.LightningDataModule` are given, then the tool would allow any lightning -module and data module. +Only model classes that are a subclass of ``MyModelBaseClass`` would be allowed, and similarly, only subclasses of +``MyDataModuleBaseClass``. If as base classes :class:`~pytorch_lightning.core.module.LightningModule` and +:class:`~pytorch_lightning.core.datamodule.LightningDataModule` is given, then the CLI would allow any lightning module +and data module. .. tip:: - Note that with the subclass modes the :code:`--help` option does not show information for a specific subclass. To - get help for a subclass the options :code:`--model.help` and :code:`--data.help` can be used, followed by the - desired class path. Similarly :code:`--print_config` does not include the settings for a particular subclass. To - include them the class path should be given before the :code:`--print_config` option. Examples for both help and - print config are: + Note that with the subclass modes, the ``--help`` option does not show information for a specific subclass. To get + help for a subclass, the options ``--model.help`` and ``--data.help`` can be used, followed by the desired class + path. Similarly, ``--print_config`` does not include the settings for a particular subclass. 
To include them, the + class path should be given before the ``--print_config`` option. Examples for both help and print config are: .. code-block:: bash @@ -176,10 +178,13 @@ module and data module. Models with multiple submodules ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Many use cases require to have several modules each with its own configurable options. One possible way to handle this -with LightningCLI is to implement a single module having as init parameters each of the submodules. Since the init -parameters have as type a class, then in the configuration these would be specified with :code:`class_path` and -:code:`init_args` entries. For instance a model could be implemented as: +Many use cases require to have several modules, each with its own configurable options. One possible way to handle this +with ``LightningCLI`` is to implement a single module having as init parameters each of the submodules. This is known as +`dependency injection `__ which is a good approach to improve +decoupling in your code base. + +Since the init parameters of the model have as a type hint a class, in the configuration, these would be specified with +``class_path`` and ``init_args`` entries. For instance, a model could be implemented as: .. testcode:: @@ -195,7 +200,7 @@ parameters have as type a class, then in the configuration these would be specif self.encoder = encoder self.decoder = decoder -If the CLI is implemented as :code:`LightningCLI(MyMainModel)` the configuration would be as follows: +If the CLI is implemented as ``LightningCLI(MyMainModel)`` the configuration would be as follows: .. code-block:: yaml @@ -209,69 +214,15 @@ If the CLI is implemented as :code:`LightningCLI(MyMainModel)` the configuration init_args: ... -It is also possible to combine :code:`subclass_mode_model=True` and submodules, thereby having two levels of -:code:`class_path`. - - -Class type defaults -^^^^^^^^^^^^^^^^^^^ - -The support for classes as type hints allows to try many possibilities with the same CLI. This is a useful feature, but -it can make it tempting to use an instance of a class as a default. For example: - -.. code-block:: - - class MyMainModel(LightningModule): - def __init__( - self, - backbone: torch.nn.Module = MyModel(encoder_layers=24), # BAD PRACTICE! - ): - super().__init__() - self.backbone = backbone - -Normally classes are mutable as it is in this case. The instance of :code:`MyModel` would be created the moment that the -module that defines :code:`MyMainModel` is first imported. This means that the default of :code:`backbone` will be -initialized before the CLI class runs :code:`seed_everything` making it non-reproducible. Furthermore, if -:code:`MyMainModel` is used more than once in the same Python process and the :code:`backbone` parameter is not -overridden, the same instance would be used in multiple places which very likely is not what the developer intended. -Having an instance as default also makes it impossible to generate the complete config file since for arbitrary classes -it is not known which arguments were used to instantiate it. - -A good solution to these problems is to not have a default or set the default to a special value (e.g. a -string) which would be checked in the init and instantiated accordingly. If a class parameter has no default and the CLI -is subclassed then a default can be set as follows: - -.. 
testcode:: - - default_backbone = { - "class_path": "import.path.of.MyModel", - "init_args": { - "encoder_layers": 24, - }, - } - - - class MyLightningCLI(LightningCLI): - def add_arguments_to_parser(self, parser): - parser.set_defaults({"model.backbone": default_backbone}) +It is also possible to combine ``subclass_mode_model=True`` and submodules, thereby having two levels of ``class_path``. -A more compact version that avoids writing a dictionary would be: - -.. testcode:: - - from jsonargparse import lazy_instance - - - class MyLightningCLI(LightningCLI): - def add_arguments_to_parser(self, parser): - parser.set_defaults({"model.backbone": lazy_instance(MyModel, encoder_layers=24)}) Optimizers ^^^^^^^^^^ -If you will not be changing the class, you can manually add the arguments for specific optimizers and/or -learning rate schedulers by subclassing the CLI. This has the advantage of providing the proper help message for those -classes. The following code snippet shows how to implement it: +In some cases, fixing the optimizer and/or learning scheduler might be desired instead of allowing multiple. For this, +you can manually add the arguments for specific classes by subclassing the CLI. The following code snippet shows how to +implement it: .. testcode:: @@ -280,9 +231,8 @@ classes. The following code snippet shows how to implement it: parser.add_optimizer_args(torch.optim.Adam) parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR) -With this, in the config the :code:`optimizer` and :code:`lr_scheduler` groups would accept all of the options for the -given classes, in this example :code:`Adam` and :code:`ExponentialLR`. -Therefore, the config file would be structured like: +With this, in the config, the ``optimizer`` and ``lr_scheduler`` groups would accept all of the options for the given +classes, in this example, ``Adam`` and ``ExponentialLR``. Therefore, the config file would be structured like: .. code-block:: yaml @@ -295,14 +245,14 @@ Therefore, the config file would be structured like: trainer: ... -Where the arguments can be passed directly through command line without specifying the class. For example: +where the arguments can be passed directly through the command line without specifying the class. For example: .. code-block:: bash $ python trainer.py fit --optimizer.lr=0.01 --lr_scheduler.gamma=0.2 -The automatic implementation of :code:`configure_optimizers` can be disabled by linking the configuration group. An -example can be when one wants to add support for multiple optimizers: +The automatic implementation of ``configure_optimizers`` can be disabled by linking the configuration group. An example +can be when someone wants to add support for multiple optimizers: .. code-block:: python @@ -329,11 +279,10 @@ example can be when one wants to add support for multiple optimizers: cli = MyLightningCLI(MyModel) -The value given to :code:`optimizer*_init` will always be a dictionary including :code:`class_path` and -:code:`init_args` entries. The function :func:`~pytorch_lightning.cli.instantiate_class` -takes care of importing the class defined in :code:`class_path` and instantiating it using some positional arguments, -in this case :code:`self.parameters()`, and the :code:`init_args`. -Any number of optimizers and learning rate schedulers can be added when using :code:`link_to`. +The value given to ``optimizer*_init`` will always be a dictionary including ``class_path`` and ``init_args`` entries. 
+The function :func:`~pytorch_lightning.cli.instantiate_class` takes care of importing the class defined in +``class_path`` and instantiating it using some positional arguments, in this case ``self.parameters()``, and the +``init_args``. Any number of optimizers and learning rate schedulers can be added when using ``link_to``. With shorthand notation: @@ -378,7 +327,7 @@ the main function as follows: cli_main() Then it is possible to import the ``cli_main`` function to run it. Executing in a shell ``my_cli.py ---trainer.max_epochs=100", "--model.encoder_layers=24`` would be equivalent to: +--trainer.max_epochs=100 --model.encoder_layers=24`` would be equivalent to: .. code:: python diff --git a/docs/source-pytorch/cli/lightning_cli_expert.rst b/docs/source-pytorch/cli/lightning_cli_expert.rst index 60454f5e9bd82..94c28242d380c 100644 --- a/docs/source-pytorch/cli/lightning_cli_expert.rst +++ b/docs/source-pytorch/cli/lightning_cli_expert.rst @@ -6,7 +6,7 @@ import torch from unittest import mock from typing import List - import pytorch_lightning as pl + import pytorch_lightning.cli as pl_cli from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback @@ -15,7 +15,7 @@ pass - class LightningCLI(pl.cli.LightningCLI): + class LightningCLI(pl_cli.LightningCLI): def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) @@ -51,9 +51,9 @@ mock_argv.stop() -####################################### -Eliminate config boilerplate (Advanced) -####################################### +############################################### +Configure hyperparameters from the CLI (Expert) +############################################### **Audience:** Users who already understand the LightningCLI and want to customize it. ---- @@ -62,26 +62,24 @@ Eliminate config boilerplate (Advanced) Customize the LightningCLI ************************** -The init parameters of the :class:`~pytorch_lightning.cli.LightningCLI` class can be used to customize some -things, namely: the description of the tool, enabling parsing of environment variables and additional arguments to -instantiate the trainer and configuration parser. - -Nevertheless the init arguments are not enough for many use cases. For this reason the class is designed so that can be -extended to customize different parts of the command line tool. The argument parser class used by -:class:`~pytorch_lightning.cli.LightningCLI` is -:class:`~pytorch_lightning.cli.LightningArgumentParser` which is an extension of python's argparse, thus -adding arguments can be done using the :func:`add_argument` method. In contrast to argparse it has additional methods to -add arguments, for example :func:`add_class_arguments` adds all arguments from the init of a class, though requiring -parameters to have type hints. For more details about this please refer to the `respective documentation +The init parameters of the :class:`~pytorch_lightning.cli.LightningCLI` class can be used to customize some things, +e.g., the description of the tool, enabling parsing of environment variables, and additional arguments to instantiate +the trainer and configuration parser. + +Nevertheless, the init arguments are not enough for many use cases. For this reason, the class is designed so that it +can be extended to customize different parts of the command line tool. 
The argument parser class used by +:class:`~pytorch_lightning.cli.LightningCLI` is :class:`~pytorch_lightning.cli.LightningArgumentParser`, which is an +extension of python's argparse, thus adding arguments can be done using the :func:`add_argument` method. In contrast to +argparse, it has additional methods to add arguments. For example :func:`add_class_arguments` add all arguments from the +init of a class. For more details, see the `respective documentation `_. The :class:`~pytorch_lightning.cli.LightningCLI` class has the -:meth:`~pytorch_lightning.cli.LightningCLI.add_arguments_to_parser` method which can be implemented to include -more arguments. After parsing, the configuration is stored in the :code:`config` attribute of the class instance. The -:class:`~pytorch_lightning.cli.LightningCLI` class also has two methods that can be used to run code before -and after the trainer runs: :code:`before_` and :code:`after_`. -A realistic example for these would be to send an email before and after the execution. -The code for the :code:`fit` subcommand would be something like: +:meth:`~pytorch_lightning.cli.LightningCLI.add_arguments_to_parser` method can be implemented to include more arguments. +After parsing, the configuration is stored in the ``config`` attribute of the class instance. The +:class:`~pytorch_lightning.cli.LightningCLI` class also has two methods that can be used to run code before and after +the trainer runs: ``before_`` and ``after_``. A realistic example of this would be to send an +email before and after the execution. The code for the ``fit`` subcommand would be something like this: .. testcode:: @@ -98,25 +96,25 @@ The code for the :code:`fit` subcommand would be something like: cli = MyLightningCLI(MyModel) -Note that the config object :code:`self.config` is a dictionary whose keys are global options or groups of options. It -has the same structure as the yaml format described previously. This means for instance that the parameters used for -instantiating the trainer class can be found in :code:`self.config['fit']['trainer']`. +Note that the config object ``self.config`` is a namespace whose keys are global options or groups of options. It has +the same structure as the YAML format described previously. This means that the parameters used for instantiating the +trainer class can be found in ``self.config['fit']['trainer']``. .. tip:: - Have a look at the :class:`~pytorch_lightning.cli.LightningCLI` class API reference to learn about other - methods that can be extended to customize a CLI. + Have a look at the :class:`~pytorch_lightning.cli.LightningCLI` class API reference to learn about other methods + that can be extended to customize a CLI. ---- ************************** Configure forced callbacks ************************** -As explained previously, any Lightning callback can be added by passing it through command line or -including it in the config via :code:`class_path` and :code:`init_args` entries. +As explained previously, any Lightning callback can be added by passing it through the command line or including it in +the config via ``class_path`` and ``init_args`` entries. -However, certain callbacks MUST be coupled with a model so they are always present and configurable. -This can be implemented as follows: +However, certain callbacks **must** be coupled with a model so they are always present and configurable. This can be +implemented as follows: .. 
testcode:: @@ -131,7 +129,7 @@ This can be implemented as follows: cli = MyLightningCLI(MyModel) -To change the configuration of the :code:`EarlyStopping` in the config it would be: +To change the parameters for ``EarlyStopping`` in the config it would be: .. code-block:: yaml @@ -144,11 +142,11 @@ To change the configuration of the :code:`EarlyStopping` in the config it would .. note:: - The example above overrides a default in :code:`add_arguments_to_parser`. This is included to show that defaults can - be changed if needed. However, note that overriding of defaults in the source code is not intended to be used to - store the best hyperparameters for a task after experimentation. To ease reproducibility the source code should be - stable. It is better practice to store the best hyperparameters for a task in a configuration file independent from - the source code. + The example above overrides a default in ``add_arguments_to_parser``. This is included to show that defaults can be + changed if needed. However, note that overriding defaults in the source code is not intended to be used to store the + best hyperparameters for a task after experimentation. To guarantee reproducibility, the source code should be + stable. It is better to practice storing the best hyperparameters for a task in a configuration file independent + from the source code. ---- @@ -157,7 +155,7 @@ Class type defaults ******************* The support for classes as type hints allows to try many possibilities with the same CLI. This is a useful feature, but -it can make it tempting to use an instance of a class as a default. For example: +it is tempting to use an instance of a class as a default. For example: .. testcode:: @@ -169,17 +167,17 @@ it can make it tempting to use an instance of a class as a default. For example: super().__init__() self.backbone = backbone -Normally classes are mutable as it is in this case. The instance of :code:`MyModel` would be created the moment that the -module that defines :code:`MyMainModel` is first imported. This means that the default of :code:`backbone` will be -initialized before the CLI class runs :code:`seed_everything` making it non-reproducible. Furthermore, if -:code:`MyMainModel` is used more than once in the same Python process and the :code:`backbone` parameter is not -overridden, the same instance would be used in multiple places which very likely is not what the developer intended. -Having an instance as default also makes it impossible to generate the complete config file since for arbitrary classes -it is not known which arguments were used to instantiate it. +Normally classes are mutable, as in this case. The instance of ``MyModel`` would be created the moment that the module +that defines ``MyMainModel`` is first imported. This means that the default of ``backbone`` will be initialized before +the CLI class runs ``seed_everything``, making it non-reproducible. Furthermore, if ``MyMainModel`` is used more than +once in the same Python process and the ``backbone`` parameter is not overridden, the same instance would be used in +multiple places. Most likely, this is not what the developer intended. Having an instance as default also makes it +impossible to generate the complete config file since it is not known which arguments were used to instantiate it for +arbitrary classes. -A good solution to these problems is to not have a default or set the default to a special value (e.g. a -string) which would be checked in the init and instantiated accordingly. 
If a class parameter has no default and the CLI -is subclassed then a default can be set as follows: +An excellent solution to these problems is not to have a default or set the default to a unique value (e.g., a string). +Then check this value and instantiate it in the ``__init__`` body. If a class parameter has no default and the CLI is +subclassed, then a default can be set as follows: .. testcode:: @@ -208,14 +206,16 @@ A more compact version that avoids writing a dictionary would be: ---- -************************ -Connect two config files -************************ -Another case in which it might be desired to extend :class:`~pytorch_lightning.cli.LightningCLI` is that the -model and data module depend on a common parameter. For example in some cases both classes require to know the -:code:`batch_size`. It is a burden and error prone giving the same value twice in a config file. To avoid this the -parser can be configured so that a value is only given once and then propagated accordingly. With a tool implemented -like shown below, the :code:`batch_size` only has to be provided in the :code:`data` section of the config. +.. _cli_link_arguments: + +**************** +Argument linking +**************** +Another case in which it might be desired to extend :class:`~pytorch_lightning.cli.LightningCLI` is that the model and +data module depends on a common parameter. For example, in some cases, both classes require to know the ``batch_size``. +It is a burden and error-prone to give the same value twice in a config file. To avoid this, the parser can be +configured so that a value is only given once and then propagated accordingly. With a tool implemented like the one +shown below, the ``batch_size`` only has to be provided in the ``data`` section of the config. .. testcode:: @@ -236,11 +236,11 @@ The linking of arguments is observed in the help of the tool, which for this exa Number of samples in a batch (type: int, default: 8) Linked arguments: - model.batch_size <-- data.batch_size + data.batch_size --> model.batch_size Number of samples in a batch (type: int) Sometimes a parameter value is only available after class instantiation. An example could be that your model requires -the number of classes to instantiate its fully connected layer (for a classification task) but the value is not +the number of classes to instantiate its fully connected layer (for a classification task). But the value is not available until the data module has been instantiated. The code below illustrates how to address this. .. testcode:: @@ -254,13 +254,14 @@ available until the data module has been instantiated. The code below illustrate Instantiation links are used to automatically determine the order of instantiation, in this case data first. +.. note:: + + The linking of arguments is intended for things that are meant to be non-configurable. This improves the CLI user + experience since it avoids the need to provide more parameters. A related concept is a variable interpolation that + keeps things configurable. + .. tip:: The linking of arguments can be used for more complex cases. For example to derive a value via a function that takes multiple settings as input. For more details have a look at the API of `link_arguments `_. - - -The linking of arguments is intended for things that are meant to be non-configurable. This improves the CLI user -experience since it avoids the need for providing more parameters. A related concept is -variable interpolation which in contrast keeps things being configurable. 
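The linking described above is done by subclassing the CLI and overriding ``add_arguments_to_parser``. A minimal
sketch, assuming a ``MyDataModule`` that takes ``batch_size`` as an ``__init__`` parameter and exposes a
``num_classes`` attribute that is only known after instantiation; ``MyModel`` and ``MyDataModule`` are placeholders
for your own classes:

.. code:: python

    from pytorch_lightning.cli import LightningCLI


    class MyLightningCLI(LightningCLI):
        def add_arguments_to_parser(self, parser):
            # batch_size is given once under `data` and propagated to the model
            parser.link_arguments("data.batch_size", "model.batch_size")
            # num_classes only exists after the datamodule is instantiated, so the
            # link is applied on instantiation and the datamodule is created first
            parser.link_arguments("data.num_classes", "model.num_classes", apply_on="instantiate")


    cli = MyLightningCLI(MyModel, MyDataModule)

With this, ``batch_size`` only has to be provided in the ``data`` section of the config, and ``model.num_classes``
is filled in automatically once the datamodule exists.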
diff --git a/docs/source-pytorch/cli/lightning_cli_faq.rst b/docs/source-pytorch/cli/lightning_cli_faq.rst index 672e27979f7c9..8830a58cc7664 100644 --- a/docs/source-pytorch/cli/lightning_cli_faq.rst +++ b/docs/source-pytorch/cli/lightning_cli_faq.rst @@ -1,96 +1,40 @@ :orphan: -.. testsetup:: * - :skipif: not _JSONARGPARSE_AVAILABLE +########################################### +Frequently asked questions for LightningCLI +########################################### - import torch - from unittest import mock - from typing import List - import pytorch_lightning as pl - from pytorch_lightning import LightningModule, LightningDataModule, Trainer, Callback +************************ +What does CLI stand for? +************************ +CLI is short for command line interface. This means it is a tool intended to be run from a terminal, similar to commands +like ``git``. +---- - class NoFitTrainer(Trainer): - def fit(self, *_, **__): - pass - - - class LightningCLI(pl.cli.LightningCLI): - def __init__(self, *args, trainer_class=NoFitTrainer, run=False, **kwargs): - super().__init__(*args, trainer_class=trainer_class, run=run, **kwargs) - - - class MyModel(LightningModule): - def __init__( - self, - encoder_layers: int = 12, - decoder_layers: List[int] = [2, 4], - batch_size: int = 8, - ): - pass - - - mock_argv = mock.patch("sys.argv", ["any.py"]) - mock_argv.start() - -.. testcleanup:: * - - mock_argv.stop() - -##################################### -Eliminate config boilerplate (expert) -##################################### - -*************** -Troubleshooting -*************** -The standard behavior for CLIs, when they fail, is to terminate the process with a non-zero exit code and a short message -to hint the user about the cause. This is problematic while developing the CLI since there is no information to track -down the root of the problem. A simple change in the instantiation of the ``LightningCLI`` can be used such that when -there is a failure an exception is raised and the full stack trace printed. - -.. testcode:: - - cli = LightningCLI(MyModel, parser_kwargs={"error_handler": None}) +.. _what-is-a-yaml-config-file: -.. note:: +*************************** +What is a yaml config file? +*************************** +A YAML is a standard for configuration files used to describe parameters for sections of a program. It is a common tool +in engineering and has recently started to gain popularity in machine learning. An example of a YAML file is the +following: - When asking about problems and reporting issues please set the ``error_handler`` to ``None`` and include the stack - trace in your description. With this, it is more likely for people to help out identifying the cause without needing - to create a reproducible script. +.. code:: yaml ----- + # file.yaml + car: + max_speed:100 + max_passengers:2 + plane: + fuel_capacity: 50 + class_3: + option_1: 'x' + option_2: 'y' -************************************* -Reproducibility with the LightningCLI -************************************* -The topic of reproducibility is complex and it is impossible to guarantee reproducibility by just providing a class that -people can use in unexpected ways. Nevertheless, the :class:`~pytorch_lightning.cli.LightningCLI` tries to -give a framework and recommendations to make reproducibility simpler. - -When an experiment is run, it is good practice to use a stable version of the source code, either being a released -package or at least a commit of some version controlled repository. 
For each run of a CLI the config file is -automatically saved including all settings. This is useful to figure out what was done for a particular run without -requiring to look at the source code. If by mistake the exact version of the source code is lost or some defaults -changed, having the full config means that most of the information is preserved. - -The class is targeted at implementing CLIs because running a command from a shell provides a separation with the Python -source code. Ideally the CLI would be placed in your path as part of the installation of a stable package, instead of -running from a clone of a repository that could have uncommitted local modifications. Creating installable packages that -include CLIs is out of the scope of this document. This is mentioned only as a teaser for people who would strive for -the best practices possible. - - -For every CLI implemented, users are encouraged to learn how to run it by reading the documentation printed with the -:code:`--help` option and use the :code:`--print_config` option to guide the writing of config files. A few more details -that might not be clear by only reading the help are the following. - -:class:`~pytorch_lightning.cli.LightningCLI` is based on argparse and as such follows the same arguments style -as many POSIX command line tools. Long options are prefixed with two dashes and its corresponding values should be -provided with an empty space or an equal sign, as :code:`--option value` or :code:`--option=value`. Command line options -are parsed from left to right, therefore if a setting appears multiple times the value most to the right will override -the previous ones. If a class has an init parameter that is required (i.e. no default value), it is given as -:code:`--option` which makes it explicit and more readable instead of relying on positional arguments. +If you are unfamiliar with YAML, the short introduction at `realpython.com#yaml-syntax +`__ might be a good starting point. ---- @@ -130,7 +74,34 @@ use a subcommand as follows: ---- -**************** -What is the CLI? -**************** -CLI is short for commandline interface. Use your terminal to enter these commands. +******************************************************* +What is the relation between LightningCLI and argparse? +******************************************************* + +:class:`~pytorch_lightning.cli.LightningCLI` makes use of `jsonargparse `__ +which is an extension of `argparse `__. Due to this +:class:`~pytorch_lightning.cli.LightningCLI` follows the same arguments style as many POSIX command line tools. Long +options are prefixed with two dashes and its corresponding values are separated by space or an equal sign, as ``--option +value`` or ``--option=value``. Command line options are parsed from left to right, therefore if a setting appears +multiple times the value most to the right will override the previous ones. + +---- + +**************************** +How do I troubleshoot a CLI? +**************************** +The standard behavior for CLIs, when they fail, is to terminate the process with a non-zero exit code and a short +message to hint the user about the cause. This is problematic while developing the CLI since there is no information to +track down the root of the problem. To troubleshoot set the environment variable ``JSONARGPARSE_DEBUG`` to any value +before running the CLI: + +.. code:: bash + + export JSONARGPARSE_DEBUG=true + python main.py fit + +.. 
note:: + + When asking about problems and reporting issues, please set the ``JSONARGPARSE_DEBUG`` and include the stack trace + in your description. With this, users are more likely to help identify the cause without needing to create a + reproducible script. diff --git a/docs/source-pytorch/cli/lightning_cli_intermediate.rst b/docs/source-pytorch/cli/lightning_cli_intermediate.rst index db8b6cf4c77ec..d2586cbd84f9c 100644 --- a/docs/source-pytorch/cli/lightning_cli_intermediate.rst +++ b/docs/source-pytorch/cli/lightning_cli_intermediate.rst @@ -1,87 +1,43 @@ :orphan: -########################################### -Eliminate config boilerplate (Intermediate) -########################################### -**Audience:** Users who want advanced modularity via the commandline interface (CLI). +##################################################### +Configure hyperparameters from the CLI (Intermediate) +##################################################### +**Audience:** Users who want advanced modularity via a command line interface (CLI). -**Pre-reqs:** You must already understand how to use a commandline and :doc:`LightningDataModule <../data/datamodule>`. +**Pre-reqs:** You must already understand how to use the command line and :doc:`LightningDataModule <../data/datamodule>`. ---- -*************************** -What is config boilerplate? -*************************** -As Lightning projects grow in complexity it becomes desirable to enable full customizability from the commandline (CLI) so you can -change any hyperparameters without changing your code: +************************* +LightningCLI requirements +************************* -.. code:: bash - - # Mix and match anything - $ python main.py fit --model.learning_rate 0.02 - $ python main.py fit --model.learning_rate 0.01 --trainer.fast_dev_run True - -This is what the Lightning CLI enables. Without the Lightning CLI, you usually end up with a TON of boilerplate that looks like this: - -.. code:: python - - from argparse import ArgumentParser - - if __name__ == "__main__": - parser = ArgumentParser() - parser.add_argument("--learning_rate_1", default=0.02) - parser.add_argument("--learning_rate_2", default=0.03) - parser.add_argument("--model", default="cnn") - parser.add_argument("--command", default="fit") - parser.add_argument("--run_fast", default=True) - ... - # add 100 more of these - ... - - args = parser.parse_args() - - if args.model == "cnn": - model = ConvNet(learning_rate=args.learning_rate_1) - elif args.model == "transformer": - model = Transformer(learning_rate=args.learning_rate_2) - trainer = Trainer(fast_dev_run=args.run_fast) - ... - - if args.command == "fit": - trainer.fit() - elif args.command == "test": - ... - -This kind of boilerplate is unsustainable as projects grow in complexity. - ----- - -************************ -Enable the Lightning CLI -************************ -To enable the Lightning CLI install the extras: +The :class:`~pytorch_lightning.cli.LightningCLI` class is designed to significantly ease the implementation of CLIs. To +use this class, an additional Python requirement is necessary than the minimal installation of Lightning provides. To +enable, either install all extras: .. code:: bash - pip install pytorch-lightning[extra] + pip install "pytorch-lightning[extra]" -if the above fails, only install jsonargparse: +or if only interested in ``LightningCLI``, just install jsonargparse: .. 
code:: bash - pip install -U jsonargparse[signatures] + pip install "jsonargparse[signatures]" ---- -************************** -Connect a model to the CLI -************************** -The simplest way to control a model with the CLI is to wrap it in the LightningCLI object: +****************** +Implementing a CLI +****************** +Implementing a CLI is as simple as instantiating a :class:`~pytorch_lightning.cli.LightningCLI` object giving as +arguments classes for a ``LightningModule`` and optionally a ``LightningDataModule``: .. code:: python # main.py - import torch from pytorch_lightning.cli import LightningCLI # simple demo classes for your convenience @@ -103,7 +59,7 @@ Now your model can be managed via the CLI. To see the available commands type: $ python main.py --help -Which prints out: +which prints out: .. code:: bash @@ -130,7 +86,7 @@ Which prints out: tune Runs routines to tune hyperparameters before training. -the message tells us that we have a few available subcommands: +The message tells us that we have a few available subcommands: .. code:: bash @@ -151,16 +107,18 @@ which you can use depending on your use case: ************************** Train a model with the CLI ************************** -To run the full training routine (train, val, test), use the subcommand ``fit``: +To train a model, use the ``fit`` subcommand: .. code:: bash python main.py fit -View all available options with the ``--help`` command: +View all available options with the ``--help`` argument given after the subcommand: .. code:: bash + $ python main.py fit --help + usage: main.py [options] fit [-h] [-c CONFIG] [--seed_everything SEED_EVERYTHING] [--trainer CONFIG] ... @@ -183,10 +141,18 @@ With the Lightning CLI enabled, you can now change the parameters without touchi .. code:: bash # change the learning_rate - python main.py fit --model.out_dim 30 + python main.py fit --model.learning_rate 0.1 - # change the out dimensions also + # change the output dimensions also python main.py fit --model.out_dim 10 --model.learning_rate 0.1 # change trainer and data arguments too python main.py fit --model.out_dim 2 --model.learning_rate 0.1 --data.data_dir '~/' --trainer.logger False + +.. tip:: + + The options that become available in the CLI are the ``__init__`` parameters of the ``LightningModule`` and + ``LightningDataModule`` classes. Thus, to make hyperparameters configurable, just add them to your class's + ``__init__``. It is highly recommended that these parameters are described in the docstring so that the CLI shows + them in the help. Also, the parameters should have accurate type hints so that the CLI can fail early and give + understandable error messages when incorrect values are given. diff --git a/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst b/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst index 8e312b7233a6d..04a2795840f50 100644 --- a/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst +++ b/docs/source-pytorch/cli/lightning_cli_intermediate_2.rst @@ -1,20 +1,20 @@ :orphan: -########################################### -Eliminate config boilerplate (intermediate) -########################################### +##################################################### +Configure hyperparameters from the CLI (Intermediate) +##################################################### **Audience:** Users who have multiple models and datasets per project. **Pre-reqs:** You must have read :doc:`(Control it all from the CLI) `. 
 ----

-****************************************
-Why do I want to mix models and datasets
-****************************************
-Lightning projects usually begin with one model and one dataset. As the project grows in complexity and you introduce more models and more datasets, it becomes desirable
-to mix any model with any dataset directly from the commandline without changing your code.
-
+***************************
+Why mix models and datasets
+***************************
+Lightning projects usually begin with one model and one dataset. As the project grows in complexity and you introduce
+more models and more datasets, it becomes desirable to mix any model with any dataset directly from the command line
+without changing your code.

 .. code:: bash

@@ -22,7 +22,8 @@ to mix any model with any dataset directly from the commandline without changing
     $ python main.py fit --model=GAN --data=MNIST
     $ python main.py fit --model=Transformer --data=MNIST

-This is what the Lightning CLI enables. Otherwise, this kind of configuration requires a significant amount of boilerplate that often looks like this:
+``LightningCLI`` makes this very simple. Otherwise, this kind of configuration requires a significant amount of
+boilerplate that often looks like this:

 .. code:: python

@@ -43,6 +44,8 @@ This is what the Lightning CLI enables. Otherwise, this kind of configuration re
     # mix them!
     trainer.fit(model, datamodule)

+It is highly recommended that you avoid writing this kind of boilerplate and use ``LightningCLI`` instead.
+
 ----

 *************************
@@ -53,9 +56,8 @@ To support multiple models, when instantiating ``LightningCLI`` omit the ``model
 .. code:: python

     # main.py
-
-    from pytorch_lightning import demos
-    from pytorch_lightning.utilities import cli as pl_cli
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


     class Model1(DemoModel):
@@ -70,7 +72,7 @@ To support multiple models, when instantiating ``LightningCLI`` omit the ``model
             return super().configure_optimizers()


-    cli = pl_cli.LightningCLI(datamodule_class=BoringDataModule)
+    cli = LightningCLI(datamodule_class=BoringDataModule)

 Now you can choose between any model from the CLI:

@@ -82,19 +84,24 @@ Now you can choose between any model from the CLI:
     # use Model2
     python main.py fit --model Model2

+.. tip::
+
+    Instead of omitting the ``model_class`` parameter, you can give a base class and ``subclass_mode_model=True``. This
+    will make the CLI only accept models which are a subclass of the given base class.
+
 ----

-********************
-Multiple DataModules
-********************
+*****************************
+Multiple LightningDataModules
+*****************************
 To support multiple data modules, when instantiating ``LightningCLI`` omit the ``datamodule_class`` parameter:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import BoringDataModule, DemoModel


     class FakeDataset1(BoringDataModule):
@@ -109,7 +116,7 @@ To support multiple data modules, when instantiating ``LightningCLI`` omit the `
             return torch.utils.data.DataLoader(self.random_train)


-    cli = pl_cli.LightningCLI(DemoModel)
+    cli = LightningCLI(DemoModel)

 Now you can choose between any dataset at runtime:

@@ -121,19 +128,36 @@ Now you can choose between any dataset at runtime:
     # use Model2
     python main.py fit --data FakeDataset2

+.. tip::
+
+    Instead of omitting the ``datamodule_class`` parameter, you can give a base class and ``subclass_mode_data=True``.
+    This will make the CLI only accept data modules that are a subclass of the given base class.
+
 ----

-*****************
-Custom optimizers
-*****************
-Any subclass of ``torch.optim.Optimizer`` can be used as an optimizer:
+*******************
+Multiple optimizers
+*******************
+Standard optimizers from ``torch.optim`` work out of the box:
+
+.. code:: bash
+
+    python main.py fit --optimizer AdamW
+
+If the optimizer you want needs other arguments, add them via the CLI (no need to change your code)!
+
+.. code:: bash
+
+    python main.py fit --optimizer SGD --optimizer.lr=0.01
+
+Furthermore, any custom subclass of :class:`torch.optim.Optimizer` can be used as an optimizer:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import DemoModel, BoringDataModule


     class LitAdam(torch.optim.Adam):
@@ -148,7 +172,7 @@ Any subclass of ``torch.optim.Optimizer`` can be used as an optimizer:
             super().step(closure)


-    cli = pl_cli.LightningCLI(DemoModel, BoringDataModule)
+    cli = LightningCLI(DemoModel, BoringDataModule)

 Now you can choose between any optimizer at runtime:

@@ -160,32 +184,31 @@ Now you can choose between any optimizer at runtime:
     # use FancyAdam
     python main.py fit --optimizer FancyAdam

-Bonus: If you need only 1 optimizer, the Lightning CLI already works out of the box with any Optimizer from
-``torch.optim``:
+----
+
+*******************
+Multiple schedulers
+*******************
+Standard learning rate schedulers from ``torch.optim.lr_scheduler`` work out of the box:

 .. code:: bash

-    python main.py fit --optimizer AdamW
+    python main.py fit --lr_scheduler CosineAnnealingLR

-If the optimizer you want needs other arguments, add them via the CLI (no need to change your code)!
+If the scheduler you want needs other arguments, add them via the CLI (no need to change your code)!

 .. code:: bash

-    python main.py fit --optimizer SGD --optimizer.lr=0.01
-
-----
+    python main.py fit --lr_scheduler=ReduceLROnPlateau --lr_scheduler.monitor=epoch

-********************
-Custom LR schedulers
-********************
-Any subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as learning rate scheduler:
+Furthermore, any custom subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as a learning rate scheduler:

 .. code:: python

     # main.py
     import torch
-    from pytorch_lightning.utilities import cli as pl_cli
-    from pytorch_lightning import demos
+    from pytorch_lightning.cli import LightningCLI
+    from pytorch_lightning.demos.boring_classes import DemoModel, BoringDataModule


     class LitLRScheduler(torch.optim.lr_scheduler.CosineAnnealingLR):
@@ -194,7 +217,7 @@ Any subclass of ``torch.optim.lr_scheduler._LRScheduler`` can be used as learnin
             super().step()


-    cli = pl_cli.LightningCLI(DemoModel, BoringDataModule)
+    cli = LightningCLI(DemoModel, BoringDataModule)

 Now you can choose between any learning rate scheduler at runtime:

@@ -204,38 +227,22 @@

     python main.py fit --lr_scheduler LitLRScheduler

-Bonus: If you need only 1 LRScheduler, the Lightning CLI already works out of the box with any LRScheduler from
-``torch.optim``:
-
-.. code:: bash
-
-    python main.py fit --lr_scheduler CosineAnnealingLR
-    python main.py fit --lr_scheduler LinearLR
-    ...
-
-If the scheduler you want needs other arguments, add them via the CLI (no need to change your code)!
-
-.. code:: bash
-
-    python main.py fit --lr_scheduler=ReduceLROnPlateau --lr_scheduler.monitor=epoch
-
 ----

 ************************
 Classes from any package
 ************************
-In the previous sections the classes to select were defined in the same python file where the ``LightningCLI`` class is
-run. To select classes from any package by using only the class name, import the respective package:
+In the previous sections, custom classes to select were defined in the same Python file where the ``LightningCLI`` class
+is run. To select classes from any package by using only the class name, import the respective package:

 .. code:: python

-    import torch
-    from pytorch_lightning.utilities import cli as pl_cli
+    from pytorch_lightning.cli import LightningCLI
     import my_code.models  # noqa: F401
     import my_code.data_modules  # noqa: F401
     import my_code.optimizers  # noqa: F401


-    cli = pl_cli.LightningCLI()
+    cli = LightningCLI()

 Now use any of the classes:

@@ -243,9 +250,25 @@ Now use any of the classes:

     python main.py fit --model Model1 --data FakeDataset1 --optimizer LitAdam --lr_scheduler LitLRScheduler

-The ``# noqa: F401`` comment avoids a linter warning that the import is unused. It is also possible to select subclasses
-that have not been imported by giving the full import path:
+The ``# noqa: F401`` comment avoids a linter warning that the import is unused.
+
+It is also possible to select subclasses that have not been imported by giving the full import path:

 .. code:: bash

     python main.py fit --model my_code.models.Model1
+
+----
+
+*************************
+Help for specific classes
+*************************
+When multiple models or datasets are accepted, the main help of the CLI does not include their specific parameters. To
+show this specific help, there are additional help arguments that expect the class name or its import path. For example:
+
+.. code:: bash
+
+    python main.py fit --model.help Model1
+    python main.py fit --data.help FakeDataset2
+    python main.py fit --optimizer.help Adagrad
+    python main.py fit --lr_scheduler.help StepLR
diff --git a/docs/source-pytorch/common/hyperparameters.rst b/docs/source-pytorch/common/hyperparameters.rst
index 5813109fe2fab..79af37c74345e 100644
--- a/docs/source-pytorch/common/hyperparameters.rst
+++ b/docs/source-pytorch/common/hyperparameters.rst
@@ -1,11 +1,19 @@
+:orphan:
+
 .. testsetup:: *

     from argparse import ArgumentParser, Namespace

     sys.argv = ["foo"]

-Configure hyperparameters from the CLI
---------------------------------------
+Configure hyperparameters from the CLI (legacy)
+-----------------------------------------------
+
+.. warning::
+
+    This is the documentation for the use of Python's ``argparse`` to implement a CLI. This approach is no longer
+    recommended, and people are encouraged to use the new `LightningCLI `_ class instead.
+
 Lightning has utilities to interact seamlessly with the command line ``ArgumentParser``
 and plays well with the hyperparameter optimization framework of your choice.

@@ -105,84 +113,6 @@ Finally, make sure to start the training like so:

 ----------

-LightningModule hyperparameters
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Often times we train many versions of a model. You might share that model or come back to it a few months later
-at which point it is very useful to know how that model was trained (i.e.: what learning rate, neural network, etc...).
-
-Lightning has a standardized way of saving the information for you in checkpoints and YAML files. The goal here is to
-improve readability and reproducibility.
-
-save_hyperparameters
-""""""""""""""""""""
-
-Use :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` within your
-:class:`~pytorch_lightning.core.module.LightningModule`'s ``__init__`` method.
-It will enable Lightning to store all the provided arguments under the ``self.hparams`` attribute.
-These hyperparameters will also be stored within the model checkpoint, which simplifies model re-instantiation after training.
-
-.. code-block:: python
-
-    class LitMNIST(LightningModule):
-        def __init__(self, layer_1_dim=128, learning_rate=1e-2):
-            super().__init__()
-            # call this to save (layer_1_dim=128, learning_rate=1e-4) to the checkpoint
-            self.save_hyperparameters()
-
-            # equivalent
-            self.save_hyperparameters("layer_1_dim", "learning_rate")
-
-            # Now possible to access layer_1_dim from hparams
-            self.hparams.layer_1_dim
-
-
-In addition, loggers that support it will automatically log the contents of ``self.hparams``.
-
-Excluding hyperparameters
-"""""""""""""""""""""""""
-
-By default, every parameter of the ``__init__`` method will be considered a hyperparameter to the LightningModule.
-However, sometimes some parameters need to be excluded from saving, for example when they are not serializable.
-Those parameters should be provided back when reloading the LightningModule.
-In this case, exclude them explicitly:
-
-.. code-block:: python
-
-    class LitMNIST(LightningModule):
-        def __init__(self, loss_fx, generator_network, layer_1_dim=128):
-            super().__init__()
-            self.layer_1_dim = layer_1_dim
-            self.loss_fx = loss_fx
-
-            # call this to save only (layer_1_dim=128) to the checkpoint
-            self.save_hyperparameters("layer_1_dim")
-
-            # equivalent
-            self.save_hyperparameters(ignore=["loss_fx", "generator_network"])
-
-
-load_from_checkpoint
-""""""""""""""""""""
-
-LightningModules that have hyperparameters automatically saved with :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters`
-can conveniently be loaded and instantiated directly from a checkpoint with :meth:`~pytorch_lightning.core.module.LightningModule.load_from_checkpoint`:
-
-.. code-block:: python
-
-    # to load specify the other args
-    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
-
-
-If parameters were excluded, they need to be provided at the time of loading:
-
-.. code-block:: python
-
-    # the excluded parameters were `loss_fx` and `generator_network`
-    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
-
-
-----------
-
 Trainer args
 ^^^^^^^^^^^^
 To recap, add ALL possible trainer flags to the argparser and init the ``Trainer`` this way
diff --git a/docs/source-pytorch/common/lightning_module.rst b/docs/source-pytorch/common/lightning_module.rst
index cb6432448675e..cd64307ad953d 100644
--- a/docs/source-pytorch/common/lightning_module.rst
+++ b/docs/source-pytorch/common/lightning_module.rst
@@ -729,6 +729,87 @@ Check out :ref:`Inference in Production ` guide to learn a

 -----------

+********************
+Save Hyperparameters
+********************
+
+Often times we train many versions of a model. You might share that model or come back to it a few months later at which
+point it is very useful to know how that model was trained (i.e.: what learning rate, neural network, etc...).
+
+Lightning has a standardized way of saving the information for you in checkpoints and YAML files. The goal here is to
+improve readability and reproducibility.
+
+save_hyperparameters
+====================
+
+Use :meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` within your
+:class:`~pytorch_lightning.core.module.LightningModule`'s ``__init__`` method. It will enable Lightning to store all the
+provided arguments under the ``self.hparams`` attribute. These hyperparameters will also be stored within the model
+checkpoint, which simplifies model re-instantiation after training.
+
+.. code-block:: python
+
+    class LitMNIST(LightningModule):
+        def __init__(self, layer_1_dim=128, learning_rate=1e-2):
+            super().__init__()
+            # call this to save (layer_1_dim=128, learning_rate=1e-2) to the checkpoint
+            self.save_hyperparameters()
+
+            # equivalent
+            self.save_hyperparameters("layer_1_dim", "learning_rate")
+
+            # Now possible to access layer_1_dim from hparams
+            self.hparams.layer_1_dim
+
+
+In addition, loggers that support it will automatically log the contents of ``self.hparams``.
+
+Excluding hyperparameters
+=========================
+
+By default, every parameter of the ``__init__`` method will be considered a hyperparameter to the LightningModule.
+However, sometimes some parameters need to be excluded from saving, for example when they are not serializable. Those
+parameters should be provided back when reloading the LightningModule. In this case, exclude them explicitly:
+
+.. code-block:: python
+
+    class LitMNIST(LightningModule):
+        def __init__(self, loss_fx, generator_network, layer_1_dim=128):
+            super().__init__()
+            self.layer_1_dim = layer_1_dim
+            self.loss_fx = loss_fx
+
+            # call this to save only (layer_1_dim=128) to the checkpoint
+            self.save_hyperparameters("layer_1_dim")
+
+            # equivalent
+            self.save_hyperparameters(ignore=["loss_fx", "generator_network"])
+
+
+load_from_checkpoint
+====================
+
+LightningModules that have hyperparameters automatically saved with
+:meth:`~pytorch_lightning.core.module.LightningModule.save_hyperparameters` can conveniently be loaded and instantiated
+directly from a checkpoint with :meth:`~pytorch_lightning.core.module.LightningModule.load_from_checkpoint`:
+
+.. code-block:: python
+
+    # to load specify the other args
+    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
+
+
+If parameters were excluded, they need to be provided at the time of loading:
+
+.. code-block:: python
+
+    # the excluded parameters were `loss_fx` and `generator_network`
+    model = LitMNIST.load_from_checkpoint(PATH, loss_fx=torch.nn.SomeOtherLoss, generator_network=MyGenerator())
+
+
+-----------
+
+
 *************
 Child Modules
 *************
diff --git a/docs/source-pytorch/index.rst b/docs/source-pytorch/index.rst
index 1c867e1e345e9..77149ffbd6004 100644
--- a/docs/source-pytorch/index.rst
+++ b/docs/source-pytorch/index.rst
@@ -187,11 +187,10 @@ Current Lightning Users
     Avoid overfitting
     model/build_model.rst
-    common/hyperparameters
+    cli/lightning_cli
     common/progress_bar
     deploy/production
     advanced/training_tricks
-    cli/lightning_cli
     tuning/profiler
     Manage experiments
     Organize existing PyTorch into Lightning
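To tie the documented pieces together, the following is a minimal, hypothetical sketch (it is not part of the diff
above; the ``LitClassifier`` name and its parameters are made up for illustration) of a ``LightningModule`` whose typed
and documented ``__init__`` arguments are both stored via ``save_hyperparameters()`` and exposed as ``--model.*``
options by ``LightningCLI``. A ``LightningDataModule`` or dataloaders would still be needed to actually run ``fit``:

.. code-block:: python

    # sketch.py (illustrative only)
    import torch
    from pytorch_lightning import LightningModule
    from pytorch_lightning.cli import LightningCLI


    class LitClassifier(LightningModule):
        def __init__(self, hidden_dim: int = 64, learning_rate: float = 1e-3):
            """A toy model.

            Args:
                hidden_dim: Width of the hidden layer.
                learning_rate: Learning rate passed to the optimizer.
            """
            super().__init__()
            # stores hidden_dim and learning_rate under self.hparams and in checkpoints
            self.save_hyperparameters()
            self.layer = torch.nn.Linear(32, self.hparams.hidden_dim)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=self.hparams.learning_rate)


    if __name__ == "__main__":
        # both __init__ parameters become CLI options, e.g. --model.hidden_dim 128
        cli = LightningCLI(LitClassifier)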