
Update Plugins doc #12440

Merged
merged 16 commits on Mar 29, 2022
10 changes: 2 additions & 8 deletions docs/source/advanced/model_parallel.rst
@@ -296,7 +296,6 @@ Below we show an example of running `ZeRO-Offload <https://www.deepspeed.ai/tuto
.. code-block:: python

from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy

model = MyModel()
trainer = Trainer(accelerator="gpu", devices=4, strategy="deepspeed_stage_2_offload", precision=16)
@@ -341,7 +340,6 @@ For even more speed benefit, DeepSpeed offers an optimized CPU version of ADAM c

import pytorch_lightning
from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy
from deepspeed.ops.adam import DeepSpeedCPUAdam
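
Not part of the original diff: as a hedged sketch, wiring ``DeepSpeedCPUAdam`` into a ``LightningModule`` together with
ZeRO-Offload could look roughly like the following (the model definition here is an assumption for illustration):

.. code-block:: python

    import pytorch_lightning as pl
    from pytorch_lightning import Trainer
    from deepspeed.ops.adam import DeepSpeedCPUAdam


    class MyModel(pl.LightningModule):
        ...

        def configure_optimizers(self):
            # DeepSpeedCPUAdam performs the optimizer step on the CPU, which pairs
            # with ZeRO-Offload keeping optimizer states off the GPU
            return DeepSpeedCPUAdam(self.parameters())


    model = MyModel()
    trainer = Trainer(accelerator="gpu", devices=4, strategy="deepspeed_stage_2_offload", precision=16)
    trainer.fit(model)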


@@ -385,7 +383,6 @@ Also please have a look at our :ref:`deepspeed-zero-stage-3-tips` which contains
.. code-block:: python

from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy
from deepspeed.ops.adam import FusedAdam


@@ -409,7 +406,6 @@ You can also use the Lightning Trainer to run predict or evaluate with DeepSpeed
.. code-block:: python

from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy


class MyModel(pl.LightningModule):
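    # Editorial sketch, not part of the original diff: the class body above is
    # truncated by the diff view. Running evaluation or prediction with a trained
    # DeepSpeed model could look roughly like the lines below; the choice of
    # "deepspeed_stage_3" is an assumption made only for illustration.
    ...


model = MyModel()
trainer = Trainer(accelerator="gpu", devices=4, strategy="deepspeed_stage_3", precision=16)
trainer.test(model)
trainer.predict(model)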
@@ -435,7 +431,6 @@ This reduces the time taken to initialize very large models, as well as ensure w

import torch.nn as nn
from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy
from deepspeed.ops.adam import FusedAdam


@@ -549,7 +544,6 @@ This saves memory when training larger models, however requires using a checkpoi
.. code-block:: python

from pytorch_lightning import Trainer
from pytorch_lightning.strategies import DeepSpeedStrategy
import deepspeed
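
Not part of the original diff: a hedged sketch of DeepSpeed activation checkpointing inside ``forward`` (the layer
sizes are illustrative assumptions):

.. code-block:: python

    import deepspeed
    import pytorch_lightning as pl
    import torch.nn as nn


    class MyModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.block_1 = nn.Sequential(nn.Linear(32, 32), nn.ReLU())
            self.block_2 = nn.Linear(32, 2)

        def forward(self, x):
            # activations of block_1 are not stored; they are recomputed during the backward pass
            x = deepspeed.checkpointing.checkpoint(self.block_1, x)
            return self.block_2(x)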


@@ -686,7 +680,7 @@ In some cases you may want to define your own DeepSpeed Config, to access all pa
}

model = MyModel()
trainer = Trainer(accelerator="gpu", devices=4, strategy=DeepSpeedStrategy(deepspeed_config), precision=16)
trainer = Trainer(accelerator="gpu", devices=4, strategy=DeepSpeedStrategy(config=deepspeed_config), precision=16)
trainer.fit(model)
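
Not part of the original diff: purely for illustration, a hand-written config passed this way might look like the
sketch below; the specific keys are assumptions drawn from DeepSpeed's ZeRO options:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.strategies import DeepSpeedStrategy

    # illustrative subset of a DeepSpeed config; any option from the DeepSpeed
    # configuration schema can be placed in this dictionary
    deepspeed_config = {
        "zero_allow_untested_optimizer": True,
        "zero_optimization": {
            "stage": 2,
            "offload_optimizer": {"device": "cpu"},
        },
    }

    model = MyModel()
    trainer = Trainer(
        accelerator="gpu", devices=4, strategy=DeepSpeedStrategy(config=deepspeed_config), precision=16
    )
    trainer.fit(model)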


@@ -699,7 +693,7 @@ We support taking the config as a json formatted file:

model = MyModel()
trainer = Trainer(
accelerator="gpu", devices=4, strategy=DeepSpeedStrategy("/path/to/deepspeed_config.json"), precision=16
accelerator="gpu", devices=4, strategy=DeepSpeedStrategy(config="/path/to/deepspeed_config.json"), precision=16
)
trainer.fit(model)

3 changes: 2 additions & 1 deletion docs/source/common/checkpointing.rst
@@ -315,6 +315,7 @@ and the Lightning Team will be happy to integrate/help integrate it.

-----------

.. _customize_checkpointing:

***********************
Customize Checkpointing
@@ -392,7 +393,7 @@ Custom Checkpoint IO Plugin

.. note::

Some ``TrainingTypePlugins`` like ``DeepSpeedStrategy`` do not support custom ``CheckpointIO`` as checkpointing logic is not modifiable.
Some strategies like :class:`~pytorch_lightning.strategies.deepspeed.DeepSpeedStrategy` do not support custom :class:`~pytorch_lightning.plugins.io.checkpoint_plugin.CheckpointIO` as checkpointing logic is not modifiable.

-----------

2 changes: 1 addition & 1 deletion docs/source/common/lightning_module.rst
@@ -1056,7 +1056,7 @@ automatic_optimization
When set to ``False``, Lightning does not automate the optimization process. This means you are responsible for handling
your optimizers. However, we do take care of precision and any accelerators used.

See :ref:`manual optimization<common/optimization:Manual optimization>` for details.
See :ref:`manual optimization <common/optimization:Manual optimization>` for details.

.. code-block:: python

94 changes: 53 additions & 41 deletions docs/source/extensions/plugins.rst
@@ -6,54 +6,32 @@ Plugins

.. include:: ../links.rst

Plugins allow custom integrations to the internals of the Trainer such as a custom precision or
distributed implementation.
Plugins allow custom integrations to the internals of the Trainer such as a custom precision, checkpointing or
cluster environment implementation.

Under the hood, the Lightning Trainer is using plugins in the training routine, added automatically
depending on the provided Trainer arguments. For example:

.. code-block:: python

# accelerator: GPUAccelerator
# training strategy: DDPStrategy
# precision: NativeMixedPrecisionPlugin
trainer = Trainer(accelerator="gpu", devices=4, precision=16)


We expose Accelerators and Plugins mainly for expert users that want to extend Lightning for:

- New hardware (like TPU plugin)
- Distributed backends (e.g. a backend not yet supported by
`PyTorch <https://pytorch.org/docs/stable/distributed.html#backends>`_ itself)
- Clusters (e.g. customized access to the cluster's environment interface)

There are two types of Plugins in Lightning with different responsibilities:

Strategy
--------

- Launching and teardown of training processes (if applicable)
- Setup communication between processes (NCCL, GLOO, MPI, ...)
- Provide a unified communication interface for reduction, broadcast, etc.
- Provide access to the wrapped LightningModule
depending on the provided Trainer arguments.

There are three types of Plugins in Lightning with different responsibilities:

Furthermore, for multi-node training Lightning provides cluster environment plugins that allow the advanced user
to configure Lightning to integrate with a :ref:`custom-cluster`.
- Precision Plugins
- CheckpointIO Plugins
- Cluster Environments


.. image:: ../_static/images/accelerator/overview.svg


The full list of built-in plugins is listed below.
*****************
Precision Plugins
*****************

We provide precision plugins so that users can benefit from numerical representations with lower precision than
32-bit floating-point, or higher precision such as 64-bit floating-point.

.. warning:: The Plugin API is in beta and subject to change.
For help setting up custom plugins/accelerators, please reach out to us at **[email protected]**
.. code-block:: python

# Training with 16-bit precision
trainer = Trainer(precision=16)

Precision Plugins
-----------------
The full list of built-in precision plugins is listed below.

.. currentmodule:: pytorch_lightning.plugins.precision

@@ -74,9 +52,43 @@ Precision Plugins
TPUBf16PrecisionPlugin
TPUPrecisionPlugin

More information regarding precision with Lightning can be found :doc:`here <../advanced/precision>`.
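
A precision plugin instance can also be constructed and handed to the Trainer directly. Below is a minimal sketch,
assuming the ``NativeMixedPrecisionPlugin(precision, device)`` constructor and that passing the instance via
``plugins`` replaces the plugin the Trainer would otherwise create:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.precision import NativeMixedPrecisionPlugin

    # equivalent in spirit to Trainer(precision=16), but gives you a handle on the
    # plugin, e.g. to supply a custom torch.cuda.amp.GradScaler
    precision_plugin = NativeMixedPrecisionPlugin(precision=16, device="cuda")
    trainer = Trainer(accelerator="gpu", devices=1, plugins=[precision_plugin])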


-----------


********************
CheckpointIO Plugins
********************

As part of our commitment to extensibility, we have abstracted Lightning's checkpointing logic into the :class:`~pytorch_lightning.plugins.io.CheckpointIO` plugin.
With this, users can customize the checkpointing logic to match the needs of their infrastructure.

Below is a list of built-in plugins for checkpointing.

.. currentmodule:: pytorch_lightning.plugins.io

.. autosummary::
:nosignatures:
:template: classtemplate.rst

CheckpointIO
HPUCheckpointIO
TorchCheckpointIO
XLACheckpointIO

You can learn more about custom checkpointing with Lightning :ref:`here <customize_checkpointing>`.
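
As a rough sketch (the ``CustomCheckpointIO`` name and the ``torch.save``/``torch.load`` backend are assumptions made
for illustration), a custom plugin subclasses ``CheckpointIO`` and is passed to the Trainer via ``plugins``:

.. code-block:: python

    import os

    import torch
    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.io import CheckpointIO


    class CustomCheckpointIO(CheckpointIO):
        def save_checkpoint(self, checkpoint, path, storage_options=None):
            # write the checkpoint dictionary to the storage of your choice
            torch.save(checkpoint, path)

        def load_checkpoint(self, path, storage_options=None):
            # read the checkpoint dictionary back from storage
            return torch.load(path)

        def remove_checkpoint(self, path):
            # delete a checkpoint that is no longer needed
            os.remove(path)


    trainer = Trainer(plugins=[CustomCheckpointIO()])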


-----------


********************
Cluster Environments
--------------------
********************

Users can define the interface of their own cluster environment based on the requirements of their infrastructure.
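
As a hedged sketch (the environment variable names describe a hypothetical scheduler), a custom environment
subclasses ``ClusterEnvironment`` and is passed to the Trainer via ``plugins``; depending on the Lightning version,
additional abstract members such as ``set_world_size``, ``set_global_rank`` or ``detect`` may also need to be
implemented:

.. code-block:: python

    import os

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.environments import ClusterEnvironment


    class MyClusterEnvironment(ClusterEnvironment):
        @property
        def creates_processes_externally(self) -> bool:
            # the scheduler launches one process per device for us
            return True

        @property
        def main_address(self) -> str:
            return os.environ["MASTER_ADDRESS"]

        @property
        def main_port(self) -> int:
            return int(os.environ["MASTER_PORT"])

        def world_size(self) -> int:
            return int(os.environ["WORLD_SIZE"])

        def global_rank(self) -> int:
            return int(os.environ["RANK"])

        def local_rank(self) -> int:
            return int(os.environ["LOCAL_RANK"])

        def node_rank(self) -> int:
            return int(os.environ["NODE_RANK"])


    trainer = Trainer(plugins=[MyClusterEnvironment()])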

.. currentmodule:: pytorch_lightning.plugins.environments

@@ -85,8 +97,8 @@ Cluster Environments
:template: classtemplate.rst

ClusterEnvironment
KubeflowEnvironment
LightningEnvironment
LSFEnvironment
TorchElasticEnvironment
KubeflowEnvironment
SLURMEnvironment
TorchElasticEnvironment