[Nano] How-To Guides: Accelerate PyTorch Training with IPEX & Multi-instance & BF16 & Channels last (intel-analytics#7035)

* add pytorch training ipex guide

* add pytorch training multi-instance guide

* add bf16 guide

* small changes of presentation

* add channels last guide

* remove validation loader

* hide code block

* update based on comments

* add guide for reference

* update guides w.r.t. comments
Siritao authored Jan 10, 2023
1 parent d775955 commit e3c1c44
Showing 13 changed files with 1,505 additions and 12 deletions.
4 changes: 4 additions & 0 deletions docs/readthedocs/source/_toc.yml
@@ -108,6 +108,10 @@ subtrees:
- file: doc/Nano/Howto/Training/PyTorchLightning/pytorch_lightning_training_bf16
- file: doc/Nano/Howto/Training/PyTorch/convert_pytorch_training_torchnano
- file: doc/Nano/Howto/Training/PyTorch/use_nano_decorator_pytorch_training
- file: doc/Nano/Howto/Training/PyTorch/accelerate_pytorch_training_ipex
- file: doc/Nano/Howto/Training/PyTorch/accelerate_pytorch_training_multi_instance
- file: doc/Nano/Howto/Training/PyTorch/pytorch_training_channels_last
- file: doc/Nano/Howto/Training/PyTorch/accelerate_pytorch_training_bf16
- file: doc/Nano/Howto/Training/TensorFlow/accelerate_tensorflow_training_multi_instance
- file: doc/Nano/Howto/Training/TensorFlow/tensorflow_training_embedding_sparseadam
- file: doc/Nano/Howto/Training/TensorFlow/tensorflow_training_bf16
@@ -0,0 +1,3 @@
{
"path": "../../../../../../../../python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_bf16.ipynb"
}
@@ -0,0 +1,3 @@
{
"path": "../../../../../../../../python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_ipex.ipynb"
}
@@ -0,0 +1,3 @@
{
"path": "../../../../../../../../python/nano/tutorial/notebook/training/pytorch/accelerate_pytorch_training_multi_instance.ipynb"
}
@@ -0,0 +1,3 @@
{
"path": "../../../../../../../../python/nano/tutorial/notebook/training/pytorch/pytorch_training_channels_last.ipynb"
}
4 changes: 4 additions & 0 deletions docs/readthedocs/source/doc/Nano/Howto/index.rst
@@ -27,6 +27,10 @@ PyTorch
~~~~~~~~~~~~~~~~~~~~~~~~~
* |convert_pytorch_training_torchnano|_
* |use_nano_decorator_pytorch_training|_
* `How to accelerate a PyTorch application on training workloads through Intel® Extension for PyTorch* <Training/PyTorch/accelerate_pytorch_training_ipex.html>`_
* `How to accelerate a PyTorch application on training workloads through multiple instances <Training/PyTorch/accelerate_pytorch_training_multi_instance.html>`_
* `How to use the channels last memory format in your PyTorch application for training <Training/PyTorch/pytorch_training_channels_last.html>`_
* `How to conduct BFloat16 Mixed Precision training in your PyTorch application <Training/PyTorch/accelerate_pytorch_training_bf16.html>`_

.. |use_nano_decorator_pytorch_training| replace:: How to accelerate your PyTorch training loop with ``@nano`` decorator
.. _use_nano_decorator_pytorch_training: Training/PyTorch/use_nano_decorator_pytorch_training.html
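The four new guides listed above all revolve around constructor options of `bigdl.nano.pytorch.Trainer`. As a rough orientation, here is a minimal sketch (not taken from the commit) of how those options might be combined in one training script; the placeholder model and data, and in particular the exact `channels_last` flag, are assumptions to be checked against the individual guides and the Trainer API doc.

```python
# Hypothetical sketch combining the four accelerations covered by the new guides.
# The model, the random data and the exact flag names (especially `channels_last`)
# are assumptions -- refer to the individual guides and the Trainer API doc.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18
from bigdl.nano.pytorch import Trainer

model = resnet18(num_classes=10)                              # placeholder model
loss = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
lightning_module = Trainer.compile(model, loss, optimizer)    # wrap raw PyTorch model

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 3, 224, 224), torch.randint(0, 10, (64,))),
    batch_size=32,
)

trainer = Trainer(
    max_epochs=1,
    use_ipex=True,        # IPEX acceleration (accelerate_pytorch_training_ipex)
    num_processes=2,      # multi-instance training (accelerate_pytorch_training_multi_instance)
    channels_last=True,   # channels-last memory format (pytorch_training_channels_last); assumed flag
    precision='bf16',     # BFloat16 mixed precision (accelerate_pytorch_training_bf16)
)
trainer.fit(lightning_module, train_dataloaders=train_loader)
```

Each option can also be applied on its own; the four guides walk through them one at a time.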
@@ -192,9 +192,9 @@
"source": [
"> 📝 **Note**\n",
">\n",
"> By setting `num_processes`, Nano will launch the specific number of processes to perform data-parallel training. By default, CPU cores will be automatically and evenly distributed among processes to avoid conflicts and maximize training throughput. If you would like to specifiy the CPU cores used by each process, You could set `cpu_for_each_process` to a list of length `num_processes`, in which each item is a list of CPU indices.\n",
"> By setting `num_processes`, Nano will launch the specific number of processes to perform data-parallel training. By default, CPU cores will be automatically and evenly distributed among processes to avoid conflicts and maximize training throughput. If you would like to specify the CPU cores used by each process, You could set `cpu_for_each_process` to a list of length `num_processes`, in which each item is a list of CPU indices.\n",
"> \n",
"> During multi-instance training, the effective batch size is the `batch_size` (in dataloader) $\\times$ `num_processes`, which will cause the number of iterations in each epoch to reduce by a factor of `num_processes`. To achieve the same effect as single instance training, a common practice to compensate is to gradually increase the learning rate to `num_processes` times. BigDL-Nano Trainer enable this pratice by default through `auto_lr=True`\n",
"> During multi-instance training, the effective batch size is the `batch_size` (in dataloader) $\\times$ `num_processes`, which will cause the number of iterations in each epoch to reduce by a factor of `num_processes`. To achieve the same effect as single instance training, a common practice to compensate is to gradually increase the learning rate to `num_processes` times. BigDL-Nano Trainer enable this practice by default through `auto_lr=True`\n",
">\n",
"> Please refer to the [API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.Trainer) for more detailed information regarding multi-instance related parameters in `bigdl.nano.pytorch.Trainer`."
]
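The note above names three multi-instance parameters: `num_processes`, `cpu_for_each_process` and `auto_lr`. A small, hypothetical usage sketch follows; the CPU index lists and the batch size are example values only, not part of the commit.

```python
# Illustrative sketch of the multi-instance options described in the note above.
from bigdl.nano.pytorch import Trainer

trainer = Trainer(
    max_epochs=5,
    num_processes=2,                        # launch two data-parallel training processes
    cpu_for_each_process=[[0, 1], [2, 3]],  # example pinning: cores 0-1 and 2-3, one list per process
    auto_lr=True,                           # gradually scale the learning rate up to num_processes x base LR
)
# With a dataloader batch_size of 32, the effective global batch size becomes
# 32 * num_processes = 64, so each epoch runs half as many iterations.
```

Because `auto_lr=True` lets the Trainer handle the learning-rate scaling to `num_processes` times the base value, the optimizer settings from single-instance training can usually be kept unchanged.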