From 2880ae8b07e009a3b10b1e31d41b3010fd33232d Mon Sep 17 00:00:00 2001
From: Will Lin
Date: Wed, 28 Aug 2024 07:04:39 +0000
Subject: [PATCH] docs: profiler example

---
 docs/source/dev/profiling/profiling_index.rst | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/docs/source/dev/profiling/profiling_index.rst b/docs/source/dev/profiling/profiling_index.rst
index 0cf714282e7b2..40ce38a2193b5 100644
--- a/docs/source/dev/profiling/profiling_index.rst
+++ b/docs/source/dev/profiling/profiling_index.rst
@@ -24,13 +24,25 @@ Traces can be visualized using https://ui.perfetto.dev/.
 
 Set the env variable VLLM_RPC_GET_DATA_TIMEOUT_MS to a big number before you start the server. Say something like 30 minutes.
 ``export VLLM_RPC_GET_DATA_TIMEOUT_MS=1800000``
 
-Example commands:
+Example commands and usage:
+===========================
+
+Offline Inference:
+------------------
+
+Source https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_with_profiler.py.
+
+.. literalinclude:: ../../../../examples/offline_inference_with_profiler.py
+   :language: python
+   :linenos:
+
 OpenAI Server:
+--------------
 
 .. code-block:: bash
 
-    VLLM_TORCH_PROFILER_DIR=/mnt/traces/ python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3-70B
+    VLLM_TORCH_PROFILER_DIR=./vllm_profile python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3-70B
 
 benchmark_serving.py:
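
For reference, the file pulled in by the ``literalinclude`` above follows roughly this shape. This is a minimal sketch, not a verbatim copy of ``examples/offline_inference_with_profiler.py``: it assumes the ``LLM.start_profile()`` / ``LLM.stop_profile()`` API, and the model name and prompts are illustrative choices rather than values taken from the patch.

.. code-block:: python

    # Sketch of an offline-inference profiling script (illustrative).
    import os

    from vllm import LLM, SamplingParams

    # Enable the torch profiler; the trace directory can also be set on the
    # command line via VLLM_TORCH_PROFILER_DIR, as in the server example above.
    os.environ["VLLM_TORCH_PROFILER_DIR"] = "./vllm_profile"

    # Illustrative prompts and sampling parameters (assumptions, not from the patch).
    prompts = [
        "Hello, my name is",
        "The capital of France is",
    ]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # Illustrative small model so the trace stays manageable.
    llm = LLM(model="facebook/opt-125m")

    # Profile only the generate() call.
    llm.start_profile()
    outputs = llm.generate(prompts, sampling_params)
    llm.stop_profile()

    for output in outputs:
        print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")

The resulting trace files land in the directory pointed to by ``VLLM_TORCH_PROFILER_DIR`` and can be visualized at https://ui.perfetto.dev/, as noted earlier in the doc.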