From bed4e1711939d47b0318d80e1f713e7986b7d88e Mon Sep 17 00:00:00 2001
From: sgolebiewski-intel
Date: Tue, 17 Dec 2024 13:05:20 +0100
Subject: [PATCH] Fixing code snippet for GenAI inference on NPU

Signed-off-by: sgolebiewski-intel
---
 .../llm_inference_guide/genai-guide-npu.rst            | 2 ++
 .../learn-openvino/llm_inference_guide/genai-guide.rst | 8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide-npu.rst b/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide-npu.rst
index bd18ed29860c00..25cc1ef59fbed5 100644
--- a/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide-npu.rst
+++ b/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide-npu.rst
@@ -90,6 +90,7 @@ which do not require specifying quantization parameters:
 | Below is a list of such models:
 
 * meta-llama/Meta-Llama-3-8B-Instruct
+* meta-llama/Llama-3.1-8B
 * microsoft/Phi-3-mini-4k-instruct
 * Qwen/Qwen2-7B
 * mistralai/Mistral-7B-Instruct-v0.2
@@ -136,6 +137,7 @@ you need to add ``do_sample=False`` **to the** ``generate()`` **method:**
         ov::genai::GenerationConfig config;
         config.do_sample=false;
         config.max_new_tokens=100;
+        ov::genai::LLMPipeline pipe(models_path, "NPU");
         std::cout << pipe.generate("The Sun is yellow because", config);
     }
 
diff --git a/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide.rst b/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide.rst
index eff30eed054295..2c2491de7b74cf 100644
--- a/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide.rst
+++ b/docs/articles_en/learn-openvino/llm_inference_guide/genai-guide.rst
@@ -583,9 +583,9 @@ compression is done by NNCF at the model export stage. The exported model contai
 information necessary for execution, including the tokenizer/detokenizer and the generation
 config, ensuring that its results match those generated by Hugging Face.
 
-The `LLMPipeline` is the main object to setup the model for text generation. You can provide the
-converted model to this object, specify the device for inference, and provide additional
-parameters.
+The ``LLMPipeline`` is the main object used to set up the model for text generation.
+You can provide the converted model to this object, specify the device for inference,
+and pass any additional parameters.
 
 .. tab-set::
 
@@ -916,7 +916,7 @@ running the following code:
 GenAI API
 #######################################
 
-The use case described here uses the following OpenVINO GenAI API classes:
+The use case described here covers the following OpenVINO GenAI API classes:
 
 * generation_config - defines a configuration class for text generation,
   enabling customization of the generation process such as the maximum length of
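
Note: for reference, a minimal self-contained version of the corrected NPU snippet
from genai-guide-npu.rst is sketched below, as it would read once this patch is
applied. The header include, the ``main()`` scaffolding, and the argv-based
``models_path`` handling are assumptions drawn from the surrounding guide text,
not part of the hunk itself.

    #include <iostream>
    #include <string>

    #include "openvino/genai/llm_pipeline.hpp"

    int main(int argc, char* argv[]) {
        // Directory with the exported OpenVINO model
        // (assumed to arrive as a command-line argument).
        std::string models_path = argv[1];

        // The guide requires greedy decoding on NPU, hence do_sample = false.
        ov::genai::GenerationConfig config;
        config.do_sample = false;
        config.max_new_tokens = 100;

        // The line this patch adds: construct the pipeline on the NPU device,
        // so that 'pipe' is declared before the generate() call below uses it.
        ov::genai::LLMPipeline pipe(models_path, "NPU");
        std::cout << pipe.generate("The Sun is yellow because", config);
    }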