Fixing code snippet for GenAI inference on NPU
Signed-off-by: sgolebiewski-intel <[email protected]>
sgolebiewski-intel committed Dec 17, 2024
1 parent 919a04c commit bed4e17
Showing 2 changed files with 6 additions and 4 deletions.
@@ -90,6 +90,7 @@ which do not require specifying quantization parameters:
| Below is a list of such models:
* meta-llama/Meta-Llama-3-8B-Instruct
* meta-llama/Llama-3.1-8B
* microsoft/Phi-3-mini-4k-instruct
* Qwen/Qwen2-7B
* mistralai/Mistral-7B-Instruct-v0.2
@@ -136,6 +137,7 @@ you need to add ``do_sample=False`` **to the** ``generate()`` **method:**
ov::genai::GenerationConfig config;
config.do_sample=false;
config.max_new_tokens=100;
ov::genai::LLMPipeline pipe(models_path, "NPU");
std::cout << pipe.generate("The Sun is yellow because", config);
}
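
For convenience, here is a minimal, self-contained sketch of the program this snippet belongs to. It is reconstructed from the surrounding documentation and the public OpenVINO GenAI samples, not taken from this commit; the default models_path value is a placeholder for a folder containing an exported model.

#include <iostream>
#include <string>

#include "openvino/genai/llm_pipeline.hpp"

int main(int argc, char* argv[]) {
    // Placeholder path: point this at a model directory exported for NPU.
    std::string models_path = (argc > 1) ? argv[1] : "path/to/exported/model";

    // As the documentation above states, sampling must be disabled for NPU inference.
    ov::genai::GenerationConfig config;
    config.do_sample = false;
    config.max_new_tokens = 100;

    // Create the pipeline on the NPU device and run greedy generation.
    ov::genai::LLMPipeline pipe(models_path, "NPU");
    std::cout << pipe.generate("The Sun is yellow because", config) << std::endl;
    return 0;
}
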
@@ -583,9 +583,9 @@ compression is done by NNCF at the model export stage. The exported model contai
information necessary for execution, including the tokenizer/detokenizer and the generation
config, ensuring that its results match those generated by Hugging Face.

The `LLMPipeline` is the main object to setup the model for text generation. You can provide the
converted model to this object, specify the device for inference, and provide additional
parameters.
The LLMPipeline is the main object to setup the model for the text generation.
You can provide converted model to this object, specifically device for inference
and provide other parameters.


.. tab-set::
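
As an aside on the LLMPipeline paragraph above (in both its old and new wording), the three elements it describes, the converted model, the inference device, and additional parameters, map onto the API roughly as in the sketch below. This is illustrative only and follows the public OpenVINO GenAI samples; the model folder name is a placeholder.

#include <iostream>

#include "openvino/genai/llm_pipeline.hpp"

int main() {
    // 1) converted model folder (placeholder name), 2) inference device.
    ov::genai::LLMPipeline pipe("TinyLlama-1.1B-Chat-v1.0", "CPU");

    // 3) additional parameters can be passed straight to generate().
    std::cout << pipe.generate("What is OpenVINO?", ov::genai::max_new_tokens(100)) << std::endl;
    return 0;
}
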
@@ -916,7 +916,7 @@ running the following code:
GenAI API
#######################################

The use case described here uses the following OpenVINO GenAI API classes:
The use case described here regards the following OpenVINO GenAI API classes:

* generation_config - defines a configuration class for text generation,
enabling customization of the generation process such as the maximum length of
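
The generation_config entry above corresponds to the ov::genai::GenerationConfig class used earlier in this diff. A brief, illustrative sketch of a few of its commonly used fields (field names from the public GenAI headers, values arbitrary):

#include "openvino/genai/generation_config.hpp"

int main() {
    ov::genai::GenerationConfig config;
    config.max_new_tokens = 256;  // cap on the length of the generated text
    config.do_sample = true;      // enable sampling (keep false on NPU, as noted above)
    config.temperature = 0.7f;    // sampling temperature
    config.top_p = 0.9f;          // nucleus sampling threshold
    return 0;
}
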