GenAI: Add a draft for NPU doc #25841
Conversation
openvino==2024.2.0
openvino-tokenizers==2024.2.0
nncf==2.11.0
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@439d61f79cf55d5d0b28334f577b6ac3c5ced28f
??
optimum-intel has a different release cadence and is usually taken from the main branch.
.. code-block:: text

   # requirements.txt
   openvino==2024.3.1
2024.3.1 hasn't been released yet!
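(As a side note, a quick way to verify which OpenVINO build an environment actually resolved is to query the runtime; a minimal sketch:)

```python
# Print the installed OpenVINO runtime version, e.g. to check whether a
# requirements pin resolved to a released build.
import openvino as ov

print(ov.get_version())
```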
openvino==2024.2.0
openvino-tokenizers==2024.2.0
nncf==2.11.0
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@439d61f79cf55d5d0b28334f577b6ac3c5ced28f
Where does this point to? Is there a specific version or tag instead of a commit hash?
This is the commit hash from the main branch that was verified to work with NPU.
In general, we could avoid specifying the exact commit and hope it won't break NPU (not guaranteed). openvino-notebooks takes optimum-intel straight from the main branch, see:
https://github.com/openvinotoolkit/openvino_notebooks/blob/a99c0ec648fc6414fb5c169e2dd0ef396c71f613/notebooks/llm-question-answering/llm-question-answering.ipynb#L28
%pip install -q "torch>=2.1" "nncf>=2.7" "transformers>=4.40.0" onnx "optimum>=1.16.1" "accelerate" "datasets>=2.14.6" "gradio>=4.19" "git+https://github.com/huggingface/optimum-intel.git" --extra-index-url https://download.pytorch.org/whl/cpu
Can you find a proper fixed version or tag here?
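For illustration, pinning to a release tag (or to a PyPI release) instead of a raw commit could look like the snippet below; the tag and version shown are hypothetical placeholders, not values verified for NPU:

```text
# requirements.txt -- hypothetical alternatives to pinning a commit hash
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@v1.18.0
# or, if a suitable PyPI release exists:
# optimum-intel==1.18.0
```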
pip install -r requirements.txt

2. A chat-tuned TinyLlama model is used in this example. The following conversion & optimization settings are recommended when using NPU:
Is "chat-tuned" the right term here?
Why not? It is a model fine-tuned for chat scenarios.
I can confirm that this term is used in the main article as well, though it is only scarcely found on the net. Changing it to "fine-tuned for chat" may be a good idea. We would change it in the GenAI article too, in that case.
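To make the conversion step quoted above concrete, here is a minimal sketch using optimum-intel's Python API; the int4 weight-quantization values (sym, group_size, ratio) and the output directory name are illustrative assumptions, not the guide's verified NPU settings:

```python
# Sketch: export a chat-tuned TinyLlama to OpenVINO IR with 4-bit
# weight-only quantization via optimum-intel.
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
from transformers import AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Quantization values below are assumptions for illustration only.
quant = OVWeightQuantizationConfig(bits=4, sym=True, group_size=128, ratio=1.0)

model = OVModelForCausalLM.from_pretrained(
    model_id, export=True, quantization_config=quant
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

out_dir = "TinyLlama-1.1B-Chat-v1.0-int4-ov"  # hypothetical output path
model.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```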
Additional configuration options
################################

Compiling models for NPU may take a while. By default, LLMPipeline for NPU is
configured for faster compilation, but it may result in lower performance. To achieve
better performance at the expense of compilation time, you may try these settings:
This doesn't alter the way it looks in RST, but please limit lines to ~80-100 characters.
In Emacs it is easy; in Vim I don't know.
I thought it was your piece, so I didn't touch it. No problem, I will format it.
Don't worry too much about formatting; we can polish the whole thing: make sure line breaks are fine, references work, directives render properly, and all that stuff :)
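For reference, passing such settings to the pipeline could look like the sketch below, assuming the openvino_genai Python API; the GENERATE_HINT property and its BEST_PERF value are an assumption about the options the draft documents:

```python
# Sketch: run the converted model on NPU, asking the plugin to favor
# generation performance over compilation time. The config key/value
# is an assumption, not a verified option name.
import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline(
    "TinyLlama-1.1B-Chat-v1.0-int4-ov",  # hypothetical path from conversion
    "NPU",
    GENERATE_HINT="BEST_PERF",
)

print(pipe.generate("What is OpenVINO?", max_new_tokens=64))
```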
@dmatveev Could you have a look one more time, please?
Reviewed, looks good to me, but I think it is now a dry how-to rather than a useful educational text. What I mean is that there is no explanation of, or emphasis on, why a specific type of quantization (asymmetric, in this case) is preferable. Why does this instruction exist at all? What makes it different from the default one, except the device name? Also, no other options are covered, e.g. how to achieve better performance at the cost of compile time.
The option to achieve better performance is covered in the "Additional configuration options" part.
LGTM 👍
@kblaszczak-intel, can you please put your approval here? It seems merging is still blocked for this PR, as @TolyaTalamanov's approval alone seems not to be enough.
Some tweaks added but nothing major.