From cd50ba92faa5fde731b8a1372bafa439f7910617 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <yuwen.hu@intel.com>
Date: Thu, 5 Dec 2024 11:16:22 +0800
Subject: [PATCH 1/4] Add NPU demo gif to main readme

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index d202ac40a49..6159d905b39 100644
--- a/README.md
+++ b/README.md
@@ -78,20 +78,20 @@ See demos of running local LLMs *on Intel Iris iGPU, Intel Core Ultra iGPU, sing
 
 <table width="100%">
   <tr>
-    <td align="center" colspan="1"><strong>Intel Iris iGPU</strong></td>
-    <td align="center" colspan="1"><strong>Intel Core Ultra iGPU</strong></td>
+    <td align="center" colspan="1"><strong>Intel Core Ultra (Series 1) iGPU</strong></td>
+    <td align="center" colspan="1"><strong>Intel Core Ultra (Series 2) NPU</strong></td>
     <td align="center" colspan="1"><strong>Intel Arc dGPU</strong></td>
     <td align="center" colspan="1"><strong>2-Card Intel Arc dGPUs</strong></td>
   </tr>
   <tr>
     <td>
-      <a href="https://llm-assets.readthedocs.io/en/latest/_images/iris_phi3-3.8B_q4_0_llamacpp_long.gif" target="_blank">
-        <img src="https://llm-assets.readthedocs.io/en/latest/_images/iris_phi3-3.8B_q4_0_llamacpp_long.gif" width=100%; />
+      <a href="https://llm-assets.readthedocs.io/en/latest/_images/mtl_mistral-7B_q4_k_m_ollama.gif" target="_blank">
+        <img src="https://llm-assets.readthedocs.io/en/latest/_images/mtl_mistral-7B_q4_k_m_ollama.gif" width=100%; />
       </a>
     </td>
     <td>
-      <a href="https://llm-assets.readthedocs.io/en/latest/_images/mtl_mistral-7B_q4_k_m_ollama.gif" target="_blank">
-        <img src="https://llm-assets.readthedocs.io/en/latest/_images/mtl_mistral-7B_q4_k_m_ollama.gif" width=100%; />
+      <a href="https://llm-assets.readthedocs.io/en/latest/_images/npu_llama3.2-3B.gif" target="_blank">
+        <img src="https://llm-assets.readthedocs.io/en/latest/_images/npu_llama3.2-3B.gif" width=100%; />
       </a>
     </td>
     <td>
@@ -107,10 +107,10 @@ See demos of running local LLMs *on Intel Iris iGPU, Intel Core Ultra iGPU, sing
   </tr>
   <tr>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/llama_cpp_quickstart.md">llama.cpp (Phi-3-mini Q4_0)</a>
+      <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama (Mistral-7B Q4_K) </a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama (Mistral-7B Q4_K) </a>
+      <a href="docs/mddocs/Quickstart/llama_cpp_quickstart.md">NPU (Llama3.2-3B Q4_0)</a>
     </td>
     <td align="center" width="25%">
       <a href="docs/mddocs/Quickstart/webui_quickstart.md">TextGeneration-WebUI (Llama3-8B FP8) </a>

From 11fb016c14f550712ee91e5448bc222f2c173482 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <yuwen.hu@intel.com>
Date: Thu, 5 Dec 2024 11:21:51 +0800
Subject: [PATCH 2/4] Small fix

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6159d905b39..0d4c7669911 100644
--- a/README.md
+++ b/README.md
@@ -110,7 +110,7 @@ See demos of running local LLMs *on Intel Iris iGPU, Intel Core Ultra iGPU, sing
       <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama (Mistral-7B Q4_K) </a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/llama_cpp_quickstart.md">NPU (Llama3.2-3B Q4_0)</a>
+      <a href="docs/mddocs/Quickstart/npu_quickstart.md">PyTorch (Llama3.2-3B SYM_INT4)</a>
     </td>
     <td align="center" width="25%">
       <a href="docs/mddocs/Quickstart/webui_quickstart.md">TextGeneration-WebUI (Llama3-8B FP8) </a>

From 09b9d7a905508e7c17959af5124c39815388a200 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <yuwen.hu@intel.com>
Date: Thu, 5 Dec 2024 12:13:10 +0800
Subject: [PATCH 3/4] Update based on comments

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 0d4c7669911..e14fa7f55c0 100644
--- a/README.md
+++ b/README.md
@@ -110,7 +110,7 @@ See demos of running local LLMs *on Intel Iris iGPU, Intel Core Ultra iGPU, sing
       <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama (Mistral-7B Q4_K) </a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/npu_quickstart.md">PyTorch (Llama3.2-3B SYM_INT4)</a>
+      <a href="docs/mddocs/Quickstart/npu_quickstart.md">HuggingFace (Llama3.2-3B SYM_INT4)</a>
     </td>
     <td align="center" width="25%">
       <a href="docs/mddocs/Quickstart/webui_quickstart.md">TextGeneration-WebUI (Llama3-8B FP8) </a>

From abd2f0bf14bf8d2fc352d213960f3d760d3769e7 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <yuwen.hu@intel.com>
Date: Thu, 5 Dec 2024 12:15:50 +0800
Subject: [PATCH 4/4] Test on style fix

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index e14fa7f55c0..17bf1ef01b6 100644
--- a/README.md
+++ b/README.md
@@ -107,16 +107,16 @@ See demos of running local LLMs *on Intel Iris iGPU, Intel Core Ultra iGPU, sing
   </tr>
   <tr>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama (Mistral-7B Q4_K) </a>
+      <a href="docs/mddocs/Quickstart/ollama_quickstart.md">Ollama <br> (Mistral-7B Q4_K) </a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/npu_quickstart.md">HuggingFace (Llama3.2-3B SYM_INT4)</a>
+      <a href="docs/mddocs/Quickstart/npu_quickstart.md">HuggingFace <br> (Llama3.2-3B SYM_INT4)</a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/webui_quickstart.md">TextGeneration-WebUI (Llama3-8B FP8) </a>
+      <a href="docs/mddocs/Quickstart/webui_quickstart.md">TextGeneration-WebUI <br> (Llama3-8B FP8) </a>
     </td>
     <td align="center" width="25%">
-      <a href="docs/mddocs/Quickstart/fastchat_quickstart.md">FastChat (QWen1.5-32B FP6)</a>
+      <a href="docs/mddocs/Quickstart/fastchat_quickstart.md">FastChat <br> (QWen1.5-32B FP6)</a>
     </td>  </tr>
 </table>
 
