Update tutorial-installing-llamacpp-aws-cuda.md

distantmagic · May 21, 2024 · 15e28f6
1 parent f3b5370
commit 15e28f6
Showing 1 changed file with 4 additions and 1 deletion.
diff --git a/tutorial-installing-llamacpp-aws-cuda.md b/tutorial-installing-llamacpp-aws-cuda.md
@@ -33,10 +33,13 @@ system. This tutorial was specifically written for an installation on a `Ubuntu
     ```shell
     make LLAMA_CUDA=1
     ```
-      
+5. Benchmark llama.cpp (optional):
+
     Follow the official tutorial if you intend to run the benchmark. However, keep compiling llama.cpp with `make LLAMA_CUDA=1` (do *not* use `LLAMA_CUBLAS=1`):
   https://github.com/ggerganov/llama.cpp/discussions/4225
 
+    Instead of quantizing a model yourself, you can download pre-quantized models from Hugging Face. For example, `Mistral Instruct` is available at https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/tree/main
+
 ## Potential Errors
 
 ### libtinfo5 is not installable