Update install_linux_gpu.md, add version for Iris. #12353

Merged 1 commit on Nov 7, 2024

5 changes: 4 additions & 1 deletion docs/mddocs/Quickstart/install_linux_gpu.md
@@ -166,6 +166,8 @@ whose output should contain `Intel(R) Arc(TM) Graphics` or `Intel(R) Graphics` b
linux-headers-$(uname -r) \
libc6-dev
sudo apt install intel-i915-dkms intel-fw-gpu
# Notice: if you are using Iris Graphics (integrated in 10th-13th Gen Intel laptop CPUs), please use the version of intel-i915-dkms below instead.
# sudo apt install intel-i915-dkms=1.24.2.17.240301.20+i29-1 intel-fw-gpu=2024.17.5-329~22.04
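# As an optional sanity check (assuming the Intel graphics APT repository from the
# earlier steps is already configured), you can list the intel-i915-dkms / intel-fw-gpu
# versions that apt can actually see before pinning the ones above, e.g.:
# apt-cache madison intel-i915-dkms intel-fw-gpu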

# Install Compute Runtime
sudo apt-get install -y udev \
@@ -229,6 +231,8 @@ whose output should contain `Intel(R) Arc(TM) Graphics` or `Intel(R) Graphics` b
linux-headers-$(uname -r) \
libc6-dev
sudo apt install -y intel-i915-dkms intel-fw-gpu
# Notice: if you are using Iris Graphics (integrated in 10th-13th Gen Intel laptop CPUs), please use the version of intel-i915-dkms below instead.
# sudo apt install intel-i915-dkms=1.24.2.17.240301.20+i29-1 intel-fw-gpu=2024.17.5-329~22.04

# Install Compute Runtime
sudo apt-get install -y udev \
@@ -487,4 +491,3 @@ Answer: AI stands for Artificial Intelligence, which is the simulation of human

### Warmup for optimal performance on first run
When running LLMs on GPU for the first time, you might notice that performance is lower than expected, with delays of up to several minutes before the first token is generated. This delay occurs because the GPU kernels require compilation and initialization, which varies across GPU types. To achieve optimal and consistent performance, we recommend a one-time warm-up by running `model.generate(...)` an additional time before starting your actual generation tasks. If you're developing an application, you can incorporate this warm-up step into your start-up or loading routine to improve the user experience.
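For illustration, a minimal warm-up sketch is shown below. It assumes a `model` and `tokenizer` have already been loaded and moved to the Intel GPU (`xpu` device) following the earlier steps in this guide; the prompt text and token counts are arbitrary placeholders.

```python
import torch

# Assumes `model` and `tokenizer` are already loaded and moved to the Intel GPU ("xpu").
prompt = "What is AI?"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to("xpu")

with torch.inference_mode():
    # One-time warm-up: triggers GPU kernel compilation and initialization,
    # so later calls reflect steady-state performance.
    _ = model.generate(input_ids, max_new_tokens=32)

    # Actual generation now runs at the expected speed.
    output = model.generate(input_ids, max_new_tokens=128)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```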