Quick start: Install bigdl-llm on windows gpu #10195
Conversation
Sure, I will rename and move the figs.
The two Visual Studio figures (i.e., fig1 and fig2) are too large; rescale them and put them side by side.
I don't think we need fig3. Add a figure showing how to use Windows Task Manager to check iGPU/GPU status, etc.
Step 1: Run the commands below in Anaconda Prompt.

```bash
conda create -n llm python=3.9 libuv # Already done in "Install conda" section
```
You already created the llm env earlier, so just remove this line to avoid confusion.
```bash
conda create -n llm python=3.9 libuv # Already done in "Install conda" section
conda activate llm
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 # Already done in "Install oneAPI" section
```
Remove the oneAPI install line, as it is already done in the previous section.
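Taken together, the two comments above reduce this step to just activating the existing env. A sketch of the suggested revision, assuming the "Install conda" and "Install oneAPI" sections were completed earlier:

```bash
conda activate llm
```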
```python
from transformers import AutoTokenizer, GenerationConfig
```
Then we use phi-1.5 as an example to show how to run the model with bigdl-llm on Windows.
Move the phi-1.5 example into a new section, "A Quick Example".
```python
generation_config = GenerationConfig(use_cache=True)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for phi-1_5 model')
```
Make this example as simple as possible, without much code (a simplified sketch follows below):
- remove the argparse section (hard-code the argument values in the code instead)
- remove the timing code
- make the comments concise
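Applying these comments, the simplified example might look roughly like the sketch below. This is a minimal sketch, not the PR's actual code: the model id `microsoft/phi-1_5`, the prompt, and `max_new_tokens` are illustrative placeholders, and it assumes the bigdl-llm XPU install from the steps above. Saved as `demo.py`, it would run with `python demo.py`.

```python
import torch
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig

model_path = "microsoft/phi-1_5"   # placeholder: local path or HF model id
prompt = "What is AI?"             # placeholder prompt

# Load the model in 4-bit and move it to the Intel GPU.
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True).to('xpu')
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
generation_config = GenerationConfig(use_cache=True)

# Generate a short completion and print it.
with torch.inference_mode():
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
    output = model.generate(input_ids,
                            generation_config=generation_config,
                            max_new_tokens=32)
output_str = tokenizer.decode(output[0], skip_special_tokens=True)

print('-'*20, 'Output', '-'*20)
print(output_str)
```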
* Run the commands below in Anaconda Prompt. Note that the transformers version should match the model you want to use; for example, here we use transformers 4.37.0 to run the demo.

```bash
conda activate llm

pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install transformers==4.37.0
```
Move the transformers-related info to the example section.
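Since the demo is pinned to transformers 4.37.0, the example section could open with a quick version check. A minimal sketch (this exact check is a suggestion, not part of the PR):

```python
# Confirm the pinned transformers version before running the phi-1.5 demo.
import transformers

assert transformers.__version__ == "4.37.0", transformers.__version__
```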
* Now we can test whether all the components have been installed correctly. If all the packages in the Python file below import without error, the installation is correct.

```python
import torch
import time
import argparse
import numpy as np

from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig
```
How does the user run this? Maybe it's easiest to run it in the Python prompt?
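One low-friction way to run the check (a suggestion, not something the PR specifies): paste the imports into an interactive Python session inside the activated `llm` env, for example:

```python
>>> import torch
>>> from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM
>>> from transformers import AutoTokenizer, GenerationConfig
>>> # no ImportError here means the components are installed correctly
```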
```python
print('-'*20, 'Output', '-'*20)
print(output_str)
```

Here is the sample output on a laptop equipped with an 11th Gen Intel(R) Core(TM) i7-1185G7 and Intel(R) Iris(R) Xe Graphics after running the example program above.
How does the user run this example?
We provide the contents of `demo.py`, and users could run it as `python demo.py`.
Clean up the unused files.
Quick start: Install bigdl-llm on windows gpu