
Quick start: Install bigdl-llm on windows gpu #10195

Closed
wants to merge 46 commits into from

Conversation

ivy-lv11 (Contributor)

Quick start: Install bigdl-ll on windows gpu

@jason-dai (Contributor)

  1. Change the file name to install_windows_gpu.md
  2. Do not put images on https://github.com/intel-analytics/BigDL/; put them in https://llm-assets.readthedocs.io/

ivy-lv11 and others added 2 commits February 21, 2024 15:56
* support name mapping for mixtral

* support mixtral mixed quantization

* fix style

* fix
@ivy-lv11 (Contributor, Author)

> 1. Change the file name to install_windows_gpu.md
> 2. Do not put images on https://github.com/intel-analytics/BigDL/; put them in https://llm-assets.readthedocs.io/

Sure, I will rename and move the figs.

@shane-huang (Contributor)

The two Visual Studio figures (i.e., fig1 and fig2) are too large - rescale them and put them side by side.

@shane-huang (Contributor)

I don't think we need fig3. Instead, add a figure showing how to use Windows Task Manager to check iGPU/GPU status, etc.

Step 1: Run the commands below in Anaconda prompt.

```bash
conda create -n llm python=3.9 libuv # Already done in "Install conda" section
```

Contributor: You already created the llm env before, so just remove this line to avoid confusion.

```bash
conda create -n llm python=3.9 libuv # Already done in "Install conda" section
conda activate llm
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 # Already done in "Install oneAPI" section
```

Contributor: Remove the oneAPI install line, as that step is already done in the previous section.
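Applying both suggestions, this step would then reduce to just activating the existing environment (a sketch of the revised snippet):

```bash
conda activate llm
```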

```python
from transformers import AutoTokenizer, GenerationConfig
```

Then we use phi-1.5 as an example to show how to run the model with bigdl-llm on windows.
@shane-huang (Contributor), Feb 21, 2024: Make the phi-1.5 example a new section, "A Quick Example".

```python
generation_config = GenerationConfig(use_cache=True)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for phi-1_5 model')
```
@shane-huang (Contributor), Feb 21, 2024: Make this example as simple as possible, without much code:

  • remove the argparse section (hard-code the argument values instead)
  • remove the timing code
  • make the comments concise

A sketch of what the simplified example might look like follows.
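For illustration only (not the PR's final code), a minimal sketch along those lines, assuming the Hugging Face model id microsoft/phi-1_5 and a hypothetical prompt:

```python
# demo.py: minimal phi-1.5 example (illustrative sketch, not the PR's final code)
import torch
import intel_extension_for_pytorch as ipex  # makes the 'xpu' device available to torch
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig

model_path = "microsoft/phi-1_5"  # assumed model id; replace with a local path if preferred
prompt = "What is AI?"            # hypothetical example prompt

# Load the model with BigDL-LLM 4-bit optimizations and move it to the Intel GPU
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to('xpu')
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

generation_config = GenerationConfig(use_cache=True)

# Generate a short completion for the prompt
with torch.inference_mode():
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
    output = model.generate(input_ids,
                            generation_config=generation_config,
                            max_new_tokens=32)
output_str = tokenizer.decode(output[0], skip_special_tokens=True)

print('-' * 20, 'Output', '-' * 20)
print(output_str)
```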

plusbang and others added 14 commits February 21, 2024 16:40
* remove include and language option, select the corresponding dataset based on the model name in Run

* change the nightly test time

* change the nightly test time of harness and ppl
* Make Offline installer as default for win gpu doc for oneAPI

* Small other fixes
…a/*/pom.xml (#10197)

* Bump org.apache.commons:commons-compress in /scala/serving

Bumps org.apache.commons:commons-compress from 1.21 to 1.26.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-compress
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bumps org.apache.commons:commons-compress from 1.21 to 1.26.0.

* Bumps org.apache.commons:commons-compress from 1.21 to 1.26.0.

* Bumps org.apache.commons:commons-compress from 1.21 to 1.26.0.

* Bumps org.apache.commons:commons-compress from 1.21 to 1.26.0.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Shaojun Liu <[email protected]>
NovTi and others added 14 commits February 22, 2024 10:00
* Add c-eval workflow and modify running files

* Modify the chatglm evaluator file

* Modify the ceval workflow for triggering test

* Modify the ceval workflow file

* Modify the ceval workflow file

* Modify ceval workflow

* Adjust the ceval dataset download

* Add ceval workflow dependencies

* Modify ceval workflow dataset download

* Add ceval test dependencies

* Add ceval test dependencies

* Correct the result print

* Fix the nightly test trigger time

* Fix ChatGLM loading issue
* add esimd sdp support

* fix style
* add quantize kv_cache for baichuan2-13b

* style fix
* add mlp layer unit tests

* add download baichuan-13b

* exclude llama for now

* install additional packages

* rename bash file

* switch to Baichuan2

* delete attention related code

* fix name errors in yml file
* Add model loading time record in csv for all-in-one benchmark

* Small fix

* Small fix to number after .
* add iq2 examples

* small fix

* meet code review

* fix

* meet review

* small fix
@jason-dai jason-dai changed the title Quick start: Install bigdl-ll on windows gpu Quick start: Install bigdl-llm on windows gpu Feb 22, 2024
Comment on lines 46 to 53
* Run the commands below in Anaconda prompt. Please note that the transformers version should match the model you want to use. For example, here we use transformers 4.37.0 to run the demo.

```bash
conda activate llm

pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install transformers==4.37.0
```
Contributor: Move the transformers-related info to the example section.
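If that information moves, this step might reduce to the following (a sketch, reusing the same wheel index URL as above):

```bash
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```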

Comment on lines 56 to 65
* Now we can test whether all the components have been installed correctly. If all the packages in the Python file below import without error, the installation is correct.
```python
import torch
import time
import argparse
import numpy as np

from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig
```
Contributor: How does the user run this? Maybe it's easiest to run it in the Python prompt?
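For instance, the check could be run directly in an interactive Python session rather than from a file (a sketch; start `python` inside the activated llm environment first):

```python
# In the Anaconda prompt: conda activate llm, then launch `python` and paste:
import torch
from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig
# If no ImportError is raised, the components are installed correctly.
```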

```python
print('-'*20, 'Output', '-'*20)
print(output_str)
```
Here is the sample output on a laptop equipped with an 11th Gen Intel(R) Core(TM) i7-1185G7 and Intel(R) Iris(R) Xe Graphics after running the example program above.
Contributor: How does the user run this example?

Contributor (Author): We provide the contents of demo.py, and users can run it with `python demo.py`.
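Concretely, running the demo from the Anaconda prompt might look like this (assuming demo.py is saved in the current working directory):

```bash
conda activate llm
python demo.py
```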

xiangyuT and others added 5 commits February 22, 2024 16:01
Update IPEX to 2.2.0+cpu and refactor for _ipex_optimize.
* optimize

* update

* fix style & move use_fuse_rope

* add ipex version check

* fix style

* update

* fix style

* meet comments

* address comments

* fix style
@shane-huang (Contributor): Clean up the unused files.

@ivy-lv11 ivy-lv11 closed this Feb 23, 2024
@ivy-lv11 ivy-lv11 deleted the install-win-gpu branch March 29, 2024 01:02