[LLM] Add qwen1.5-7B in iGPU perf (#10127)
* Add qwen1.5 test config yaml with transformers 4.37.0
* Update for yaml file
1 parent e841b66, commit 5c42294
Showing 6 changed files with 152 additions and 4 deletions.
@@ -0,0 +1,13 @@
repo_id:
  - 'Qwen/Qwen1.5-7B-Chat'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '1024-128'
test_api:
  - "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api)
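Each `in_out_pairs` entry packs two numbers into one string. The sketch below shows one way such a string could be split, assuming the first number is the prompt length and the second is the number of generated tokens. `InOutPair` and `parse_in_out_pair` are hypothetical names for illustration, not functions from the benchmark scripts.

```python
from typing import NamedTuple


class InOutPair(NamedTuple):
    """Prompt length and generation length in tokens (assumed meaning)."""
    in_len: int
    out_len: int


def parse_in_out_pair(pair: str) -> InOutPair:
    """Split an '<in>-<out>' string such as '1024-128' into two integers."""
    in_str, sep, out_str = pair.partition("-")
    # Reject anything that is not exactly two decimal numbers joined by '-'.
    if not sep or not in_str.isdecimal() or not out_str.isdecimal():
        raise ValueError(f"expected '<in>-<out>', got {pair!r}")
    return InOutPair(int(in_str), int(out_str))


# The value used in the config above.
pairs = ["1024-128"]
parsed = [parse_in_out_pair(p) for p in pairs]
```

Validating the string up front means a malformed entry fails when the config is read rather than partway through a long benchmark run.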
@@ -0,0 +1,13 @@
repo_id:
  - 'Qwen/Qwen1.5-7B-Chat'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '2048-256'
test_api:
  - "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api)
@@ -0,0 +1,13 @@
repo_id:
  - 'Qwen/Qwen1.5-7B-Chat'
local_model_hub: 'path to your local model hub'
warm_up: 3
num_trials: 5
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '32-32'
test_api:
  - "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api)
@@ -0,0 +1,13 @@
repo_id:
  - 'Qwen/Qwen1.5-7B-Chat'
local_model_hub: 'path to your local model hub'
warm_up: 1
num_trials: 3
num_beams: 1 # default to greedy search
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4)
batch_size: 1 # default to 1
in_out_pairs:
  - '32-512'
test_api:
  - "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory)
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api)
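The four configs shown above differ only in `in_out_pairs`, `warm_up`, and `num_trials`. As a rough comparison, the sketch below tallies how many generation runs each config would trigger, assuming warm-up runs execute before the measured trials and are left out of the reported numbers. That reading of `warm_up` is an assumption, and `PerfConfig` and `summarize` are hypothetical helpers, not part of the benchmark code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfConfig:
    """The three fields that vary across the four YAML files."""
    in_out_pair: str
    warm_up: int
    num_trials: int

    @property
    def total_runs(self) -> int:
        # Assumption: warm-up runs happen in addition to the measured trials.
        return self.warm_up + self.num_trials


# Values copied from the four configs in this commit.
CONFIGS = [
    PerfConfig("1024-128", warm_up=1, num_trials=3),
    PerfConfig("2048-256", warm_up=1, num_trials=3),
    PerfConfig("32-32", warm_up=3, num_trials=5),
    PerfConfig("32-512", warm_up=1, num_trials=3),
]


def summarize(configs: list[PerfConfig]) -> dict[str, int]:
    """Map each in/out pair to its total number of runs."""
    return {c.in_out_pair: c.total_runs for c in configs}


summary = summarize(CONFIGS)
```

Under that assumption, the short `32-32` case runs more warm-ups and trials than the other three configs.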