Add gemma-2-2b-it and gemma-2-9b-it to igpu nightly performance test (#11778)

* add yaml and modify `concat_csv.py` for `transformers` 4.43.1 (#11758)
* add yaml and modify `concat_csv.py` for `transformers` 4.43.1
* remove 4.43 for arc; fix;
* remove 4096-512 for 4.43
* comment some models
* Small fix
* uncomment models (#11777)

---------

Co-authored-by: Ch1y0q <[email protected]>
1 parent a88c132; commit ec184af
Showing 9 changed files with 296 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '1024-128' | ||
test_api: | ||
- "transformer_int4_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
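
Each `in_out_pairs` entry such as `'1024-128'` appears to encode an input length and an output length separated by a hyphen. The benchmark code that consumes these files is not part of this diff, so the following is only a minimal sketch of how such a string could be parsed; the function names `parse_in_out_pair` and `parse_in_out_pairs` are hypothetical, and reading the two numbers as token counts is an assumption.

```python
from typing import List, Tuple


def parse_in_out_pair(pair: str) -> Tuple[int, int]:
    """Split a string such as '1024-128' into (input_len, output_len).

    Assumption: the first number is the prompt length and the second is the
    number of generated tokens. The diff itself does not state this.
    """
    in_len, sep, out_len = pair.partition("-")
    if not sep or not in_len.isdigit() or not out_len.isdigit():
        raise ValueError(f"expected '<input>-<output>', got {pair!r}")
    return int(in_len), int(out_len)


def parse_in_out_pairs(pairs: List[str]) -> List[Tuple[int, int]]:
    """Parse every entry of an in_out_pairs list."""
    return [parse_in_out_pair(p) for p in pairs]


if __name__ == "__main__":
    print(parse_in_out_pairs(["1024-128", "32-32"]))
```

Rejecting malformed entries early keeps a typo in a YAML file from surfacing only after a model has been loaded.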
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/1024-128_int4_fp16_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '1024-128' | ||
test_api: | ||
- "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows, use fp16 for non-linear layer | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/1024-128_int4_fp16_loadlowbit_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '1024-128' | ||
test_api: | ||
- "transformer_int4_fp16_loadlowbit_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/2048-256_int4_fp16_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '2048-256' | ||
test_api: | ||
- "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/3072-384_int4_fp16_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '3072-384' | ||
test_api: | ||
- "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/32-32_int4_fp16_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 3 | ||
num_trials: 5 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '32-32' | ||
test_api: | ||
- "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
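
The `32-32` configuration sets `warm_up: 3` and `num_trials: 5`, where the longer-input configurations in this commit use 1 and 3. How the harness interprets these keys is not shown in this diff. A common reading, assumed here, is that the first `warm_up` runs are executed but discarded and the next `num_trials` runs are measured. The sketch below illustrates that reading only; `timed_trials` is a hypothetical helper, not a function from the repository.

```python
import time
from statistics import mean
from typing import Callable, List


def timed_trials(run_once: Callable[[], None], warm_up: int, num_trials: int) -> List[float]:
    """Call run_once warm_up + num_trials times and return the timings
    (in seconds) of the measured runs only.

    Assumption: warm-up runs are executed but excluded from the results.
    """
    timings: List[float] = []
    for i in range(warm_up + num_trials):
        start = time.perf_counter()
        run_once()
        elapsed = time.perf_counter() - start
        if i >= warm_up:  # skip the warm-up iterations
            timings.append(elapsed)
    return timings


if __name__ == "__main__":
    # Stand-in workload; a real run would call model generation here.
    results = timed_trials(lambda: sum(range(100_000)), warm_up=3, num_trials=5)
    print(f"{len(results)} measured runs, mean {mean(results):.6f} s")
```

Short prompts finish quickly, so extra warm-up runs and trials cost little time there, which may be why this file uses larger values; the commit does not say.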
14 changes: 14 additions & 0 deletions
python/llm/test/benchmark/igpu-perf/4096-512_int4_fp16_443.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
repo_id: | ||
- 'google/gemma-2-2b-it' | ||
- 'google/gemma-2-9b-it' | ||
local_model_hub: 'path to your local model hub' | ||
warm_up: 1 | ||
num_trials: 3 | ||
num_beams: 1 # default to greedy search | ||
low_bit: 'sym_int4' # default to use 'sym_int4' (i.e. symmetric int4) | ||
batch_size: 1 # default to 1 | ||
in_out_pairs: | ||
- '4096-512' | ||
test_api: | ||
- "transformer_int4_fp16_gpu_win" # on Intel GPU for Windows (catch GPU peak memory) | ||
cpu_embedding: True # whether to put embedding on CPU (only available now for gpu win related test_api) |
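
The five `*_int4_fp16_443.yaml` files shown above share every key except `in_out_pairs`, plus the `warm_up` and `num_trials` values of the `32-32` case. As an illustration only, the sketch below rebuilds the same settings from one template as plain dictionaries. `BASE`, `OVERRIDES`, and `build_configs` are hypothetical names; the commit itself adds the files by hand, and nothing here writes to disk.

```python
from copy import deepcopy
from typing import Dict, List

# Values copied from the YAML files in this commit.
BASE = {
    "repo_id": ["google/gemma-2-2b-it", "google/gemma-2-9b-it"],
    "local_model_hub": "path to your local model hub",
    "warm_up": 1,
    "num_trials": 3,
    "num_beams": 1,
    "low_bit": "sym_int4",
    "batch_size": 1,
    "test_api": ["transformer_int4_fp16_gpu_win"],
    "cpu_embedding": True,
}

# Only the 32-32 file departs from the template.
OVERRIDES = {"32-32": {"warm_up": 3, "num_trials": 5}}


def build_configs(pairs: List[str]) -> Dict[str, dict]:
    """Map each file name to its configuration dictionary."""
    configs = {}
    for pair in pairs:
        cfg = deepcopy(BASE)  # avoid sharing the nested lists
        cfg["in_out_pairs"] = [pair]
        cfg.update(OVERRIDES.get(pair, {}))
        configs[f"{pair}_int4_fp16_443.yaml"] = cfg
    return configs


if __name__ == "__main__":
    pairs = ["32-32", "1024-128", "2048-256", "3072-384", "4096-512"]
    for name, cfg in build_configs(pairs).items():
        print(name, cfg["warm_up"], cfg["num_trials"])
```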