fix falcon-40b accuracy issue (microsoft#4895)

[PR microsoft#4721](microsoft#4721) added the `"DecoderLayer": 'glmtype'` entry. As a result, the Falcon model selects the 'glmtype' fused_qkv_type. The Falcon model (including FalconDecoderLayer) needs to select 'bloomtype' explicitly.

Co-authored-by: Michael Wyatt <[email protected]>
HabanaAI · Jan 10, 2024 · 16c265c
1 parent 43eba77
Showing 1 changed file with 2 additions and 1 deletion.
diff --git a/deepspeed/module_inject/fusedqkv_utils.py b/deepspeed/module_inject/fusedqkv_utils.py
@@ -38,7 +38,8 @@ def prepare_tp_fused_qkvw(module_str, src, mp_size, gpu_index):
         "MptBlock": 'glmtype',
         "BaichuanLayer": 'glmtype',
         "DecoderLayer": 'glmtype',
-        "GPTBigCodeBlock": 'bigcodetype'
+        "FalconDecoderLayer": 'bloomtype',
+        "GPTBigCodeBlock": 'bigcodetype',
     }
 
     def _codegen_type_transpose(input, mp_size, codegen_mp_num=4):
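
The overlap described in the commit message can be illustrated with a small standalone sketch. It assumes that the lookup matches dictionary keys as substrings of the module string, which is why "DecoderLayer" also hits a Falcon decoder layer. The names `FUSED_TYPE_DICT`, `matching_keys`, and `pick_fused_type` are hypothetical, and the longest-match rule is only one possible way to resolve the overlap, not necessarily what DeepSpeed does.

```python
# Hypothetical, trimmed-down copy of the mapping touched by this commit.
FUSED_TYPE_DICT = {
    "MptBlock": "glmtype",
    "BaichuanLayer": "glmtype",
    "DecoderLayer": "glmtype",
    "FalconDecoderLayer": "bloomtype",
    "GPTBigCodeBlock": "bigcodetype",
}


def matching_keys(module_str, table=FUSED_TYPE_DICT):
    """Return every key that occurs as a substring of module_str."""
    return [key for key in table if key in module_str]


def pick_fused_type(module_str, table=FUSED_TYPE_DICT):
    """Return the type of the longest matching key, or None if no key matches.

    Assumption: preferring the longest key is an illustrative tie-break only.
    """
    matches = matching_keys(module_str, table)
    if not matches:
        return None
    return table[max(matches, key=len)]


if __name__ == "__main__":
    module_str = "FalconDecoderLayer(...)"
    # Both the generic and the Falcon-specific key match this string.
    print(matching_keys(module_str))
    print(pick_fused_type(module_str))
```

Without the "FalconDecoderLayer" entry, the only key that matches a Falcon decoder layer in this sketch is "DecoderLayer", so the lookup falls back to 'glmtype', which is the behavior the commit message describes.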