Qwen layernorm as input #12309

hkvision · 2024-10-31T10:02:54Z

Using layernorm as input seems to be faster for qwen-cw.

cyita · 2024-11-01T03:06:31Z

Also add `parser.add_argument("--quantization_group_size", type=int, default=0)` and `quantization_group_size=args.quantization_group_size,` in this PR?

hkvision · 2024-11-01T03:07:39Z

> Also add `parser.add_argument("--quantization_group_size", type=int, default=0)` and `quantization_group_size=args.quantization_group_size,` in this PR?

Sure.
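For readers following the thread, here is a minimal sketch of what the requested change might look like. Only the `--quantization_group_size` argument and the `quantization_group_size=args.quantization_group_size` keyword come from the comment above. The helper names `build_parser` and `model_load_kwargs` are hypothetical, and the actual model-loading call in the example script is not shown in this conversation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser with the flag requested in the review."""
    parser = argparse.ArgumentParser(description="Example script options (sketch)")
    # Flag quoted in the review comment; the default of 0 is taken from that comment.
    parser.add_argument("--quantization_group_size", type=int, default=0)
    return parser


def model_load_kwargs(args: argparse.Namespace) -> dict:
    """Hypothetical helper: collect keyword arguments for the model loader.

    The real script would pass this keyword directly to its own loading call,
    which is not shown in this thread.
    """
    return {"quantization_group_size": args.quantization_group_size}


# Usage with an explicit argument list, so the sketch does not depend on sys.argv.
example_args = build_parser().parse_args(["--quantization_group_size", "128"])
example_kwargs = model_load_kwargs(example_args)
```

Passing the parsed value straight through as a keyword argument keeps the command-line default and the loader call in sync, which appears to be the point of the request.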

cyita

LGTM

qwen layernorm as input

8ab747e

hkvision force-pushed the qwen-perf branch from ec1e372 to 58ced9f Compare November 4, 2024 01:49

cyita approved these changes Nov 4, 2024


hkvision merged commit c8679ad into intel-analytics:main Nov 4, 2024
1 check passed

hkvision deleted the qwen-perf branch November 4, 2024 01:51

add group size

58ced9f
