Fused Rotary Embeddings (fixed) #1108

yang · 2024-01-02T19:09:24Z

Original PR started by Stella:

This is the start of an implementation of fused rotary embeddings for GPT-NeoX #1077. I assumed that it would be pretty straightforward despite my lack of familiarity with CUDA, but I've been disabused of that notion.

Currently the issue is that there are errors when trying to compile the CUDA extension.

Ports the fix from NVIDIA/apex#1750 into this branch.

Avoid .contiguous()

Just checked and this should work for bf16. Or, at least, the reason I originally thought it wouldn't doesn't apply.

update from main

Add `self.rope_fusion = neox_args.rope_fusion` so that `ParallelSelfAttention` knows if we're using rope fusion.

Just needed to bring in the latest headers/sources and call into them the right way from transformer.py.
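For readers unfamiliar with what the `rope_fusion` flag switches between, here is a minimal pure-Python sketch. It is not the code in this PR: `apply_rope_reference`, `AttentionSketch`, and the `fused_fn` argument are hypothetical names for illustration, and the reference path assumes the common "rotate half" rotary formulation with base 10000. The real fused path is the CUDA kernel added in the commits below.

```python
import math


def apply_rope_reference(x, pos, base=10000.0):
    """Unfused reference: rotate the pairs (x[i], x[i + d/2]) by pos * base**(-2i/d).

    Assumes the "rotate half" layout; `x` is a flat list of even length.
    """
    dim = len(x)
    if dim % 2 != 0:
        raise ValueError("rotary dimension must be even")
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        angle = pos * base ** (-2.0 * i / dim)
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = x[i], x[i + half]
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out


class AttentionSketch:
    """Hypothetical stand-in for an attention module that stores the flag."""

    def __init__(self, rope_fusion, fused_fn=None):
        # Mirrors the idea of `self.rope_fusion = neox_args.rope_fusion`.
        self.rope_fusion = rope_fusion
        # Placeholder for the compiled kernel; an assumption for illustration only.
        self._fused_fn = fused_fn

    def apply_rope(self, x, pos):
        if self.rope_fusion:
            if self._fused_fn is None:
                raise RuntimeError("rope_fusion is set but no fused kernel is available")
            return self._fused_fn(x, pos)
        return apply_rope_reference(x, pos)
```

Storing the flag on the module at construction time means the forward pass can branch without needing access to the full argument object, which seems to be the intent of the `self.rope_fusion` commit.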

StellaAthena · 2024-01-02T19:31:14Z

Supersedes #1103

configs/125M_fused_rope.yml

StellaAthena and others added 27 commits November 15, 2023 11:38

Create fused_rotary_positional_embedding.cpp

4001ce1

Create fused_rotary_positional_embedding.h

54848a0

Create fused_rotary_positional_embedding_cuda.cu

e63242d

Update fused_rotary_positional_embedding.h

aaa000c

Ports the fix from NVIDIA/apex#1750 into this branch.

Update neox_args.py

4808ced

Update setup.py

6e8e92a

Update initialize.py

eed2204

Update setup.py

6ff6180

Update __init__.py

0bd0839

Update test_fused_kernels.py

11a3798

Update setup.py

8bf4bc4

Create fused_rope.py

d86c399

Update fused_rotary_positional_embedding.h

417f55c

Update fused_rotary_positional_embedding.cpp

5b1331d

Update fused_rotary_positional_embedding.cpp

f20f1f5

Merge pull request EleutherAI#1085 from EleutherAI/avoid-.contiguous()

af387b0

Avoid .contiguous()

Update transformer.py

e35955b

Update transformer.py

fc214d0

Just checked and this should work for bf16. Or, at least, the reason I originally thought it wouldn't doesn't apply.

Update transformer.py

f745997

Create 125M_fused_rope.yml

29f50ae

Update 125M_fused_rope.yml

491806b

Merge pull request EleutherAI#1089 from EleutherAI/main

feb1433

update from main

Update transformer.py

589323d

Add `self.rope_fusion = neox_args.rope_fusion` so that `ParallelSelfAttention` knows if we're using rope fusion.

Update NeoXArgs docs automatically

7f3ae33

Merge branch 'main' into fused-rope

a3572c7

Update NeoXArgs docs automatically

59f6f3b

Fix fused rope

3058110

Just needed to bring in the latest headers/sources, and call into it the right way from transformers.py.

yang requested a review from Quentin-Anthony as a code owner January 2, 2024 19:09

StellaAthena linked an issue Jan 2, 2024 that may be closed by this pull request

Apply new fused rotary embedding #1077

Closed

Merge branch 'main' into fused-rope-dev

2f180ee

Quentin-Anthony reviewed Jan 4, 2024

Review comment on configs/125M_fused_rope.yml (outdated, resolved)

Add rope_fusion arg to all ymls

d60088e

Quentin-Anthony approved these changes Jan 5, 2024

Quentin-Anthony merged commit 77605ca into EleutherAI:main Jan 5, 2024
3 of 5 checks passed

Quentin-Anthony mentioned this pull request Jan 5, 2024

Fused Rotary Embeddings #1103

Closed
