
[Unity][BugFix] Fix a bug in relax gelu_tanh computation #298

Closed

Conversation

@rickzx (Contributor) commented on Nov 30, 2023

The gelu_tanh computation (i.e., the gelu_new activation in the HuggingFace API) is incorrect in the current relax code. This PR fixes the bug.
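For reference, the tanh approximation of GELU, which the test below checks against, is

$$\mathrm{GELU}_{\tanh}(x) = 0.5\,x\left(1 + \tanh\left(\sqrt{2/\pi}\,\left(x + 0.044715\,x^{3}\right)\right)\right)$$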

Correctness testing now passes (it failed previously):

import math

import torch
from tvm.relax.frontend import nn
from tvm.relax.frontend.nn import op, spec

inp = torch.randn((2, 4), dtype=torch.float32)

# Reference implementation of the tanh approximation of GELU
# (HuggingFace's gelu_new activation), computed entirely in PyTorch.
def tanh_gelu(input):
    return 0.5 * input * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (input + 0.044715 * torch.pow(input, 3.0))))

out1 = tanh_gelu(inp)

# The same activation expressed through the relax frontend.
class TanhGelu(nn.Module):
    def forward(self, x: nn.Tensor):
        return op.gelu(x, "tanh")

forward_spec = {"forward": {"x": spec.Tensor([2, 4], dtype="float32")}}
gelu = TanhGelu().jit(forward_spec)
out2 = gelu["forward"](inp)

assert torch.allclose(out1, out2)

Also updated the unit tests.

@rickzx (Author) commented on Nov 30, 2023

As @CharlieFRuan suggested, I opened another PR against the tvm unity branch: apache/tvm#16188. Closing this one.

@rickzx closed this on Nov 30, 2023