What happened?
I have a reproducer of numerical issues when executing a sharded Llama model for llvm-cpu. It is a small-size model with random weights and inputs. The numerics are off compared to the PyTorch equivalent.

The prefill step result is almost correct, with ~1e-2 absolute error, but the paged cache state is way off and everything in the decode step is wrong.

The reproducer with iree-run-module attached here does not check the cache state, as it is an in-out argument. The source code that was used to produce this data is here; there the in-out arguments are checked as well.

The paged cache is passed as in-out arguments that need to be updated in place: the last 2 arguments of both the prefill and decode functions. However, I don't see any usage of these arguments. My initial hypothesis is that the cache update was erroneously optimized out by dead code elimination, because we are not generating cache-update code that is truly in-place.
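For concreteness, the comparison involved looks roughly like this (a minimal Python sketch; the file names are hypothetical and assume both sides dumped their results as .npy files, e.g. via iree-run-module's --output=@file.npy and numpy.save on the PyTorch side):

```python
import numpy as np

def max_abs_error(result_path: str, reference_path: str) -> float:
    # Load a dumped IREE result and its PyTorch reference and compare them.
    result = np.load(result_path)
    reference = np.load(reference_path)
    return float(np.abs(result - reference).max())

# Prefill result: almost correct, ~1e-2 absolute error.
print(max_abs_error("iree_prefill_result.npy", "torch_prefill_result.npy"))
# Paged cache state: way off, which corrupts everything in the decode step.
print(max_abs_error("iree_cache_state.npy", "torch_cache_state.npy"))
```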
Steps to reproduce your issue
You would need this change to compile.
1. Download and extract sharded-toy-llama-inaccuracy-reproducer-2.zip.
2. ./compile.sh
3. ./run.sh
To verify the cache state, you would need to run this test.
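The essence of that in-out check is roughly the following (a self-contained Python sketch with illustrative names; the real test exercises the actual sharded model and paged cache):

```python
import torch

def prefill(tokens: torch.Tensor, cache_state: list[torch.Tensor]) -> torch.Tensor:
    # Stand-in for the model's prefill step. A correct implementation must
    # write the new KV entries into cache_state in place, since it is an
    # in-out argument observed by the caller.
    cache_state[0].add_(1.0)  # placeholder for the real in-place cache update
    return tokens.float()

cache_state = [torch.zeros(4, 8)]
before = cache_state[0].clone()
prefill(torch.arange(4), cache_state)
# If the cache update were dead-code-eliminated, the buffer would be unchanged.
assert not torch.equal(cache_state[0], before), "cache state was never written"
```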
What component(s) does this issue relate to?
Compiler
Version information
To reproduce, #18663 is required.
Additional context
The fix for issue #18283 does not solve this issue.