What happened?
I have a reproducer of numerical issues when executing a sharded Llama model for llvm-cpu. It is a small-size model with random weights and inputs. The numerics are off compared to the PyTorch equivalent.

The prefill step result is almost correct, with ~1e-2 absolute error, but the paged cache state is way off and everything in the decode step is wrong.

The reproducer with iree-run-module attached here does not check the cache state, as it is an in-out argument. The source code that was used to produce this data is here; there the in-out arguments are checked as well.

The paged cache is passed as in-out arguments that need to be updated in place: the last 2 arguments of both the prefill and decode functions. However, I don't see any usage of these arguments. My initial hypothesis is that the cache update was erroneously optimized out by dead code elimination, because we are not generating cache-update code that is truly in-place.
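For concreteness, the comparison involved looks roughly like this (a minimal Python sketch; the file names are hypothetical and assume both sides dumped their results as .npy files, e.g. via iree-run-module's --output=@file.npy and numpy.save on the PyTorch side):

```python
import numpy as np

def max_abs_error(result_path: str, reference_path: str) -> float:
    # Load a dumped IREE result and its PyTorch reference and compare them.
    result = np.load(result_path)
    reference = np.load(reference_path)
    return float(np.abs(result - reference).max())

# Prefill result: almost correct, ~1e-2 absolute error.
print(max_abs_error("iree_prefill_result.npy", "torch_prefill_result.npy"))
# Paged cache state: way off, which corrupts everything in the decode step.
print(max_abs_error("iree_cache_state.npy", "torch_cache_state.npy"))
```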
Steps to reproduce your issue
You would need this change to compile.
1. Download and extract sharded-toy-llama-inaccuracy-reproducer-2.zip.
2. ./compile.sh
3. ./run.sh
To verify the cache state, you would need to run this test.
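The essence of that in-out check is roughly the following (a self-contained Python sketch with illustrative names; the real test exercises the actual sharded model and paged cache):

```python
import torch

def prefill(tokens: torch.Tensor, cache_state: list[torch.Tensor]) -> torch.Tensor:
    # Stand-in for the model's prefill step. A correct implementation must
    # write the new KV entries into cache_state in place, since it is an
    # in-out argument observed by the caller.
    cache_state[0].add_(1.0)  # placeholder for the real in-place cache update
    return tokens.float()

cache_state = [torch.zeros(4, 8)]
before = cache_state[0].clone()
prefill(torch.arange(4), cache_state)
# If the cache update were dead-code-eliminated, the buffer would be unchanged.
assert not torch.equal(cache_state[0], before), "cache state was never written"
```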
What component(s) does this issue relate to?
Compiler
Version information
To reproduce, #18663 is required.
Additional context
The fix for issue #18283 does not solve this issue.