
tf.aliasing support #1026

Open

steeve opened this issue Nov 4, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@steeve

steeve commented Nov 4, 2024

Hi,

We (@zml) found that tf.aliasing support doesn't seem to work as expected: with the attributes enabled, the model produces garbage output, in our case Llama 3.1 8B.
This is problematic for transformer models because we rely on buffer donation for the KvCache (see the sketch at the end of this comment).

For now we're not emitting those attributes when targeting Neuron, but we're not sure that's the right fix: if the SDK doesn't support these attributes, shouldn't it just ignore them?

The llama implementation is attached.

Thank you!

llama.aliasing.mlir.txt
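
For context, here's a minimal JAX sketch of the donation pattern we mean (a hypothetical example, not our actual code: zml is Zig and emits the MLIR attributes directly). Donating the KvCache lets the runtime reuse the input buffer for the updated cache instead of allocating a fresh copy on every decoding step:

```python
import jax
import jax.numpy as jnp
from functools import partial

# Donate the cache argument so the output may alias its buffer.
@partial(jax.jit, donate_argnums=(0,))
def append_to_cache(kv_cache, new_kv, pos):
    # Write new_kv at position `pos` along the sequence axis.
    return jax.lax.dynamic_update_index_in_dim(kv_cache, new_kv, pos, 0)

cache = jnp.zeros((128, 8, 64), jnp.float32)  # (seq_len, heads, head_dim), toy sizes
new_kv = jnp.ones((8, 64), jnp.float32)
cache = append_to_cache(cache, new_kv, 3)
```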

@nalwayaakshay

Can you try using jax.buffer_donor rather than tf.aliasing_output to annotate donated buffers?

For example:
%arg2: tensor<...> {jax.buffer_donor = true, mhlo.layout_mode = "default", mhlo.sharding = "{devices=[1,1,32,1]<=[32]}"} loc("state.kv_cache[0]['cached_key']")

From your .txt file:
%arg291: tensor<256xi32> {mhlo.layout_mode = "default", mhlo.sharding = "{replicated}", tf.aliasing_output = 0 : i32}
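
As a hedged sketch (plain JAX, independent of zml), you can check which attribute the lowering emits for a donated argument by dumping the StableHLO text of a jitted function:

```python
import jax
import jax.numpy as jnp

# Toy function: the first argument is donated, the second is not.
def f(cache, x):
    return cache + x, x * 2.0

lowered = jax.jit(f, donate_argnums=(0,)).lower(
    jnp.zeros((4, 4), jnp.float32), jnp.ones((4, 4), jnp.float32))

# The donation-related argument attributes (jax.buffer_donor / tf.aliasing_output)
# appear on the entry function's arguments in the textual StableHLO.
print(lowered.as_text())
```

Comparing that output with your attached MLIR should show which of the two attributes the Neuron compiler is actually seeing.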

@steeve
Author

steeve commented Nov 8, 2024

TIL jax.buffer_donor. Unfortunately, it doesn't work either.
That being said, while the output is wrong, the tok/s doesn't change when donation is enabled, which is weird.

@aws-taylor added the bug label Nov 8, 2024
@devesr-amzn
Contributor

Can you provide steps to reproduce the issue, along with the versions of the dependencies in use (neuronx-cc, libneuronxla)?

@steeve
Author

steeve commented Nov 12, 2024

Packages:

neuronx-cc==2.15.141.0+d3cfc8ca
libneuronxla==2.0.4986.0

Check out this branch: https://github.com/zml/zml/tree/steeve/synapse

Run the Llama example on Neuron:

$ cd zml/examples
$ ./bazel.sh run -c opt //llama:Llama-3.1-8B-Instruct --@zml//runtimes:cpu=false --@zml//runtimes:neuron=true

You can re-enable donations by commenting out these lines: https://github.com/zml/zml/blob/steeve/synapse/zml/module.zig#L301-L303
