
AssertionError: Dynamic dims not currently supported in tensor_slice function of IREE #268

Open
ammarhd opened this issue Nov 13, 2024 · 7 comments



ammarhd commented Nov 13, 2024

We actually had this working: stateless_llama.py was running fine via shark_turbine up until last week. Then, all of a sudden, it started failing. Unfortunately some of our libraries were not pinned to specific versions, and I suspect a few updates released last week broke things. I tried older versions of iree-turbine and torch but could not get it working again.

Now we are getting the following error:

Traceback (most recent call last):
  File "/app/app/services/../pipelines/llama_2_pipeline/run_pipeline.py", line 92, in slice_up_to_step
    sliced = IREE.tensor_slice(
  ...
  File "/usr/local/lib/python3.10/dist-packages/iree/turbine/aot/support/procedural/iree_emitter.py", line 285, in tensor_slice
    result.set_dynamic_dim_values(result_dynamic_dims)
  ...
  File "/usr/local/lib/python3.10/dist-packages/iree/turbine/aot/support/procedural/primitives.py", line 168, in set_dynamic_dim_values
    assert len(values) == 0, "Dynamic dims not currently supported"
AssertionError: Dynamic dims not currently supported

I have used the following versions (and tested a few alternatives):

torch 2.5.1 (also tested 2.4.1)
iree-base-compiler 2.9.0
iree-base-runtime 2.9.0
iree-turbine 2.9.0 (also tried 2.5 and 2.3)
Python 3.10 and 3.11
rocm/dev-ubuntu-22.04:6.0.2 and 6.1

@NoumanAmir657

@ammarhd Not really related to your issue, but did you run into this error?

error: "<eval_with_key>.0 from /home/nouman-10x/shark-model-dev/turbine_venv/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py:551 in wrapped":95:0: 'torch.aten.copy_' op operand #0 must be Multi-dimensional array modeling Torch's Tensor type, but got '!torch.vtensor<[?,?],f32>'

Also, you might want to try:

iree-compiler: 20240828.999
iree-runtime: 20240828.999
transformers: 4.37.1
torch: 2.3.0+cpu
shark-turbine: 2.4.1


marbre commented Nov 18, 2024

As the error message states, dynamic dims are currently not supported; support was removed with commit 97e0517. The backing APIs had been deprecated for a year, and torch.export.dynamic_dim() was removed with pytorch/pytorch@b454c51. Thus, you will need to refactor your code if you want to use newer iree-turbine and/or torch versions.

@NoumanAmir657

> As the error message states, dynamic dims are currently not supported; support was removed with commit 97e0517. The backing APIs had been deprecated for a year, and torch.export.dynamic_dim() was removed with pytorch/pytorch@b454c51. Thus, you will need to refactor your code if you want to use newer iree-turbine and/or torch versions.

If support for dynamic dims has been removed, how is one supposed to provide dynamic input to an LLM? Sorry if this is a dumb question, but I would appreciate any pointers or existing examples.

@stellaraccident

This comment is not correct. Support for the pytorch pre-release API for specifying dynamic shapes was removed because pytorch removed it. It had been deprecated for a long time, with the recommendation to move to the official API.

See pytorch's documentation: https://pytorch.org/docs/stable/export.html#expressing-dynamism

This is what we support.
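
Concretely, the supported path looks something like the sketch below. The Toy module is just an illustration, and I'm assuming here that the resulting ExportedProgram is handed straight to iree-turbine's aot.export; adjust to however your pipeline drives the export:

```python
import torch
from torch.export import Dim, export

class Toy(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

# Removed pre-release style (no longer available in current torch):
#   constraints = [torch.export.dynamic_dim(x, 1) <= 4096]
#   ep = export(Toy(), (x,), constraints=constraints)

# Official API: declare symbolic dims and pass them via dynamic_shapes.
seq = Dim("seq", max=4096)
ep = export(Toy(), (torch.randn(1, 16),), dynamic_shapes={"x": {1: seq}})

# Assumption: the current iree.turbine.aot.export entry point accepts an
# ExportedProgram directly; older versions want the nn.Module plus example args.
from iree.turbine import aot
module = aot.export(ep)
module.print_readable()
```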

@stellaraccident

We don't have a replacement mechanism for specifying dynamic shapes across turbine functions except to use old versions. It is likely that stateless_llama can be reworked to the new API, but I haven't looked at it for a long time, since we have been using a more direct approach to export LLM models with torch.export.

@NoumanAmir657

> we have been using a more direct approach to export LLM models with torch.export

Can you share any relevant examples?

@stellaraccident

When turbine was first written, the export path did not support mutable arguments or buffers, and this required the metaprogramming hack that stateless_llama did to tie KV cache state to a global variable. It had a number of issues, not the least of which was that it used a progression of bigger-by-1 intermediate buffers to get across the torch barrier (this kills caching allocation schemes and costs a lot of perf). It also made unnecessary copies.

The new work that my group does uses mutable function arguments to pass a fixed-size KV cache in for in-place operation. I've also been told of another group, whose code is not publicly accessible, that uses torch module-level buffers to the same effect.
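
Very roughly, the mutable-argument flavor looks like the sketch below. This is not our actual code: the toy attention math, shapes, and names are made up for illustration; the point is that the caller owns a fixed-size cache and the exported function updates it in place, so no dynamic dims are needed.

```python
import torch
from torch.export import export

class DecodeStep(torch.nn.Module):
    """Toy decode step that updates a caller-owned, fixed-size KV cache in place."""

    def forward(self, x, cache_k, cache_v, pos):
        # x: [batch, 1, dim] new token state; cache_{k,v}: [batch, max_seq, dim].
        k, v = x, x  # a real model would project x to keys/values here
        # Write this step's K/V into the fixed-size cache at position `pos`.
        cache_k.index_copy_(1, pos, k)
        cache_v.index_copy_(1, pos, v)
        # Toy attention over the (zero-padded) cache.
        attn = torch.softmax(x @ cache_k.transpose(1, 2), dim=-1)
        return attn @ cache_v

batch, max_seq, dim = 1, 256, 64
args = (
    torch.randn(batch, 1, dim),
    torch.zeros(batch, max_seq, dim),
    torch.zeros(batch, max_seq, dim),
    torch.tensor([0]),
)
# Recent torch.export records the in-place cache updates as input mutations
# in the graph signature rather than rejecting them.
ep = export(DecodeStep(), args)
```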

Our canonical example is still pre-release and more complicated than it should be, but it's here: https://github.com/nod-ai/shark-ai/blob/main/sharktank/sharktank/examples/export_paged_llm_v1.py

Note that this uses a paged KV cache by default, which requires a much more complicated inference sequence intended for serving. The flags in there for a "direct" cache approximate what stateless_llama was doing with a single, linear cache.

For even simpler use (and what I think the other group did), you can just create a torch wrapper module that registers a buffer for the KV cache and then slices it to an expected size when interfacing with something like a transformers model. We don't have a public example of that, afaik.
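
Something along these lines, very roughly (the inner decoder here is a stand-in, and wiring a real transformers past_key_values layout in would take more care than this sketch shows):

```python
import torch

class ToyDecoderStep(torch.nn.Module):
    """Stand-in for a transformers-style decode step; returns output plus new K/V."""

    def forward(self, hidden, past_k, past_v):
        new_k, new_v = hidden, hidden  # a real model would project hidden here
        attn = torch.softmax(hidden @ past_k.transpose(1, 2), dim=-1)
        return attn @ past_v, new_k, new_v

class CachedWrapper(torch.nn.Module):
    """Owns the KV cache as module buffers and slices a fixed window for the inner model."""

    def __init__(self, inner, max_seq=256, window=128, dim=64):
        super().__init__()
        self.inner = inner
        self.window = window
        # Fixed-size caches registered as buffers, so export treats them as module state.
        self.register_buffer("cache_k", torch.zeros(1, max_seq, dim))
        self.register_buffer("cache_v", torch.zeros(1, max_seq, dim))

    def forward(self, hidden, pos):
        # Static slice to the expected size (no dynamic dims), then write the new
        # K/V for this step back into the buffers in place.
        out, new_k, new_v = self.inner(
            hidden, self.cache_k[:, : self.window], self.cache_v[:, : self.window]
        )
        self.cache_k.index_copy_(1, pos, new_k)
        self.cache_v.index_copy_(1, pos, new_v)
        return out

wrapper = CachedWrapper(ToyDecoderStep())
# Buffer mutations are captured by torch.export as part of the graph signature.
ep = torch.export.export(wrapper, (torch.randn(1, 1, 64), torch.tensor([0])))
```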
