Merge bytecodealliance:main into wenyongh:fix_spec_on_nuttx_ci #951

Merged
merged 6 commits into from
Sep 10, 2024

Conversation

wenyongh
Owner

No description provided.

loganek and others added 6 commits September 5, 2024 16:18
Those parameters can be used to reduce the size of the AOT code.

There are going to be more changes related to AOT code size reduction;
this is just the initial step.

p.s. #3758
Fixes to enable building iwasm_shared and iwasm_static libraries on win32.
Minimum support:
- [x] Accept (WasmEdge) customized model parameters (metadata).
- [x] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [x] basic
  - [x] chatml
  - [x] gemma
  - [x] llama
  - [x] qwen

---

To be supported in the future, if required:
- [ ] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [ ] command-r. (>70G memory requirement)
  - [ ] embedding. (embedding mode)
  - [ ] grammar. (use the grammar option to constrain the model to generate the JSON output)
  - [ ] llama-stream. (new APIs `compute_single`, `get_output_single`, `fini_single`)
  - [ ] llava. (image representation)
  - [ ] llava-base64-stream. (image representation)
  - [ ] multimodel. (image representation)
- [ ] Target [llamaedge](https://github.com/LlamaEdge/LlamaEdge)
- Implement TINY / STANDARD frame modes - TINY mode is only able to keep track of the IP
  and func idx, whereas STANDARD mode provides more capabilities (parameters, stack pointer etc.).
- Implement FRAME_PER_FUNCTION / FRAME_PER_CALL modes - frame-per-function mode adds
  code at the beginning and at the end of each function for allocating / deallocating the stack frame,
  whereas in frame-per-call mode the frame is allocated before each call. The exception is a call to
  an imported function, where frame-per-function mode also allocates the frame before the
  `call` instruction (as it can't instrument the imported function).
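
To illustrate the difference between the two frame modes, here is a minimal sketch of what each frame might hold. The struct names, field names and layout are hypothetical and chosen only for illustration; they are not taken from the actual implementation.

```c
#include <stdint.h>

/* Hypothetical TINY frame: just enough to reconstruct a call stack. */
typedef struct tiny_frame {
    uint32_t func_idx;  /* index of the function that owns this frame */
    uint32_t ip_offset; /* instruction pointer, stored as an offset */
} tiny_frame;

/* Hypothetical STANDARD frame: adds the extra state mentioned above
 * (stack pointer, parameters), which makes the frame larger. */
typedef struct standard_frame {
    struct standard_frame *prev; /* caller's frame */
    uint32_t func_idx;           /* same role as in tiny_frame */
    uint32_t ip_offset;          /* same role as in tiny_frame */
    uint32_t *sp;                /* operand stack pointer */
    uint32_t lp[];               /* parameters and locals (flexible array) */
} standard_frame;
```

Under these assumptions a TINY frame is a fixed two-word record, which is one plausible reason it needs less generated code to maintain than a STANDARD frame.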

At the moment TINY + FRAME_PER_FUNCTION is automatically enabled when GC and perf
profiling are disabled and the `values` call stack feature is not requested. In all other cases
STANDARD + FRAME_PER_CALL is used.
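
The selection rule described above can be sketched as follows. This is only an illustration of the stated rule: the type names, field names and the `select_frame_config` helper are hypothetical and do not correspond to identifiers in the code base.

```c
#include <stdbool.h>

/* Hypothetical enums mirroring the mode names used in this description. */
typedef enum { FRAME_TYPE_TINY, FRAME_TYPE_STANDARD } frame_type;
typedef enum { FRAME_PER_FUNCTION, FRAME_PER_CALL } frame_alloc;

/* Hypothetical option set: the three conditions named in the text. */
typedef struct {
    bool gc_enabled;
    bool perf_profiling_enabled;
    bool call_stack_values_requested; /* the `values` call stack feature */
} build_options;

typedef struct {
    frame_type type;
    frame_alloc alloc;
} frame_config;

/* TINY + FRAME_PER_FUNCTION only when none of the three conditions
 * applies; otherwise fall back to STANDARD + FRAME_PER_CALL. */
static frame_config
select_frame_config(const build_options *opts)
{
    frame_config cfg;

    if (!opts->gc_enabled && !opts->perf_profiling_enabled
        && !opts->call_stack_values_requested) {
        cfg.type = FRAME_TYPE_TINY;
        cfg.alloc = FRAME_PER_FUNCTION;
    }
    else {
        cfg.type = FRAME_TYPE_STANDARD;
        cfg.alloc = FRAME_PER_CALL;
    }
    return cfg;
}
```

In this sketch the two mixed combinations are never produced, which matches the current state described in this message.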

STANDARD + FRAME_PER_FUNCTION and TINY + FRAME_PER_CALL are currently not
implemented but possible, and might be enabled in the future.

ps. #3758
@wenyongh wenyongh merged commit dcc7ec2 into wenyongh:fix_spec_on_nuttx_ci Sep 10, 2024
8 checks passed