[WASM] Implement concat embeddings #17404

CharlieFRuan · 2024-09-23T00:06:02Z

This PR implements tvmjs.runtime.ConcatEmbeddings in C++ and exposes it to runtime.ts's Instance class as concatEmbeddings().

The method takes a vector of NDArrays, where array i has shape (m_i, hidden_size), and returns an NDArray of shape (\sum_{i} m_i, hidden_size).
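The shape contract above can be illustrated with a minimal sketch on plain typed arrays. This is not the C++ implementation from the PR and does not use the tvmjs API: the `Embedding` interface and `concatEmbeddingsSketch` function are hypothetical names introduced only for illustration, and row-major float32 data is assumed.

```typescript
// Hypothetical stand-in for an NDArray: row-major data plus a 2-D shape.
// This is NOT the tvmjs NDArray type; it only illustrates the shape contract.
interface Embedding {
  data: Float32Array;
  shape: [number, number]; // (m_i, hidden_size)
}

// Concatenate along the first axis: rows add up, hidden_size must match.
function concatEmbeddingsSketch(inputs: Embedding[]): Embedding {
  if (inputs.length === 0) {
    throw new Error("expected at least one embedding");
  }
  const hiddenSize = inputs[0].shape[1];
  let totalRows = 0;
  for (const e of inputs) {
    if (e.shape[1] !== hiddenSize) {
      throw new Error("hidden_size mismatch");
    }
    if (e.data.length !== e.shape[0] * e.shape[1]) {
      throw new Error("data length does not match shape");
    }
    totalRows += e.shape[0];
  }
  // Row-major layout means each input can be copied in as one contiguous chunk.
  const out = new Float32Array(totalRows * hiddenSize);
  let offset = 0;
  for (const e of inputs) {
    out.set(e.data, offset);
    offset += e.data.length;
  }
  return { data: out, shape: [totalRows, hiddenSize] };
}

// Usage: shapes (2, 2) and (1, 2) combine into (3, 2).
const a: Embedding = { data: new Float32Array([1, 2, 3, 4]), shape: [2, 2] };
const b: Embedding = { data: new Float32Array([5, 6]), shape: [1, 2] };
const merged = concatEmbeddingsSketch([a, b]);
console.log(merged.shape, Array.from(merged.data));
```

Because the concatenation is along the first axis of row-major data, the result is simply the inputs laid end to end, which is why a single contiguous copy per input suffices in this sketch.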

In addition, this PR:

- Adds `std` as a reserved name for WebGPU codegen
- Allows the data argument of copyFrom() to be any of several typed array types
- Allows this.dtype === uint32 for copyFrom()
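To make the copyFrom() items concrete, here is a rough sketch of what accepting a different typed array for a given dtype could look like. It is an assumption-laden illustration, not the code in runtime.ts: `castForDtype` and `NumericTypedArray` are hypothetical names, and the set of dtypes handled is chosen only for the example.

```typescript
// Hypothetical union of typed arrays a caller might pass in.
type NumericTypedArray =
  | Int8Array
  | Uint8Array
  | Int32Array
  | Uint32Array
  | Float32Array
  | Float64Array;

// Hypothetical helper (not part of tvmjs): convert arbitrary numeric input
// into the typed array that matches the target dtype string.
function castForDtype(
  dtype: string,
  data: number[] | NumericTypedArray
): NumericTypedArray {
  const values: number[] = Array.from(data);
  switch (dtype) {
    case "float32":
      return new Float32Array(values);
    case "float64":
      return new Float64Array(values);
    case "int32":
      return new Int32Array(values);
    case "uint32":
      return new Uint32Array(values);
    case "int8":
      return new Int8Array(values);
    case "uint8":
      return new Uint8Array(values);
    default:
      throw new Error(`unsupported dtype: ${dtype}`);
  }
}

// Usage: a Float32Array source written into a uint32 destination.
const tokenIds = castForDtype("uint32", new Float32Array([1, 2, 3]));
console.log(tokenIds instanceof Uint32Array, Array.from(tokenIds));
```

The point of the sketch is only that the source array type and the destination dtype need not match; the conversion rules actually applied by copyFrom() are defined in the PR's diff.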

### Changes
- The only change is the support of Phi-3.5-vision:
  - mlc-ai#563
  - Added `Phi-3.5-vision-instruct-q4f16_1-MLC` and `Phi-3.5-vision-instruct-q4f32_1-MLC` to prebuilt model list
  - See `examples/vision-model` on how to use vision language model in WebLLM

### TVMjs
- Compiled at apache/tvm@931efc7
- Cherry-picked apache/tvm#17404 on top
- Note this does not require us to recompile non-vision models because text-only inputs will not need embeddings concatenation
- WASMs: still the same `v0_2_48` WASMs

CharlieFRuan added 2 commits September 22, 2024 20:02

[WASM] Implement concat embeddings

d34e885

Make concatEmbeddings optional for backward compatibility

eedfeef

This was referenced Sep 23, 2024

[WASM] Add phi3.5-vision to webllm mlc-ai/binary-mlc-llm-libs#140

Merged

[Vision] Support Phi-3.5-vision, the first VLM in WebLLM mlc-ai/web-llm#563

Merged

[Version] Bump version to 0.2.65 mlc-ai/web-llm#569

Merged

tqchen approved these changes Sep 23, 2024

tqchen merged commit 44808b4 into apache:main Sep 23, 2024
15 checks passed

ysh329 mentioned this pull request Oct 16, 2024

[Release] v0.18.0 Release Candidate Notes #17468

Closed