
feat: automatic batching #104

Merged 13 commits from gilad/autoBatching into beta on Nov 26, 2023

Conversation

@giladgd (Contributor) commented on Nov 26, 2023

Description of change

  • feat: evaluate multiple sequences in parallel with automatic batching (see the sketch below)
  • feat: improve automatic chat wrapper resolution
  • feat: smart context shifting
  • feat: improve TS types
  • refactor: improve API
  • build: support beta releases
  • build: improve dev configurations

BREAKING CHANGE: completely new API (docs will be updated before a stable version is released)
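
As a quick illustration of the parallel-evaluation item above, here is a minimal sketch of how the new surface looks; the model path is a placeholder, and the calls mirror the example given later in this thread:

```js
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
// placeholder path; any local GGUF model works here
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});

// a single context can host multiple independent sequences; evaluations
// running at the same time on different sequences of the same context
// are batched together automatically
const context = await model.createContext({sequences: 2});

const sequence1 = context.getSequence();
const sequence2 = context.getSequence();
// prompt on sequence1 and sequence2 concurrently to get batching
```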

Closes #85
Fixes #102
Fixes #94
Fixes #93
Fixes #76

Things left to do (in other PRs)

  • Update documentation
  • Use the smart context shifting support in LlamaChatSession
  • Add contexts manager to automatically create more contexts as needed
  • Improve grammar support
  • Try to disable llama.cpp logs by default
  • Add migration guide from v2 to v3
  • Add more tests

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

@giladgd self-assigned this on Nov 26, 2023
@ido-pluto (Contributor) left a comment:

LGTM

@giladgd merged commit 4757af8 into beta on Nov 26, 2023 (14 checks passed) and deleted the gilad/autoBatching branch on Nov 26, 2023 at 19:29.

🎉 This PR is included in version 3.0.0-beta.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

@giladgd mentioned this pull request on Dec 6, 2023
@giladgd added this to the v3.0.0 milestone on Dec 16, 2023
@giladgd linked an issue on Dec 16, 2023 that may be closed by this pull request
@giladgd linked an issue on Jan 12, 2024 that may be closed by this pull request
@giladgd mentioned this pull request on Mar 16, 2024
@Madd0g commented on Apr 17, 2024

Is there a code snippet that shows how to correctly use batching? I'm doing repetitive things in a loop and wondering how I might take advantage of this.

@giladgd (Contributor, Author) commented on Apr 19, 2024

@Madd0g There will be a better example in the documentation when version 3 leaves beta soon, but for now, here's a simple example:

```js
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});

// a context with 2 sequences, so two evaluations can be in flight at once
const context = await model.createContext({
    sequences: 2
});

const sequence1 = context.getSequence();
const sequence2 = context.getSequence();

// each chat session gets its own sequence
const session1 = new LlamaChatSession({
    contextSequence: sequence1
});
const session2 = new LlamaChatSession({
    contextSequence: sequence2
});

const q1 = "Hi there, how are you?";
const q2 = "How much is 6+6?";

// prompting both sessions concurrently lets the context batch
// the evaluations of both sequences together
const [
    a1,
    a2
] = await Promise.all([
    session1.prompt(q1),
    session2.prompt(q2)
]);

console.log("User: " + q1);
console.log("AI: " + a1);

console.log("User: " + q2);
console.log("AI: " + a2);
```

The batching is done automatically across sequences of the same context.
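
For the "repetitive things in a loop" case, here is a minimal sketch (not from this PR) that spreads a list of prompts across a fixed pool of sequences. The prompt list and model path are placeholders, and reusing a sequence for a new session after disposing the previous one is an assumption about the v3 API:

```js
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// placeholder inputs: any list of independent prompts
const prompts = [
    "Summarize the first document...",
    "Summarize the second document...",
    "Summarize the third document...",
    "Summarize the fourth document..."
];
const parallelism = 2;

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});

// a fixed pool of sequences; prompts evaluated at the same time
// on different sequences are batched together automatically
const context = await model.createContext({sequences: parallelism});
const sequences = Array.from({length: parallelism}, () => context.getSequence());

const answers = [];
for (let i = 0; i < prompts.length; i += parallelism) {
    const chunk = prompts.slice(i, i + parallelism);
    const chunkAnswers = await Promise.all(
        chunk.map((prompt, j) => {
            // a fresh session per prompt keeps the chats independent;
            // assumption: a sequence can back a new session once the
            // previous session using it is disposed
            const session = new LlamaChatSession({contextSequence: sequences[j]});
            return session.prompt(prompt)
                .finally(() => session.dispose());
        })
    );
    answers.push(...chunkAnswers);
}

console.log(answers);
```

Within each chunk the evaluations are in flight concurrently, so they are batched together; reusing the pool across chunks keeps memory usage flat regardless of how many prompts go through the loop.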

github-actions bot commented on Sep 24, 2024

🎉 This PR is included in version 3.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Successfully merging this pull request may close the issue "Could not find a KV slot".