feat: resolveModelFile method #351

Merged · 6 commits · Sep 29, 2024

Changes from all commits
1 change: 1 addition & 0 deletions .vitepress/config/apiReferenceSidebar.ts
@@ -10,6 +10,7 @@ const categoryOrder = [

const functionsOrder = [
"getLlama",
"resolveModelFile",
"defineChatSessionFunction",
"createModelDownloader",
"resolveChatWrapper",
8 changes: 7 additions & 1 deletion docs/cli/pull.md
@@ -13,10 +13,16 @@ const commandDoc = docs.pull;
A wrapper around [`ipull`](https://www.npmjs.com/package/ipull)
to download model files as fast as possible with parallel connections and other optimizations.

Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.
Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.

If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.

The supported URI schemes are:
- **HTTP:** `https://`, `http://`
- **Hugging Face:** `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)

Learn more about using model URIs in the [Downloading Models guide](../guide/downloading-models.md#model-uris).
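
For example, pulling a model into a local `models` directory via a Hugging Face URI could look like this (the `user/model/model-file.gguf` path is a placeholder, not a real model):

```shell
npx --no node-llama-cpp pull --dir ./models "hf:user/model/model-file.gguf"
```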

> To programmatically download a model file in your code, use [`createModelDownloader()`](../api/functions/createModelDownloader.md)

## Usage
4 changes: 4 additions & 0 deletions docs/guide/choosing-a-model.md
@@ -164,3 +164,7 @@ npx --no node-llama-cpp pull --dir ./models <model-file-url>
>
> If the model file URL is of a single part of a multi-part model (for example, [this model](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf)),
> it will also download all the other parts into the same directory.

::: tip
Consider using [model URIs](./downloading-models.md#model-uris) to download and load models.
:::
49 changes: 47 additions & 2 deletions docs/guide/downloading-models.md
@@ -69,16 +69,61 @@ This option is recommended for more advanced use cases, such as downloading mode
If you know the exact model URLs you're going to need every time in your project, it's better to download the models
automatically after running `npm install` as described in the [Using the CLI](#cli) section.

## Model URIs {#model-uris}
You can reference models using a URI instead of their full download URL when using the CLI and relevant methods.

When downloading a model from a URI, the model files will be prefixed with a corresponding adaptation of the URI.

To reference a model from Hugging Face, you can use the scheme
<br/>
`hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional).

Here's an example usage of the Hugging Face URI scheme:
```
hf:mradermacher/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
```
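
To pin a specific branch, append the optional `#<branch>` suffix; for instance (again with a placeholder model path):
```
hf:user/model/model-file.gguf#main
```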

When using a URI to reference a model,
it's recommended [to add it to your `package.json` file](#cli) to ensure it's downloaded when running `npm install`,
and to resolve it using the [`resolveModelFile`](../api/functions/resolveModelFile.md) method to get the full path of the resolved model file.

Here's an example usage of the [`resolveModelFile`](../api/functions/resolveModelFile.md) method:
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const modelsDirectory = path.join(__dirname, "models");

const modelPath = await resolveModelFile(
    "hf:user/model/model-file.gguf",
    modelsDirectory
);

const llama = await getLlama();
const model = await llama.loadModel({modelPath});
```

::: tip NOTE
If a corresponding model file is not found in the given directory, the model will automatically be downloaded.

When a file is being downloaded, the download progress is shown in the console by default.
<br/>
Set the [`cli`](../api/type-aliases/ResolveModelFileOptions#cli) option to `false` to disable this behavior.
:::
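
Assuming `resolveModelFile` also accepts a [`ResolveModelFileOptions`](../api/type-aliases/ResolveModelFileOptions) object as its second argument (an assumption based on the linked type; the URI below is a placeholder), disabling that progress output might look like this:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// assumption: the second argument can be an options object
// with `directory` and `cli` fields (see ResolveModelFileOptions)
const modelPath = await resolveModelFile("hf:user/model/model-file.gguf", {
    directory: path.join(__dirname, "models"),
    cli: false // suppress the default download progress output
});
```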

## Downloading Gated Models From Hugging Face {#hf-token}
Some models on Hugging Face are "gated", meaning they require manual consent from you before you can download them.

To download such models, after completing the consent form on the model card, you need to create a [Hugging Face token](https://huggingface.co/docs/hub/en/security-tokens) and set it in one of the following locations:
* Set an environment variable called `HF_TOKEN` to the token
* Set the content of the `~/.cache/huggingface/token` file to the token
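
For example, in a shell (the token value is a placeholder):
```shell
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxx
```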

Now, using the CLI or the [`createModelDownloader`](../api/functions/createModelDownloader.md) method will automatically use the token to download gated models.
Now, using the CLI, the [`createModelDownloader`](../api/functions/createModelDownloader.md) method,
or the [`resolveModelFile`](../api/functions/resolveModelFile.md) method will automatically use the token to download gated models.

Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md).
Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md) or [`resolveModelFile`](../api/functions/resolveModelFile.md).

## Inspecting Remote Models
You can inspect the metadata of a remote model without downloading it by either using the [`inspect gguf` command](../cli/inspect/gguf.md) with a URL,
4 changes: 2 additions & 2 deletions docs/guide/index.md
@@ -51,9 +51,9 @@ npx --no node-llama-cpp inspect gpu
```

## Getting a Model File
We recommend you to get a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or [search HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.
We recommend getting a GGUF model either from [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or by [searching HuggingFace directly](https://huggingface.co/models?library=gguf) for one.

We recommend you to start by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).
We recommend starting with a small model that doesn't have a lot of parameters, just to ensure everything works, so try downloading a `7B`/`8B` parameter model first (search for models with both `7B`/`8B` and `GGUF` in their name).

For improved download speeds, you can use the [`pull`](../cli/pull.md) command to download a model:
```shell
4 changes: 2 additions & 2 deletions scripts/scaffoldElectronExampleForCiBuild.ts
@@ -40,8 +40,8 @@ await scaffoldProjectTemplate({
    directoryPath: resolvedPackageFolderPath,
    parameters: {
        [ProjectTemplateParameter.ProjectName]: projectName,
        [ProjectTemplateParameter.ModelUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelFilename]: "model.gguf",
        [ProjectTemplateParameter.ModelUriOrUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelUriOrFilename]: "model.gguf",
        [ProjectTemplateParameter.CurrentModuleVersion]: packageVersion
    }
});
16 changes: 9 additions & 7 deletions src/chatWrappers/Llama3_1ChatWrapper.ts
@@ -36,13 +36,7 @@ export class Llama3_1ChatWrapper extends ChatWrapper {
    /**
     * @param options
     */
    public constructor({
        cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
        todayDate = () => new Date(),
        noToolInstructions = false,

        _specialTokensTextForPreamble = false
    }: {
    public constructor(options: {
        /**
         * Set to `null` to disable
         *
@@ -64,6 +58,14 @@
    } = {}) {
        super();

        const {
            cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
            todayDate = () => new Date(),
            noToolInstructions = false,

            _specialTokensTextForPreamble = false
        } = options;

        this.cuttingKnowledgeDate = cuttingKnowledgeDate == null
            ? null
            : cuttingKnowledgeDate instanceof Function
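
The refactor above keeps a single typed `options` parameter and moves the destructuring (with defaults) into the constructor body after `super()`. A minimal standalone sketch of the same pattern, with illustrative names rather than the PR's actual types:

```typescript
// Illustrative sketch, not the PR's actual class.
class Base {}

class ExampleWrapper extends Base {
    public readonly noToolInstructions: boolean;

    public constructor(options: {
        /** Illustrative option; defaults to `false` */
        noToolInstructions?: boolean
    } = {}) {
        super();

        // destructure with defaults after super(), keeping the named
        // `options` parameter intact in the public signature
        const {noToolInstructions = false} = options;

        this.noToolInstructions = noToolInstructions;
    }
}
```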
4 changes: 2 additions & 2 deletions src/cli/commands/ChatCommand.ts
@@ -77,9 +77,9 @@ export const ChatCommand: CommandModule<object, ChatCommand> = {

        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the chat. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/CompleteCommand.ts
@@ -57,9 +57,9 @@ export const CompleteCommand: CommandModule<object, CompleteCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the completion. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/InfillCommand.ts
@@ -59,9 +59,9 @@ export const InfillCommand: CommandModule<object, InfillCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the infill. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
59 changes: 40 additions & 19 deletions src/cli/commands/InitCommand.ts
@@ -6,7 +6,6 @@ import logSymbols from "log-symbols";
import validateNpmPackageName from "validate-npm-package-name";
import fs from "fs-extra";
import {consolePromptQuestion} from "../utils/consolePromptQuestion.js";
import {isUrl} from "../../utils/isUrl.js";
import {basicChooseFromListConsoleInteraction} from "../utils/basicChooseFromListConsoleInteraction.js";
import {splitAnsiToLines} from "../utils/splitAnsiToLines.js";
import {arrowChar} from "../../consts.js";
@@ -21,6 +20,7 @@ import {ProjectTemplateOption, projectTemplates} from "../projectTemplates.js";
import {getReadablePath} from "../utils/getReadablePath.js";
import {createModelDownloader} from "../../utils/createModelDownloader.js";
import {withCliCommandDescriptionDocsUrl} from "../utils/withCliCommandDescriptionDocsUrl.js";
import {resolveModelDestination} from "../../utils/resolveModelDestination.js";

type InitCommand = {
name?: string,
@@ -93,7 +93,7 @@ export async function InitCommandHandler({name, template, gpu}: InitCommand) {
        logLevel: LlamaLogLevel.error
    });

    const modelUrl = await interactivelyAskForModel({
    const modelUri = await interactivelyAskForModel({
        llama,
        allowLocalModels: false,
        downloadIntent: false
@@ -113,29 +113,53 @@

    await fs.ensureDir(targetDirectory);

    const modelDownloader = await createModelDownloader({
        modelUrl,
        showCliProgress: false,
        deleteTempFileOnCancel: false
    });
    const modelEntrypointFilename = modelDownloader.entrypointFilename;
    async function resolveModelInfo() {
        const resolvedModelDestination = resolveModelDestination(modelUri);

        if (resolvedModelDestination.type === "uri")
            return {
                modelUriOrUrl: resolvedModelDestination.uri,
                modelUriOrFilename: resolvedModelDestination.uri,
                cancelDownloader: async () => void 0
            };

        if (resolvedModelDestination.type === "file")
            throw new Error("Unexpected file model destination");

        const modelDownloader = await createModelDownloader({
            modelUri: resolvedModelDestination.url,
            showCliProgress: false,
            deleteTempFileOnCancel: false
        });
        const modelEntrypointFilename = modelDownloader.entrypointFilename;

        return {
            modelUriOrUrl: resolvedModelDestination.url,
            modelUriOrFilename: modelEntrypointFilename,
            async cancelDownloader() {
                try {
                    await modelDownloader.cancel();
                } catch (err) {
                    // do nothing
                }
            }
        };
    }

    const {modelUriOrFilename, modelUriOrUrl, cancelDownloader} = await resolveModelInfo();

    await scaffoldProjectTemplate({
        template,
        directoryPath: targetDirectory,
        parameters: {
            [ProjectTemplateParameter.ProjectName]: projectName,
            [ProjectTemplateParameter.ModelUrl]: modelUrl,
            [ProjectTemplateParameter.ModelFilename]: modelEntrypointFilename,
            [ProjectTemplateParameter.ModelUriOrUrl]: modelUriOrUrl,
            [ProjectTemplateParameter.ModelUriOrFilename]: modelUriOrFilename,
            [ProjectTemplateParameter.CurrentModuleVersion]: await getModuleVersion()
        }
    });

    try {
        await modelDownloader.cancel();
    } catch (err) {
        // do nothing
    }
    await cancelDownloader();

    await new Promise((resolve) => setTimeout(resolve, Math.max(0, minScaffoldTime - (Date.now() - startTime))));
});
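
From its usage in this hunk, `resolveModelDestination` appears to return a discriminated union; an inferred sketch of that shape (not copied from the PR) might be:

```typescript
// Inferred from the usage above; the PR's actual type may differ.
type ResolvedModelDestination =
    | {type: "uri", uri: string}
    | {type: "url", url: string}
    | {type: "file", path: string};
```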
@@ -213,10 +237,7 @@ async function askForProjectName(currentDirectory: string) {
            if (item == null)
                return "";

            if (isUrl(item, false))
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            else
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            return logSymbols.success + " Entered project name " + chalk.blue(item);
        }
    });

21 changes: 10 additions & 11 deletions src/cli/commands/PullCommand.ts
@@ -34,13 +34,13 @@ export const PullCommand: CommandModule<object, PullCommand> = {
        return yargs
            .option("urls", {
                type: "string",
                alias: ["url"],
                alias: ["url", "uris", "uri"],
                array: true,
                description: [
                    "A `.gguf` model URL to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.",
                    "A `.gguf` model URI to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.",
                    !isInDocumentationMode && "If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.",
                    "Pass multiple URLs to download multiple models at once."
                    "Pass multiple URIs to download multiple models at once."
                ].filter(Boolean).join(
                    isInDocumentationMode
                        ? "\n"
@@ -104,13 +104,13 @@
        const headers = resolveHeaderFlag(headerArg);

        if (urls.length === 0)
            throw new Error("At least one URL must be provided");
            throw new Error("At least one URI must be provided");
        else if (urls.length > 1 && filename != null)
            throw new Error("The `--filename` flag can only be used when a single URL is passed");
            throw new Error("The `--filename` flag can only be used when a single URI is passed");

        if (urls.length === 1) {
            const downloader = await createModelDownloader({
                modelUrl: urls[0]!,
                modelUri: urls[0]!,
                dirPath: directory,
                headers,
                showCliProgress: !noProgress,
@@ -155,14 +155,13 @@
            console.info(`Downloaded to ${chalk.yellow(getReadablePath(downloader.entrypointFilePath))}`);
        } else {
            const downloader = await combineModelDownloaders(
                urls.map((url) => createModelDownloader({
                    modelUrl: url,
                urls.map((uri) => createModelDownloader({
                    modelUri: uri,
                    dirPath: directory,
                    headers,
                    showCliProgress: false,
                    deleteTempFileOnCancel: noTempFile,
                    skipExisting: !override,
                    fileName: filename || undefined
                    skipExisting: !override
                })),
                {
                    showCliProgress: !noProgress,
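For reference, the renamed programmatic option this PR settles on (`modelUri` in place of `modelUrl`) could be used roughly like this; the URI below is a placeholder, and `download()` returning the model's entrypoint path is based on the existing `createModelDownloader` docs:

```typescript
import path from "path";
import {fileURLToPath} from "url";
import {createModelDownloader} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// `modelUri` accepts an hf: URI or a plain URL (this one is illustrative)
const downloader = await createModelDownloader({
    modelUri: "hf:user/model/model-file.gguf",
    dirPath: path.join(__dirname, "models")
});
const modelPath = await downloader.download();
console.info(`Downloaded to ${modelPath}`);
```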