Engines

Currently the built-in inference engines are node-llama-cpp, gpt4all and transformers-js (highly experimental). Install the corresponding peer dependency before using an engine.

node-llama-cpp

Can be used for text-completion and embedding tasks. See the node-llama-cpp docs for more information.

Find available GGUF models on huggingface.co.

gpt4all

Can be used for text-completion and embedding tasks. You can find parameter docs here.

You can find available models here.

transformers-js

Currently supporting speech-to-text and image-to-text tasks. See tests.

node-stable-diffusion-cpp

WIP. See tests.

Custom Engines

You can also write your own engine implementation. See ./src/engines for how the built-in engines are implemented and here for examples of how to use custom engines to combine models and add multimodality to your chat completion endpoint (or to any other consumer of the ModelServer class). Multiple ModelServers are allowed and can be nested to create more complex pipelines.
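To illustrate the idea of combining models behind a single engine, here is a minimal, self-contained sketch. Everything in it is hypothetical: `SketchEngine`, `FakeTranscriber`, `FakeCompleter`, `ChainedEngine` and `demo` are names made up for this example and are not the library's actual engine interface, which is defined in ./src/engines and may look quite different. The fake engines stand in for real models so the snippet runs without any peer dependency.

```typescript
// Hypothetical engine shape, for illustration only. The real interface
// lives in ./src/engines and may differ.
interface SketchEngine<In, Out> {
  // Prepare whatever the engine needs before it can run tasks.
  load(): Promise<void>;
  // Run a single task and resolve with its result.
  run(input: In): Promise<Out>;
}

// Stand-in for a speech-to-text engine: it only reports the sample count.
class FakeTranscriber implements SketchEngine<number[], string> {
  private ready = false;

  async load(): Promise<void> {
    this.ready = true;
  }

  async run(samples: number[]): Promise<string> {
    if (!this.ready) throw new Error("FakeTranscriber is not loaded");
    return `${samples.length} samples`;
  }
}

// Stand-in for a text-completion engine: it echoes the prompt.
class FakeCompleter implements SketchEngine<string, string> {
  private ready = false;

  async load(): Promise<void> {
    this.ready = true;
  }

  async run(prompt: string): Promise<string> {
    if (!this.ready) throw new Error("FakeCompleter is not loaded");
    return `echo: ${prompt}`;
  }
}

// A custom engine that nests two engines and feeds the output of the
// first into the second, so consumers see one engine with one input type.
class ChainedEngine<A, B, C> implements SketchEngine<A, C> {
  private first: SketchEngine<A, B>;
  private second: SketchEngine<B, C>;

  constructor(first: SketchEngine<A, B>, second: SketchEngine<B, C>) {
    this.first = first;
    this.second = second;
  }

  async load(): Promise<void> {
    // Both inner engines are independent, so they can load concurrently.
    await Promise.all([this.first.load(), this.second.load()]);
  }

  async run(input: A): Promise<C> {
    const intermediate = await this.first.run(input);
    return this.second.run(intermediate);
  }
}

// Usage: "audio" goes in, a completion based on the transcript comes out.
async function demo(): Promise<string> {
  const engine = new ChainedEngine(new FakeTranscriber(), new FakeCompleter());
  await engine.load();
  return engine.run([0.1, 0.2, 0.3]);
}
```

Because `ChainedEngine` itself satisfies the same hypothetical interface, chains can be nested inside other chains, which mirrors the nesting idea described above. In a real custom engine the two inner steps would delegate to actual models rather than fakes.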
