docs: add a tutorial for running model locally in a Python extension (#…
…90)

Co-authored-by: Hu Yueh-Wei <[email protected]>
sunxilin and halajohn authored Oct 10, 2024
1 parent 75ec478 commit 3be172e
Showing 1 changed file with 108 additions and 0 deletions.
108 changes: 108 additions & 0 deletions docs/tutorials/how_to_run_local_model_in_python_extensions.md
@@ -0,0 +1,108 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# How to Run a Local AI Model in Python Extensions

In the TEN framework, extensions can utilize third-party AI services or run AI models locally to improve performance and reduce costs. This tutorial explains how to run a local AI model in a Python extension and how to interact with it within the extension.

## Step 1: Check Hardware Requirements

Before running an AI model locally, ensure that your hardware meets the necessary requirements. Key components to verify include:

- **CPU/GPU**: Check whether the model needs a GPU (and how much GPU memory) or can run acceptably on the CPU.
- **Memory**: Ensure there is enough RAM (or GPU memory) to load the model weights and run inference.

Verify that your system can support the model’s demands to ensure smooth operation.
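
As a rough starting point, the checks above can be scripted. The following is a minimal sketch, not part of the TEN framework: the helper names (`total_memory_gib`, `cuda_available`, `hardware_summary`) are hypothetical, the memory query relies on `os.sysconf` and therefore only works on POSIX systems, and the GPU check assumes PyTorch is the library you will use.

```python
import os
from typing import Optional


def total_memory_gib() -> Optional[float]:
    """Return total physical memory in GiB, or None if it cannot be queried."""
    try:
        # os.sysconf is POSIX-only; it is missing on Windows.
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None
    return page_size * page_count / 1024**3


def cuda_available() -> Optional[bool]:
    """Return whether PyTorch sees a CUDA GPU, or None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


def hardware_summary() -> dict:
    """Collect the values to compare against the model's stated requirements."""
    return {
        "cpu_count": os.cpu_count(),
        "memory_gib": total_memory_gib(),
        "cuda_available": cuda_available(),
    }


if __name__ == "__main__":
    for key, value in hardware_summary().items():
        print(f"{key}: {value}")
```

A value of `None` means the check could not be performed on this system, not that the resource is missing.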

## Step 2: Install Necessary Software and Dependencies

Once your hardware is ready, install the required software and dependencies. Follow these steps:

1. **Operating System**: Ensure compatibility with your model. Most AI frameworks support Windows, macOS, and Linux, though specific versions may be required.
2. **Python Version**: Ensure compatibility with the TEN Python runtime and the model.
3. **Required Libraries**: Install the libraries your model needs, such as:

   - TensorFlow
   - PyTorch
   - NumPy
   - vLLM

   You can list the required dependencies in a `requirements.txt` file for easy installation.

4. **Download the Model**: Obtain the local version of the AI model you plan to run.
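
Before loading the model, it can help to confirm that the libraries are actually installed in the Python environment the extension runs in. The sketch below uses only the standard library. The function name `installed_versions` is hypothetical, and the package names passed in are examples to replace with your own requirements.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Example package list; adjust it to match your requirements.txt.
    for name, ver in installed_versions(["numpy", "torch", "vllm"]).items():
        print(f"{name}: {ver or 'not installed'}")
```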

## Step 3: Implement Your Python Extension

Below is an example of how to implement a basic text generation feature using the `vllm` inference engine in a Python extension.

First, initialize the local model within the extension:

```python
from ten import (
    Extension,
    TenEnv,
    Cmd,
    CmdResult,
)
from vllm import LLM


class TextGenerationExtension(Extension):
    def on_init(self, ten_env: TenEnv) -> None:
        # Load the model once, when the extension is initialized.
        self.llm = LLM(model="<model_path>")
        ten_env.on_init_done()
```

Next, implement the `on_cmd` method to handle text generation based on the provided input:

```python
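from ten import StatusCode  # needed in addition to the imports above


class TextGenerationExtension(Extension):
    ...

    def on_cmd(self, ten_env: TenEnv, cmd: Cmd) -> None:
        # Read the prompt from the incoming command.
        prompt = cmd.get_property_string("prompt")

        # Run inference and take the first completion of the first request.
        outputs = self.llm.generate(prompt)
        generated_text = outputs[0].outputs[0].text

        # Return the generated text as the command result.
        cmd_result = CmdResult.create(StatusCode.OK)
        cmd_result.set_property_string("result", generated_text)
        ten_env.return_result(cmd_result, cmd)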
```

In this code, the `on_cmd` method retrieves the `prompt`, generates text using the model, and returns the generated text as the command result.
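
One way to keep this logic testable without a GPU is to move the generation step into a plain function and exercise it with a stand-in model. The following is a sketch under stated assumptions. `generate_text`, `FakeLLM`, `FakeRequestOutput`, and `FakeCompletion` are hypothetical names that are not part of TEN or vLLM. The fake classes only mimic the `outputs[0].outputs[0].text` shape used above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCompletion:
    text: str


@dataclass
class FakeRequestOutput:
    outputs: List[FakeCompletion]


class FakeLLM:
    """Stand-in model that mimics the result shape: outputs[0].outputs[0].text."""

    def generate(self, prompt: str) -> List[FakeRequestOutput]:
        return [FakeRequestOutput(outputs=[FakeCompletion(text=f"echo: {prompt}")])]


def generate_text(llm, prompt: str) -> str:
    """Run one prompt through the model and return the first completion's text."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    outputs = llm.generate(prompt)
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    print(generate_text(FakeLLM(), "hello"))
```

Inside `on_cmd`, the extension would then call `generate_text(self.llm, prompt)` with the real model, while unit tests pass in the stand-in.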

You can adapt this approach to implement other functionalities such as image recognition or speech-to-text by processing the relevant input types.

## Step 4: Unload the Model

It’s important to unload the model during extension cleanup to free resources:

```python
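import gc
import torch


class TextGenerationExtension(Extension):
    ...

    def on_deinit(self, ten_env: TenEnv) -> None:
        # Drop the reference to the model so it can be garbage-collected.
        del self.llm
        gc.collect()
        # Release cached GPU memory held by PyTorch.
        torch.cuda.empty_cache()
        # Tear down the distributed process group only if one was created.
        if torch.distributed.is_available() and torch.distributed.is_initialized():
            torch.distributed.destroy_process_group()
        print("Successfully deleted the LLM pipeline and freed GPU memory!")
        ten_env.on_deinit_done()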
```

This ensures efficient memory management, especially when working with GPU resources.
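
To check that nothing else is still holding the model alive, a weak reference can be used. The sketch below illustrates the idea with a hypothetical `DummyModel` and `Holder` class in place of the real model and extension. It assumes CPython's reference counting, and it does not measure GPU memory.

```python
import gc
import weakref


class DummyModel:
    """Hypothetical stand-in for a large model object."""


class Holder:
    """Hypothetical stand-in for the extension that owns the model."""

    def __init__(self) -> None:
        self.llm = DummyModel()

    def unload(self) -> None:
        # Guard with hasattr so that calling unload twice does not raise.
        if hasattr(self, "llm"):
            del self.llm
        gc.collect()


holder = Holder()
model_ref = weakref.ref(holder.llm)  # does not keep the model alive

holder.unload()
holder.unload()  # a second call is harmless

# The weak reference returns None once the object has been freed.
model_released = model_ref() is None
print("model released:", model_released)
```

If the weak reference still resolves after unloading, some other object (for example a cached result or a background thread) is still referencing the model.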

## Summary

Running a local model in a TEN Python extension is similar to native Python development. By loading and unloading the model in the appropriate extension lifecycle methods, you can easily integrate local AI models and interact with them effectively.
