
Commit 6c8c1e8

Merge branch 'fresh-human-tool-new' into human-intervention

matiasmolinas committed Dec 18, 2024
2 parents: 831aea6 + bb55941
Showing 13 changed files with 68 additions and 89 deletions.
1 change: 1 addition & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -19,6 +19,7 @@ Closes: #
<!-- For completed items, change [ ] to [x]. -->

- [ ] I have read the [contributor guide](https://github.com/i-am-bee/bee-agent-framework/blob/main/CONTRIBUTING.md)
+- [ ] I have [signed off](https://github.com/i-am-bee/bee-agent-framework/blob/main/CONTRIBUTING.md#developer-certificate-of-origin-dco) on my commit
- [ ] Linting passes: `yarn lint` or `yarn lint:fix`
- [ ] Formatting is applied: `yarn format` or `yarn format:fix`
- [ ] Unit tests pass: `yarn test:unit`
2 changes: 1 addition & 1 deletion examples/agents/experimental/human.ts
@@ -99,4 +99,4 @@ try {
} finally {
// Gracefully close the reader when exiting the app
reader.close();
-}
+}
14 changes: 7 additions & 7 deletions examples/agents/granite/README.md
@@ -2,23 +2,23 @@

The [IBM Granite](https://www.ibm.com/granite) family of models can be used as the underlying LLM within Bee Agents. Granite™ is IBM's family of open, performant, and trusted AI models tailored for business and optimized to scale your AI applications.

-This guide and the associated examples will help you get started with creating Bee Agents using Granite.
+This guide and the associated examples will help you get started with creating Bee Agents using Granite 3.1.

## 📦 Prerequisites

### LLM Services

-IBM Granite is supported by [watsonx.ai](https://www.ibm.com/products/watsonx-ai) and [Ollama](https://ollama.com/). Watsonx.ai will allow you to run models in the cloud. Ollama will allow you to download and run models locally.
+IBM Granite 3.1 is supported by [watsonx.ai](https://www.ibm.com/products/watsonx-ai) and [Ollama](https://ollama.com/). Watsonx.ai will allow you to run models in the cloud. Ollama will allow you to download and run models locally.

> [!TIP]
> Better performance will be achieved by using larger Granite models.
> [!NOTE]
-> If you work for IBM there are additional options to run IBM Granite models with VLLM or RITS.
+> If you work for IBM there are additional options to run IBM Granite 3.1 models with VLLM or RITS.
#### Ollama

-There are guides available for running Granite with Ollama on [Linux](https://www.ibm.com/granite/docs/run/granite-on-linux/granite/), [Mac](https://www.ibm.com/granite/docs/run/granite-on-mac/granite/) or [Windows](https://www.ibm.com/granite/docs/run/granite-on-windows/granite/).
+There are guides available for running Granite 3.1 with Ollama on [Linux](https://www.ibm.com/granite/docs/run/granite-on-linux/granite/), [Mac](https://www.ibm.com/granite/docs/run/granite-on-mac/granite/) or [Windows](https://www.ibm.com/granite/docs/run/granite-on-windows/granite/).

#### Watsonx

@@ -88,18 +88,18 @@ In this example the wikipedia tool interface is extended so that the agent can s

This example uses Ollama exclusively.

-To get started you will need to pull `granite3-dense:8b` and `nomic-embed-text` (to perform text embedding). If you are unfamiliar with using Ollama then check out instructions for getting up and running at the [Ollama Github repo](https://github.com/ollama/ollama).
+To get started you will need to pull `granite3.1-dense:8b` and `nomic-embed-text` (to perform text embedding). If you are unfamiliar with using Ollama then check out instructions for getting up and running at the [Ollama Github repo](https://github.com/ollama/ollama).

```shell
-ollama pull granite3-dense:8b
+ollama pull granite3.1-dense:8b
ollama pull nomic-embed-text
ollama serve
```

Run the [granite_wiki_bee](/examples/agents/granite/granite_wiki_bee.ts) agent:

```shell
-yarn run start examples/agents/granite/granite_wiki_bee.ts <<< "Who were the authors of the paper 'Attention is all you need' and how many citations does it have?"
+yarn run start examples/agents/granite/granite_wiki_bee.ts <<< "Who were the authors of the research paper 'Attention is all you need', how many citations does it have?"
```

You will see the agent reasoning, calling the WikipediaTool and producing a final answer similar to the following:
6 changes: 3 additions & 3 deletions examples/agents/granite/granite_bee.ts
@@ -29,7 +29,7 @@ function getChatLLM(provider?: Provider): ChatLLM<ChatLLMOutput> {
const LLMFactories: Record<Provider, () => ChatLLM<ChatLLMOutput>> = {
[Providers.OLLAMA]: () =>
new OllamaChatLLM({
modelId: getEnv("OLLAMA_MODEL") || "granite3-dense:8b",
modelId: getEnv("OLLAMA_MODEL") || "granite3.1-dense:8b",
parameters: {
temperature: 0,
repeat_penalty: 1,
@@ -45,7 +45,7 @@ function getChatLLM(provider?: Provider): ChatLLM<ChatLLMOutput> {
projectId: getEnv("WATSONX_PROJECT_ID"),
region: getEnv("WATSONX_REGION"),
}),
-[Providers.IBMVLLM]: () => IBMVllmChatLLM.fromPreset(IBMVllmModel.GRANITE_3_0_8B_INSTRUCT),
+[Providers.IBMVLLM]: () => IBMVllmChatLLM.fromPreset(IBMVllmModel.GRANITE_3_1_8B_INSTRUCT),
[Providers.IBMRITS]: () =>
new OpenAIChatLLM({
client: new OpenAI({
@@ -55,7 +55,7 @@ function getChatLLM(provider?: Provider): ChatLLM<ChatLLMOutput> {
RITS_API_KEY: process.env.IBM_RITS_API_KEY,
},
}),
modelId: getEnv("IBM_RITS_MODEL") || "ibm-granite/granite-3.0-8b-instruct",
modelId: getEnv("IBM_RITS_MODEL") || "ibm-granite/granite-3.1-8b-instruct",
parameters: {
temperature: 0,
max_tokens: 2048,
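For orientation, a minimal sketch of how the factory map above is consumed (illustrative only; `getChatLLM`, `getEnv`, `Provider`, and `Providers` come from the example file, while the `LLM_BACKEND` variable name is an assumption):

```ts
// Hypothetical wiring: resolve a provider from the environment, fall back to
// Ollama, and let the chosen preset supply the Granite 3.1 default model.
const provider = (getEnv("LLM_BACKEND") as Provider) ?? Providers.OLLAMA;
const llm = getChatLLM(provider);
```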
7 changes: 3 additions & 4 deletions examples/agents/granite/granite_wiki_bee.ts
@@ -76,18 +76,17 @@ function wikipediaRetrivalTool(passageSize: number, overlap: number, maxResults:

// Agent LLM
const llm = new OllamaChatLLM({
modelId: "granite3-dense:8b",
modelId: "granite3.1-dense:8b",
parameters: {
temperature: 0,
-num_ctx: 4096,
-num_predict: 512,
+num_predict: 2048,
},
});

const agent = new BeeAgent({
llm,
memory: new TokenMemory({ llm }),
-tools: [wikipediaRetrivalTool(200, 50, 3)],
+tools: [wikipediaRetrivalTool(400, 50, 3)],
});

const reader = createConsoleReader();
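The retuning above is about retrieval granularity: passages double to 400 units with a 50-unit overlap, and the generation budget (`num_predict`) rises to 2048 so the model can use what it retrieves. For intuition, overlapped chunking works roughly like this (an illustrative sketch, not the framework's actual splitter; sizes here are in words):

```ts
// Illustrative only: split text into overlapping passages, mirroring the
// wikipediaRetrivalTool(passageSize, overlap, maxResults) parameters above.
function chunkPassages(text: string, passageSize: number, overlap: number): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += passageSize - overlap) {
    chunks.push(words.slice(start, start + passageSize).join(" "));
    if (start + passageSize >= words.length) break;
  }
  return chunks;
}
```

Larger passages keep more context in each embedded chunk, and the overlap keeps sentences that straddle a boundary retrievable from either side.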
2 changes: 1 addition & 1 deletion examples/helpers/io.ts
@@ -84,4 +84,4 @@ export function createConsoleReader({
}
},
};
-}
+}
2 changes: 1 addition & 1 deletion examples/tools/experimental/human.ts
@@ -81,4 +81,4 @@ export class HumanTool extends Tool<StringToolOutput> {

return new StringToolOutput(formattedOutput);
}
-}
+}
22 changes: 2 additions & 20 deletions src/adapters/ibm-vllm/chatPreset.ts
@@ -28,7 +28,6 @@ export const IBMVllmModel = {
LLAMA_3_1_405B_INSTRUCT_FP8: "meta-llama/llama-3-1-405b-instruct-fp8",
LLAMA_3_1_70B_INSTRUCT: "meta-llama/llama-3-1-70b-instruct",
LLAMA_3_1_8B_INSTRUCT: "meta-llama/llama-3-1-8b-instruct",
-GRANITE_3_0_8B_INSTRUCT: "ibm-granite/granite-3-0-8b-instruct",
GRANITE_3_1_8B_INSTRUCT: "ibm-granite/granite-3-1-8b-instruct",
} as const;
export type IBMVllmModel = (typeof IBMVllmModel)[keyof typeof IBMVllmModel];
@@ -119,26 +118,8 @@ export const IBMVllmChatLLMPreset = {
},
};
},
-[IBMVllmModel.GRANITE_3_0_8B_INSTRUCT]: (): IBMVllmChatLLMPreset => {
-const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3Instruct");
-return {
-base: {
-modelId: IBMVllmModel.GRANITE_3_0_8B_INSTRUCT,
-parameters: {
-method: "GREEDY",
-stopping: {
-stop_sequences: [...parameters.stop_sequence],
-include_stop_sequence: false,
-},
-},
-},
-chat: {
-messagesToPrompt: messagesToPrompt(template),
-},
-};
-},
[IBMVllmModel.GRANITE_3_1_8B_INSTRUCT]: (): IBMVllmChatLLMPreset => {
-const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3Instruct");
+const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3.1-Instruct");
return {
base: {
modelId: IBMVllmModel.GRANITE_3_1_8B_INSTRUCT,
@@ -147,6 +128,7 @@
stopping: {
stop_sequences: [...parameters.stop_sequence],
include_stop_sequence: false,
+max_new_tokens: 2048,
},
},
},
14 changes: 7 additions & 7 deletions src/adapters/shared/llmChatTemplates.ts
@@ -116,21 +116,21 @@ const llama3: LLMChatTemplate = {
},
};

-const granite3Instruct: LLMChatTemplate = {
+const granite31Instruct: LLMChatTemplate = {
template: new PromptTemplate({
schema: templateSchemaFactory([
"system",
"user",
"assistant",
"available_tools",
"tools",
"tool_call",
"tool_response",
] as const),
template: `{{#messages}}{{#system}}<|start_of_role|>system<|end_of_role|>
{{system}}<|end_of_text|>
-{{ end }}{{/system}}{{#available_tools}}<|start_of_role|>available_tools<|end_of_role|>
-{{available_tools}}<|end_of_text|>
-{{ end }}{{/available_tools}}{{#user}}<|start_of_role|>user<|end_of_role|>
+{{ end }}{{/system}}{{#tools}}<|start_of_role|>tools<|end_of_role|>
+{{tools}}<|end_of_text|>
+{{ end }}{{/tools}}{{#user}}<|start_of_role|>user<|end_of_role|>
{{user}}<|end_of_text|>
{{ end }}{{/user}}{{#assistant}}<|start_of_role|>assistant<|end_of_role|>
{{assistant}}<|end_of_text|>
@@ -142,7 +142,7 @@ const granite3Instruct: LLMChatTemplate = {
`,
}),
messagesToPrompt: messagesToPromptFactory({
-available_tools: "available_tools",
+tools: "tools",
tool_response: "tool_response",
tool_call: "tool_call",
}),
@@ -156,7 +156,7 @@ export class LLMChatTemplates {
"llama3.3": llama33,
"llama3.1": llama31,
"llama3": llama3,
"granite3Instruct": granite3Instruct,
"granite3.1-Instruct": granite31Instruct,
};

static register(model: string, template: LLMChatTemplate, override = false) {
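To make the renamed roles concrete, here is roughly what the Granite 3.1 template above renders for one system, tools, and user message (hand-expanded for illustration; the tool JSON is a made-up placeholder, and the trailing assistant header is the generation prompt):

```
<|start_of_role|>system<|end_of_role|>
You are a helpful assistant.<|end_of_text|>
<|start_of_role|>tools<|end_of_role|>
[{"name": "Wikipedia", "description": "Search Wikipedia"}]<|end_of_text|>
<|start_of_role|>user<|end_of_role|>
Who wrote 'Attention is all you need'?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>
```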
23 changes: 2 additions & 21 deletions src/adapters/watsonx/chatPreset.ts
@@ -43,12 +43,12 @@ export const WatsonXChatLLMPreset = {
};
},
"ibm/granite-3-8b-instruct": (): WatsonXChatLLMPreset => {
-const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3Instruct");
+const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3.1-Instruct");
return {
base: {
parameters: {
decoding_method: "greedy",
-max_new_tokens: 512,
+max_new_tokens: 2048,
include_stop_sequence: false,
stop_sequences: [...parameters.stop_sequence],
},
@@ -61,25 +61,6 @@
"ibm/granite-3-2b-instruct"() {
return WatsonXChatLLMPreset["ibm/granite-3-8b-instruct"]();
},
"ibm/granite-3-1-8b-instruct": (): WatsonXChatLLMPreset => {
const { template, parameters, messagesToPrompt } = LLMChatTemplates.get("granite3Instruct");
return {
base: {
parameters: {
decoding_method: "greedy",
max_new_tokens: 512,
include_stop_sequence: false,
stop_sequences: [...parameters.stop_sequence],
},
},
chat: {
messagesToPrompt: messagesToPrompt(template),
},
};
},
"ibm/granite-3-1-2b-instruct"() {
return WatsonXChatLLMPreset["ibm/granite-3-8b-instruct"]();
},
"meta-llama/llama-3-1-70b-instruct": (): WatsonXChatLLMPreset => {
const { template, messagesToPrompt, parameters } = LLMChatTemplates.get("llama3.1");

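For context, a preset such as `ibm/granite-3-8b-instruct` is consumed through the watsonx adapter. A hedged sketch (the `fromPreset` call mirrors the vLLM usage in granite_bee.ts earlier in this commit; the import path and credential wiring here are assumptions):

```ts
import { WatsonXChatLLM } from "bee-agent-framework/adapters/watsonx/chat";

// Sketch: greedy decoding and the new 2048-token budget come from the preset.
const llm = WatsonXChatLLM.fromPreset("ibm/granite-3-8b-instruct", {
  apiKey: process.env.WATSONX_API_KEY,
  projectId: process.env.WATSONX_PROJECT_ID,
});
```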
32 changes: 17 additions & 15 deletions src/agents/bee/runners/granite/prompts.ts
@@ -45,26 +45,26 @@ export const GraniteBeeSystemPrompt = BeeSystemPrompt.fork((config) => ({
}).format(date);
},
},
-template: `# Setting
-You are an AI assistant.
+template: `You are an AI assistant.
When the user sends a message figure out a solution and provide a final answer.
{{#tools.length}}
-You have access to a set of available tools that can be used to retrieve information and perform actions.
+You have access to a set of tools that can be used to retrieve information and perform actions.
Pay close attention to the tool description to determine if a tool is useful in a particular context.
{{/tools.length}}
-# Communication structure:
-- Line starting 'Message: ' The user's question or instruction. This is provided by the user, the assistant does not produce this.
-- Line starting 'Thought: ' The assistant's response always starts with a thought, this is free text where the assistant thinks about the user's message and describes in detail what it should do next.
+# Communication structure
+You communicate only in instruction lines. Valid instruction lines are 'Thought' followed by 'Tool Name' and then 'Tool Input', or 'Thought' followed by 'Final Answer'
+Line starting 'Thought: ' The assistant's response always starts with a thought, this is a single line where the assistant thinks about the user's message and describes in detail what it should do next.
{{#tools.length}}
-- In a 'Thought', the assistant should determine if a Tool Call is necessary to get more information or perform an action, or if the available information is sufficient to provide the Final Answer.
-- If a tool needs to be called and is available, the assistant will produce a tool call:
-- Line starting 'Tool Name: ' name of the tool that you want to use.
-- Line starting 'Tool Input: ' JSON formatted tool arguments adhering to the selected tool parameters schema i.e. {"arg1":"value1", "arg2":"value2"}.
-- Line starting 'Thought: ', followed by free text where the assistant thinks about the all the information it has available, and what it should do next (e.g. try the same tool with a different input, try a different tool, or proceed with answering the original user question).
+In a 'Thought: ', the assistant should determine if a Tool Call is necessary to get more information or perform an action, or if the available information is sufficient to provide the Final Answer.
+If a tool needs to be called and is available, the assistant will produce a tool call:
+Line starting 'Tool Name: ' name of the tool that you want to use.
+Line starting 'Tool Input: ' JSON formatted tool arguments adhering to the selected tool parameters schema i.e. {"arg1":"value1", "arg2":"value2"}.
+After a 'Tool Input: ' the next message will contain a tool response. The next output should be a 'Thought: ' where the assistant thinks about all the information it has available, and what it should do next (e.g. try the same tool with a different input, try a different tool, or proceed with answering the original user question).
{{/tools.length}}
-- Once enough information is available to provide the Final Answer, the last line in the message needs to be:
-- Line starting 'Final Answer: ' followed by a answer to the original message.
+Once enough information is available to provide the Final Answer, the last line in the message needs to be:
+Line starting 'Final Answer: ' followed by a concise and clear answer to the original message.
# Best practices
- Use markdown syntax for formatting code snippets, links, JSON, tables, images, files.
@@ -81,8 +81,10 @@ The current date and time is: {{formatDate}}
You do not need a tool to get the current Date and Time. Use the information available here.
{{/tools.length}}
{{#instructions}}
-{{instructions}}
+# Additional instructions
+{{.}}
{{/instructions}}
`,
}));

@@ -94,7 +96,7 @@ You communicate only in instruction lines. Valid instruction lines are 'Thought'

export const GraniteBeeUserPrompt = BeeUserPrompt.fork((config) => ({
...config,
-template: `Message: {{input}}`,
+template: `{{input}}`,
}));

export const GraniteBeeToolNotFoundPrompt = BeeToolNotFoundPrompt.fork((config) => ({
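To see what the reworded protocol asks for, a hypothetical exchange (illustrative, not part of the commit; the tool response message arrives between the two thoughts) would look like:

```
Thought: The user wants the authors of a paper; the Wikipedia tool can retrieve this.
Tool Name: Wikipedia
Tool Input: {"query": "Attention Is All You Need"}
Thought: The retrieved passage lists the authors, so I can answer directly.
Final Answer: The paper was written by Vaswani et al.
```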
2 changes: 1 addition & 1 deletion src/agents/bee/runners/granite/runner.ts
@@ -71,7 +71,7 @@ export class GraniteRunner extends DefaultRunner {
const index = memory.messages.findIndex((msg) => msg.role === Role.SYSTEM) + 1;
await memory.add(
BaseMessage.of({
role: "available_tools",
role: "tools",
text: JSON.stringify(
(await this.renderers.system.variables.tools()).map((tool) => ({
name: tool.name,
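After this rename, the message the runner injects right after the system prompt has roughly the following shape (a sketch; the field values are made up, and the object layout follows the `map` in the snippet above):

```ts
// Illustrative shape of the injected tools message.
const toolsMessage = BaseMessage.of({
  role: "tools",
  text: JSON.stringify([
    { name: "Wikipedia", description: "Search for pages and passages." },
  ]),
});
```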
30 changes: 22 additions & 8 deletions src/agents/experimental/streamlit/prompts.ts
@@ -39,6 +39,17 @@ You are Bee App Builder, a friendly and creative assistant designed by IBM to bu
- If the user asks for a "generator", "writer" or similar, assume that they want the app to be LLM powered.
- If you make an error, apologize and explain that you are still learning. Reiterate your commitment to improving.
+## Role
+- You are the Bee App Builder assistant.
+- Bee App Builder is an assistant that builds code autonomously. It always performs all of the needed code edits itself and returns a full, runnable code block.
+- Outside of the knowledge of the Bee App Builder assistant, the code is executed using a Pyodide runtime. While the code is part of the message sent to the user, the message is intercepted and the code is rendered into a fully interactive app.
+- The user is not a programmer, and does not know how to code. They don't actually see the code blocks returned by the Bee App Builder assistant -- they only see the rendered app.
+- You must always refer to the code in a passive sense, and prefer the term "app". Never attribute the code to the user. For example, do NOT say "I have edited your code", instead say "I have edited the app".
+- When there's an error in the code, assume that it is the fault of Bee App Builder, not the user. Apologize for the error and perform a fix.
+- Never ask the user to do any work. Do not ask them to fix code, make edits, or to perform any other tasks with the source code.
+- On the other hand, if there is an error in the app, you should ask the user for details. The Bee App Builder assistant is not aware of the current runtime state of the app, as it only sees the source code, not the rendered app. Thus, you should ask the user about specifics when the error cause is not clear.
---
## Properly embedding code in messages
@@ -74,19 +85,21 @@ If you realize that you have made a mistake or that you can write the app in a b
- The main method is called \`async def main()\`. This is the executed entrypoint. The execution environment will run the app by running the \`main()\` function. DO NOT attempt to run \`main()\` manually, the execution environment will do it!
- For HTTP requests, use \`pyodide.http.pyfetch\`. \`pyodide.http.pyfetch\` is asynchronous and has the same interface as JS \`fetch\`. Example:
\`\`\`
-response = pyodide.http.pyfetch(
-"http://example.com",
-method="POST",
-body=json.dumps({"query": query}),
-headers={"Content-Type": "application/json"},
-)
-json = await response.json()
-# ...
+import streamlit as st
+import pyodide.http
+import json
+
+async def main():
+    response = await pyodide.http.pyfetch(
+        "https://example.com",
+        method="POST",
+        body=json.dumps({"query": query}),
+        headers={"Content-Type": "application/json"},
+    )
+    st.json(await response.json())
\`\`\`
- DO NOT use \`requests\`, \`httpx\` or other HTTP libraries.
- Only call \`fetch\` using the secure \`https:\` schema. Plaintext requests will fail due to security policy.
### User interface
@@ -135,6 +148,7 @@ json = await response.json()
- Do not simultaneously set value of input element using \`st.session_state.<key>\` and directly using \`st.text_input(key="<key>", value="...")\`. This results in an error.
- If you need to clear an input field after submitting, use \`with st.form("form_name", clear_on_submit=True):\` to wrap the input elements. Do not modify \`st.session_state.<key>\` after rendering the element, as that will result in an error.
- When a button that is **not** part of the form modifies \`st.session_state\`, it has to call \`st.rerun()\` afterwards to ensure proper UI refresh.
+- Do not call or await \`main()\` in your code. \`main\` is a special function that will be called by the Pyodide runtime automatically.
---
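Taken together, the form and session-state rules in this prompt imply a pattern like the following sketch (Python, to match the prompt's own Streamlit examples; hedged and not part of the commit):

```python
import streamlit as st

async def main():  # invoked by the Pyodide runtime; never call it manually
    if "messages" not in st.session_state:
        st.session_state.messages = []

    # clear_on_submit resets the field without writing to st.session_state
    # after the widget has rendered, which would raise an error.
    with st.form("chat", clear_on_submit=True):
        text = st.text_input("Your message", key="chat_input")
        submitted = st.form_submit_button("Send")
    if submitted and text:
        st.session_state.messages.append(text)

    for message in st.session_state.messages:
        st.write(message)

    # A button outside the form that mutates session state must trigger a rerun.
    if st.button("Clear history"):
        st.session_state.messages = []
        st.rerun()
```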
