
Agent streaming - stream final output and steps #3398

Closed
david1542 opened this issue Nov 24, 2023 · 7 comments
Labels
auto:improvement · auto:question

Comments

david1542 commented Nov 24, 2023

Hi everyone. I have two questions related to agents and streaming:

  1. Stream steps
    I'm trying to listen to my agent's actions in real time and notify the user about them in my UI. I noticed that a PR with agent streaming capabilities was merged to main in the last couple of days, which allows consumers to receive the agent's actions in real time.

This is my agent's code (imports included for completeness; getConfigValue, tools, and PREFIX are defined elsewhere in my codebase):

import { AgentExecutor } from "langchain/agents";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { BufferMemory } from "langchain/memory";
import { ChatPromptTemplate, MessagesPlaceholder } from "langchain/prompts";
import { RunnableSequence } from "langchain/schema/runnable";
import { formatToOpenAITool } from "langchain/tools/convert_to_openai";
import { formatToOpenAIToolMessages } from "langchain/agents/format_scratchpad/openai_tools";
import {
  OpenAIToolsAgentOutputParser,
  type ToolsAgentStep,
} from "langchain/agents/openai/output_parser";

const memory = new BufferMemory({
  returnMessages: true,
  memoryKey: "history",
});
const model = new ChatOpenAI({
  modelName: "gpt-3.5-turbo-1106",
  temperature: 0,
  openAIApiKey: getConfigValue("openaiToken"),
  verbose: true,
}).bind({ tools: tools.map(formatToOpenAITool) });

const prompt = ChatPromptTemplate.fromMessages([
  ["ai", PREFIX],
  new MessagesPlaceholder("history"),
  ["human", "{input}"],
  new MessagesPlaceholder("agent_scratchpad"),
]);

const agent = RunnableSequence.from([
  {
    input: (i: { input: string; steps: ToolsAgentStep[] }) => i.input,
    agent_scratchpad: (i: { input: string; steps: ToolsAgentStep[] }) =>
      formatToOpenAIToolMessages(i.steps),
    history: () => memory.chatHistory.getMessages(),
  },
  prompt,
  model,
  new OpenAIToolsAgentOutputParser(),
]).withConfig({ runName: "OpenAIToolsAgent" });

const executor = AgentExecutor.fromAgentAndTools({
  agent,
  tools,
  memory,
});

And this is how I use it (userMessage is the user's input string; VSCodeCallbacks is my own callback handler):

const response = await executor.stream(
  { input: userMessage },
  {
    callbacks: [new VSCodeCallbacks()],
  }
);

let output;
for await (const chunk of response) {
  console.log(`Streamed chunk: ${JSON.stringify(chunk)}`);
  output = chunk.output;
}

However, when I look at the logs, the only time I see the console.log statement fire is when the agent finishes and produces the final answer. The intermediate steps (tool calls) are not given to me. This is the only log statement I see:

Streamed chunk: {"output":"some chatgpt response"}

I was wondering how I can achieve this kind of functionality. I know I can attach a callback handler to the call and listen to handleAgentAction, but I was wondering if there is a neater way of doing it.
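For reference, a minimal sketch of that callback-based approach, passing an inline handler object per call (handleAgentAction and handleToolEnd are standard LangChainJS callback hooks; userMessage is a placeholder for the user's input):

// Sketch: inline callback handlers surface intermediate steps while streaming.
const response = await executor.stream(
  { input: userMessage },
  {
    callbacks: [
      {
        // Fires once per intermediate step, before the tool executes.
        handleAgentAction(action) {
          console.log(`Calling tool ${action.tool} with ${JSON.stringify(action.toolInput)}`);
        },
        // Fires when a tool returns its result.
        handleToolEnd(output) {
          console.log(`Tool returned: ${output}`);
        },
      },
    ],
  }
);

for await (const chunk of response) {
  // The final output still arrives here; the handlers above fire during the run.
}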

  2. Stream final output
    Is there a way to stream the final output in chunks? Sometimes the final output is long, and I'd like to show it to the user chunk by chunk, as with the native OpenAI SDK. I noticed there is a solution in the Python version. Is there one in the JS version as well?
dosubot added the auto:improvement and auto:question labels on Nov 24, 2023

dosubot bot commented Nov 24, 2023

🤖

Hi @david1542, it's nice to see you again! Hope you're doing well.

Based on the context provided, it seems like you're trying to stream both the intermediate steps of your agent's actions and the final output in real time.

For the first part of your question, the stream method of the chain object in LangChainJS can be used to stream the intermediate steps of your agent's actions. It returns an asynchronous iterable that you can loop over with a for await...of loop; each iteration yields a chunk of the output, which you can log to the console or process however you see fit. Here is a code snippet that demonstrates this:

const stream = await chain.stream({
  foo: "bears",
});

// Stream a diff as JSON patch operations
for await (const chunk of stream) {
  console.log(chunk);
}

You can find more details in the examples/src/prompts/json_structured_output_parser_streaming.ts file in the LangChainJS repository.

For the second part of your question, you can use the stream method of the chain2 object to stream the final output in chunks. It likewise returns an asynchronous iterable, with each iteration yielding a chunk of the final output. Here is a code snippet that demonstrates this:

const stream2 = await chain2.stream({
  foo: "beets",
});

// Stream the entire aggregated JSON object
for await (const chunk of stream2) {
  console.log(chunk);
}

In both cases, the stream method is called with an object containing the prompt parameters; the foo property fills the corresponding variable in the prompt that is sent to the language model.

Regarding the stream method on the AgentExecutor class, it's not directly implemented in the provided context. However, there is a streamIterator method in the AgentExecutorIterator class, which is related to AgentExecutor: it is an asynchronous generator that handles iteration of the agent execution process. You can find more details in the langchain/src/agents/executor.ts file in the LangChainJS repository.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.



Falven commented Dec 12, 2023

Yes, it seems like OpenAI tool-calling agent streaming is broken when using LCEL right now. Using handleLLMNewToken to log tokens also does not work. While not ideal, initializeAgentExecutorWithOptions seems to work...
Maybe related?
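For reference, a minimal sketch of the older helper-based setup Falven describes, assuming the same tools array as above and the openai-functions agent type (both assumptions, not details from this thread):

import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { ChatOpenAI } from "langchain/chat_models/openai";

// Sketch: the pre-LCEL helper builds the agent and executor in one call.
const model = new ChatOpenAI({ modelName: "gpt-3.5-turbo-1106", temperature: 0 });
const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "openai-functions",
  verbose: true,
});

// Intermediate steps can then be observed via per-call callbacks.
const result = await executor.call(
  { input: "what is the weather in SF?" },
  {
    callbacks: [
      {
        handleAgentAction(action) {
          console.log(`Step: ${action.tool}`);
        },
      },
    ],
  }
);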

jacoblee93 (Collaborator) commented:

Hey, apologies for missing this! Will have a look as soon as I can (hopefully within the next few days)!

danny-avila commented:

> Hey, apologies for missing this! Will have a look as soon as I can (hopefully within the next few days)!

Thanks so much!


kevsjh commented Jan 30, 2024

I noticed this as well: agent streaming does not seem to work. But streamLog works OK:

const stream = await agentExecutor.streamLog({
  input: "what is the weather in SF and then LA",
  chatHistory: previousMessages,
});

for await (const chunk of stream) {
  console.log(JSON.stringify(chunk, null, 2));
  console.log("------");
}

bracesproul (Member) commented:

Hey @david1542, sorry for taking so long on this. Unfortunately, you can't use the .stream method to stream back chunks of the final response. This is because, for agents, we can't tell what the final response actually is until the entire response has come in and we can parse it. However, we do have a section in the docs that tries to account for this using .streamLog:

https://js.langchain.com/docs/modules/agents/how_to/streaming#:~:text=With%20some%20creative%20parsing%2C%20this%20can%20be%20useful%20for%20e.g.%20streaming%20back%20just%20the%20final%20response%20from%20the%20agent%3A
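A minimal sketch of the "creative parsing" that docs section describes might look like the following; the path filter on streamed_output_str reflects the RunLogPatch shape at the time and should be treated as an assumption to verify against your own log output:

// Sketch: reduce streamLog's JSON-patch ops to just the LLM's streamed tokens.
// Each chunk is a RunLogPatch whose `ops` array follows the JSON Patch spec;
// token additions appear under paths ending in "/streamed_output_str/-".
const stream = await agentExecutor.streamLog({
  input: "what is the weather in SF and then LA",
});

for await (const chunk of stream) {
  for (const op of chunk.ops ?? []) {
    if (op.op === "add" && op.path.endsWith("/streamed_output_str/-")) {
      process.stdout.write(op.value); // a single token from the underlying model
    }
  }
}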

jacoblee93 (Collaborator) commented:

Closing for now; we are working on a new, easier-to-use API for streaming these internal agent steps.
