
Human-in-the-loop #117

Open · wants to merge 10 commits into main

Conversation

@tgm185z commented Dec 5, 2024

First version adding human-in-the-loop support, for the /stream endpoint only.
I've created a new class, InterruptMessage, because it seems appropriate to treat an interruption as a completely different entity from a chat message, although it might be possible to integrate it as a custom type.
It will probably need some cleanup.
Open to improvements and suggestions.


# Handle interruptions
snapshot = await agent.aget_state(kwargs["config"])
if snapshot.next:
@JoshuaC215 (Owner):

Can you add a comment explaining what this check does and why?
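For reference, the check could be explained along these lines. This is a hypothetical minimal model of the LangGraph behaviour, not the repo's code: `snapshot.next` holds the names of graph nodes that are scheduled but have not yet run, so a non-empty value means the run was paused at an interrupt point and is waiting for human input.

```python
from dataclasses import dataclass


@dataclass
class StateSnapshot:
    """Stand-in for LangGraph's state snapshot."""
    values: dict
    next: tuple = ()  # pending nodes; empty when the run completed


def is_interrupted(snapshot: StateSnapshot) -> bool:
    # Non-empty `next` means the graph stopped before finishing,
    # i.e. it is paused at an interrupt point awaiting a human.
    return bool(snapshot.next)


finished = StateSnapshot(values={"messages": []})
paused = StateSnapshot(values={"messages": []}, next=("tools",))
print(is_interrupted(finished))  # False: run completed normally
print(is_interrupted(paused))    # True: paused before the "tools" node
```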

@@ -94,6 +110,49 @@ def pretty_repr(self) -> str:
def pretty_print(self) -> None:
print(self.pretty_repr()) # noqa: T201

class InterruptMessage(BaseModel):
@JoshuaC215 (Owner):

Does this actually need to be a whole new class? Why not just add a new interrupt type to ChatMessage? It already has most of the relevant fields, and the custom_data attribute could hold anything interrupt-specific.
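A sketch of that suggestion, assuming ChatMessage is a Pydantic model as elsewhere in the repo. The field names here are illustrative, not the repo's actual schema:

```python
from typing import Any, Literal

from pydantic import BaseModel, Field


class ChatMessage(BaseModel):
    # "interrupt" added as a message type instead of a separate class.
    type: Literal["human", "ai", "tool", "interrupt"] = "ai"
    content: str = ""
    tool_calls: list[dict[str, Any]] = Field(default_factory=list)
    # Interrupt-specific fields ride along in custom_data.
    custom_data: dict[str, Any] = Field(default_factory=dict)


msg = ChatMessage(
    type="interrupt",
    content="Approve this tool call?",
    custom_data={"tool_call_id": "call_abc123"},
)
print(msg.type)  # interrupt
```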

if snapshot.next:
try:
ai_message = langchain_to_chat_message(snapshot.values["messages"][-1])
ichat_message = interrupt_from_call_tool(call_tool=ai_message.tool_calls[0])
@JoshuaC215 (Owner):

An AIMessage can include multiple tool calls. This would drop the additional ones. That seems bad.
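A hedged sketch of the fix: build one interrupt per tool call rather than indexing `tool_calls[0]`. The stand-in classes below are minimal stubs for illustration, not the PR's actual types:

```python
from dataclasses import dataclass


@dataclass
class InterruptMessage:  # stand-in for the PR's InterruptMessage
    tool_call_id: str
    name: str


def interrupt_from_call_tool(call_tool: dict) -> InterruptMessage:
    # Stand-in for the PR's helper.
    return InterruptMessage(tool_call_id=call_tool["id"], name=call_tool["name"])


@dataclass
class AIMessage:  # minimal stand-in carrying multiple tool calls
    tool_calls: list


def interrupts_from_ai_message(msg: AIMessage) -> list[InterruptMessage]:
    # One interrupt per tool call, instead of msg.tool_calls[0] only,
    # which silently drops every call after the first.
    return [interrupt_from_call_tool(call_tool=tc) for tc in msg.tool_calls]


msg = AIMessage(tool_calls=[
    {"id": "call_1", "name": "search"},
    {"id": "call_2", "name": "calculator"},
])
print(len(interrupts_from_ai_message(msg)))  # 2
```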

Comment on lines +33 to +42
tool_call_id: str | None = Field(
description="Tool call that this message is responding to after an interruption.",
default=None,
examples=["call_Jja7J89XsjrOLA5r!MEOW!SL"],
)
run_id: str | None = Field(
description="Run ID of the message to continue after interruption.",
default=None,
examples=["847c6285-8fc9-4560-a83f-4e6285809254"],
)
@JoshuaC215 (Owner):

Why does the client need to store and send back the tool_call_id(s) and run_id ? Couldn't it just be tracked in the server (or maybe the checkpointer already does it for us?) for the relevant thread_id instead of returning it, to keep the interface simpler?

except Exception as e:
yield f"data: {json.dumps({'type': 'error', 'content': f'Error parsing interrupt message: {e}'})}\n\n"

yield f"data: {json.dumps({'type': 'interrupt', 'content': ichat_message.model_dump()})}\n\n"
@JoshuaC215 (Owner):

This only handles the /stream endpoint. What happens if I call /invoke on an interrupting agent?
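One way /invoke could mirror the /stream behaviour, assuming the same pending-nodes check: return an interrupt payload instead of a final answer when the run paused. The response shape here is illustrative, not an agreed interface:

```python
def invoke_response(final_message: dict, pending_nodes: tuple) -> dict:
    """Shape the /invoke response depending on whether the run paused."""
    if pending_nodes:
        # Run stopped at an interrupt point: ask the client to collect
        # human input instead of presenting this as the final answer.
        return {"type": "interrupt", "content": final_message}
    return {"type": "message", "content": final_message}


print(invoke_response({"text": "ok to run tool?"}, ("tools",))["type"])  # interrupt
print(invoke_response({"text": "done"}, ())["type"])                     # message
```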



def interrupt_from_call_tool(call_tool: dict) -> InterruptMessage:
# Crear instancia de InterruptMessage a partir del diccionario call_tool
@JoshuaC215 (Owner):

Would you mind putting code comments in english to be consistent with the rest of the repo?

Comment on lines 301 to +304
        if msg.tool_calls:
-           # Create a status container for each tool call and store the
-           # status container by ID to ensure results are mapped to the
-           # correct status container.
-           call_results = {}
-           for tool_call in msg.tool_calls:
-               status = st.status(
-                   f"""Tool Call: {tool_call["name"]}""",
-                   state="running" if is_new else "complete",
+           status = st.status(
+               f"""Tool Call: {msg.tool_calls[0]["name"]}""",
+               state="running",
@JoshuaC215 (Owner):

I suspect this will break the normal, non-interrupt use case, no?

"config": RunnableConfig(
configurable={"thread_id": thread_id, "model": user_input.model}, run_id=run_id
),
}
@JoshuaC215 (Owner):

You'll want to set up the pre-commit hooks and run the linter and tests locally. Otherwise the CI will keep failing and won't be able to merge this. See the contributing instructions in the README.

@tgm185z (Author) commented Dec 10, 2024

I'm going to start working on the problems you mention. I'll post here in case I need to expand on any issues.

@JoshuaC215 (Owner):

Great! I just merged in some significant refactors so you’ll probably want to rebase your branch as well. I don’t think it really conflicts much with what you have here. I’m not planning any other big refactors or anything for a while.

@soysal commented Dec 27, 2024

An optimal human-in-the-loop solution should handle both structured and unstructured human input. User confirmation is the prime practical use case, but a more general framework should be able to solicit input through various modalities: multiple-choice options (radio buttons, checkboxes, or checkbox lists) as well as free-text responses.
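That generalization could be sketched as a single request schema covering confirmations, multiple-choice, and free-text input. The field names and InputKind values below are illustrative, not an agreed design:

```python
from dataclasses import dataclass, field
from enum import Enum


class InputKind(str, Enum):
    CONFIRM = "confirm"      # yes/no approval
    RADIO = "radio"          # pick exactly one option
    CHECKBOX = "checkbox"    # pick any number of options
    FREE_TEXT = "free_text"  # unconstrained text reply


@dataclass
class HumanInputRequest:
    prompt: str
    kind: InputKind = InputKind.CONFIRM
    options: list[str] = field(default_factory=list)  # for radio/checkbox


req = HumanInputRequest(
    prompt="Which sources may the agent search?",
    kind=InputKind.CHECKBOX,
    options=["web", "internal docs", "email"],
)
print(req.kind.value)  # checkbox
```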
