docs[patch]: Adds structured output section to concepts #5750

Merged · Jun 13, 2024 · 6 commits

114 changes: 108 additions & 6 deletions docs/core_docs/docs/concepts.mdx
@@ -144,7 +144,7 @@ Chat models support the assignment of distinct roles to conversation messages, h

Although the underlying models are messages in, message out, the LangChain wrappers also allow these models to take a string as input.
This gives them the same interface as LLMs (and makes them simpler to use).
When a string is passed in as input, it will be converted to a HumanMessage under the hood before being passed to the underlying model.
When a string is passed in as input, it will be converted to a `HumanMessage` under the hood before being passed to the underlying model.
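
For illustration, here's a minimal sketch of that equivalence (the `ChatOpenAI` model is an assumption; any chat model behaves the same way):

```ts
import { HumanMessage } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4o" });

// Passing a raw string...
await model.invoke("What is the powerhouse of the cell?");

// ...is equivalent to passing a single HumanMessage.
await model.invoke([
  new HumanMessage("What is the powerhouse of the cell?"),
]);
```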

LangChain does not host any Chat Models; rather, we rely on third-party integrations.

@@ -751,7 +751,105 @@ You can roughly think of it as an iterator over callback events (though the form

See [this guide](/docs/how_to/streaming/#using-stream-events) for more detailed information on how to use `.streamEvents()`.

### Function/tool calling
### Structured output

LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide
range of inputs, but for some use-cases, it can be useful to constrain the LLM's output
to a specific format or structure. This is referred to as **structured output**.

For example, if the output is to be stored in a relational database,
it is much easier if the model generates output that adheres to a defined schema or format.
[Extracting specific information](/docs/tutorials/extraction/) from unstructured text is another
case where this is particularly useful. Most commonly, the output format will be JSON,
though other formats such as [XML](/docs/how_to/output_parser_xml/) can be useful too. Below, we'll discuss
a few ways to get structured output from models in LangChain.

#### `.withStructuredOutput()`

For convenience, some LangChain chat models support a `.withStructuredOutput()` method.
This method only requires a schema as input, and returns an object matching the requested schema.
Generally, this method is only present on models that support one of the more advanced methods described below,
and will use one of them under the hood. It takes care of importing a suitable output parser and
formatting the schema in the right format for the model.
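
Here's a minimal sketch of what this looks like (assuming `ChatOpenAI` and a simple [Zod](https://zod.dev/) schema; any model that implements the method works the same way):

```ts
import { z } from "zod";
import { ChatOpenAI } from "@langchain/openai";

// A hypothetical schema for illustration.
const answerSchema = z.object({
  answer: z.string().describe("The answer to the user's question"),
  followup_question: z.string().describe("A relevant followup question"),
});

const model = new ChatOpenAI({ model: "gpt-4o" });

// Returns a new runnable that yields objects matching the schema
// instead of raw AIMessages.
const structuredModel = model.withStructuredOutput(answerSchema);

await structuredModel.invoke("What is the powerhouse of the cell?");
// { answer: "...", followup_question: "..." }
```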

For more information, check out this [how-to guide](/docs/how_to/structured_output/#the-.withstructuredoutput-method).

#### Raw prompting

The most intuitive way to get a model to structure its output is to ask nicely.
In addition to your query, you can give instructions describing what kind of output you'd like, then
parse the output using an [output parser](/docs/concepts/#output-parsers) to convert the raw
model message or string output into something more easily manipulated.

The biggest benefit to raw prompting is its flexibility:

- Raw prompting does not require any special model features, only sufficient reasoning capability to understand
the passed schema.
- You can prompt for any format you'd like, not just JSON. This can be useful if the model you
are using is more heavily trained on a certain type of data, such as XML or YAML.

However, there are some drawbacks too:

- LLMs are non-deterministic, and prompting an LLM to consistently output data in exactly the correct format
for smooth parsing can be surprisingly difficult and model-specific.
- Individual models have quirks depending on the data they were trained on, and optimizing prompts can be quite difficult.
Some may be better at interpreting [JSON schema](https://json-schema.org/), others may be best with TypeScript definitions,
and still others may prefer XML.

While we'll next go over some ways that you can take advantage of features offered by
model providers to increase reliability, prompting techniques remain important for tuning your
results no matter what method you choose.
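
As a rough sketch of raw prompting (the prompt wording and `ChatOpenAI` model are illustrative assumptions; any chat model can be substituted):

```ts
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4o" });

const prompt = ChatPromptTemplate.fromTemplate(
  `Answer the user's question. Respond ONLY with a JSON object containing an "answer" key and nothing else.

{question}`
);

// No special model features are involved: the format lives entirely in the
// prompt, and the raw string still needs parsing (which may fail if the
// model strays from the requested format).
const chain = prompt.pipe(model).pipe(new StringOutputParser());

const raw = await chain.invoke({
  question: "What is the powerhouse of the cell?",
});
const parsed = JSON.parse(raw);
```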

#### JSON mode

<span data-heading-keywords="json mode"></span>

Some models, such as [Mistral](/docs/integrations/chat/mistral/), [OpenAI](/docs/integrations/chat/openai/),
[Together AI](/docs/integrations/chat/togetherai/) and [Ollama](/docs/integrations/chat/ollama/),
support a feature called **JSON mode**, usually enabled via config.

When enabled, JSON mode will constrain the model's output to always be some sort of valid JSON.
These modes often require some custom prompting, but it's usually much less burdensome, along the lines of
`"you must always return JSON"`, and the [output is easier to parse](/docs/how_to/output_parser_json/).

It's also generally simpler and more commonly available than tool calling.

Here's an example:

```ts
import { JsonOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o",
  modelKwargs: {
    response_format: { type: "json_object" },
  },
});

const TEMPLATE = `Answer the user's question to the best of your ability.
You must always output a JSON object with an "answer" key and a "followup_question" key.

{question}`;

const prompt = ChatPromptTemplate.fromTemplate(TEMPLATE);

const chain = prompt.pipe(model).pipe(new JsonOutputParser());

await chain.invoke({ question: "What is the powerhouse of the cell?" });
```

```
{
answer: "The powerhouse of the cell is the mitochondrion.",
followup_question: "Would you like to learn more about the functions of mitochondria?"
}
```

For a full list of model providers that support JSON mode, see [this table](/docs/integrations/chat/).

#### Function/tool calling

:::info
We use the term tool calling interchangeably with function calling. Although
@@ -769,8 +867,10 @@ from unstructured text, you could give the model an "extraction" tool that takes
parameters matching the desired schema, then treat the generated output as your final
result.

A tool call includes a name, arguments dict, and an optional identifier. The
arguments dict is structured `{argument_name: argument_value}`.
For models that support it, tool calling can be very convenient. It removes the
guesswork around how best to prompt schemas in favor of a built-in model feature. It can also
more naturally support agentic flows, since you can just pass multiple tool schemas instead
of fiddling with enums or unions.

Many LLM providers, including [Anthropic](https://www.anthropic.com/),
[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai),
@@ -787,14 +887,16 @@ LangChain provides a standardized interface for tool calling that is consistent

The standard interface consists of:

- `ChatModel.bindTools()`: a method for specifying which tools are available for a model to call.
- `ChatModel.bindTools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools).
- `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model.

There are two main use cases for function/tool calling:
The following how-to guides are good practical resources for using function/tool calling:

- [How to return structured data from an LLM](/docs/how_to/structured_output/)
- [How to use a model to call tools](/docs/how_to/tool_calling/)

For a full list of model providers that support tool calling, [see this table](/docs/integrations/chat/).
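
As a minimal sketch of this interface (the `multiply` tool and `ChatOpenAI` model are hypothetical choices for illustration):

```ts
import { z } from "zod";
import { DynamicStructuredTool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";

// A hypothetical tool the model may choose to call.
const multiply = new DynamicStructuredTool({
  name: "multiply",
  description: "Multiply two numbers.",
  schema: z.object({
    a: z.number(),
    b: z.number(),
  }),
  func: async ({ a, b }) => `${a * b}`,
});

const model = new ChatOpenAI({ model: "gpt-4o" });

// Make the tool's schema available to the model.
const modelWithTools = model.bindTools([multiply]);

const response = await modelWithTools.invoke("What is 6 times 7?");

// The tool calls requested by the model, e.g.
// [{ name: "multiply", args: { a: 6, b: 7 }, id: "..." }]
console.log(response.tool_calls);
```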

### Retrieval

LangChain provides several advanced retrieval types. A full list is below, along with the following information:
3 changes: 3 additions & 0 deletions docs/core_docs/docs/how_to/structured_output.ipynb
@@ -16,6 +16,9 @@
"metadata": {},
"source": [
"# How to return structured data from a model\n",
"```{=mdx}\n",
"<span data-heading-keywords=\"with_structured_output\"></span>\n",
"```\n",
"\n",
"It is often useful to have a model return output that matches some specific schema. One common use-case is extracting data from arbitrary text to insert into a traditional database or use with some other downstrem system. This guide will show you a few different strategies you can use to do this.\n",
"\n",
48 changes: 23 additions & 25 deletions docs/core_docs/docs/integrations/chat/index.mdx
@@ -1,6 +1,7 @@
---
sidebar_position: 1
sidebar_class_name: hidden
hide_table_of_contents: true
---

# Chat models
@@ -11,36 +12,33 @@ All ChatModels implement the Runnable interface, which comes with default implem

- _Streaming_ support defaults to returning an `AsyncIterator` of a single value, the final result returned by the underlying ChatModel provider. This obviously doesn't give you token-by-token streaming, which requires native support from the ChatModel provider, but ensures your code that expects an iterator of tokens can work for any of our ChatModel integrations.
- _Batch_ support defaults to calling the underlying ChatModel in parallel for each input. The concurrency can be controlled with the `maxConcurrency` key in `RunnableConfig`.
- _Map_ support defaults to calling `.invoke` across all instances of the array on which it was called.

Each ChatModel integration can optionally provide native implementations of invoke, streaming, or batching requests.

Some chat models also support additional ways of guaranteeing structure in their outputs by allowing you to pass in a defined schema.
[Function calling and parallel function calling](/docs/how_to/tool_calling) (tool calling) are two common ones, and those capabilities allow you to use the chat model as the LLM in certain types of agents.
[Tool calling](/docs/how_to/tool_calling) is one such capability, and allows you to use the chat model as the LLM in certain types of agents.
Some models in LangChain have also implemented a `withStructuredOutput()` method that unifies many of these different ways of constraining output to a schema.

The table shows, for each integration, which features have been implemented with native support. Yellow circles (🟡) indicate partial support; for example, the model may support tool calling but not tool messages for agents.

| Model | Invoke | Stream | Batch | Function Calling | Tool Calling | `withStructuredOutput()` |
| :---------------------- | :----: | :----: | :---: | :--------------: | :-------------------------: | :----------------------: |
| BedrockChat | ✅ | ✅ | ✅ | ❌ | 🟡 (Bedrock Anthropic only) | ❌ |
| ChatAlibabaTongyi | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| ChatAnthropic | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| ChatBaiduWenxin | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| ChatCloudflareWorkersAI | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatCohere | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatFireworks | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| ChatGoogleGenerativeAI | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatGoogleVertexAI | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatVertexAI | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| ChatGooglePaLM | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| ChatGroq | ✅ | ✅ | ✅ | ❌ | 🟡 | ✅ |
| ChatLlamaCpp | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatMinimax | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ |
| ChatMistralAI | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| ChatOllama | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatOpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ChatTencentHunyuan | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatTogetherAI | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatYandexGPT | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| ChatZhipuAI | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Model | Stream | JSON mode | [Tool Calling](/docs/how_to/tool_calling/) | [`withStructuredOutput()`](/docs/how_to/structured_output/#the-.withstructuredoutput-method) | [Multimodal](/docs/how_to/multimodal_inputs/) |
| :---------------------- | :----: | :-------: | :----------------------------------------: | :------------------------------------------------------------------------------------------: | :-------------------------------------------: |
| BedrockChat | ✅ | ❌ | 🟡 (Bedrock Anthropic only) | ❌ | ❌ |
| ChatAlibabaTongyi | ❌ | ❌ | ❌ | ❌ | ❌ |
| ChatAnthropic | ✅ | ❌ | ✅ | ✅ | ✅ |
| ChatBaiduWenxin | ❌ | ❌ | ❌ | ❌ | ❌ |
| ChatCloudflareWorkersAI | ✅ | ❌ | ❌ | ❌ | ❌ |
| ChatCohere | ✅ | ❌ | ❌ | ❌ | ❌ |
| ChatFireworks | ✅ | ✅ | ✅ | ❌ | ❌ |
| ChatGoogleGenerativeAI | ✅ | ❌ | ❌ | ❌ | ✅ |
| ChatVertexAI | ✅ | ❌ | ✅ | ✅ | ✅ |
| ChatGroq | ✅ | ✅ | 🟡 | ✅ | ❌ |
| ChatLlamaCpp | ✅ | ❌ | ❌ | ❌ | ❌ |
| ChatMinimax | ❌ | ❌ | ❌ | ❌ | ❌ |
| ChatMistralAI | ❌ | ✅ | ✅ | ✅ | ❌ |
| ChatOllama | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatOpenAI | ✅ | ✅ | ✅ | ✅ | ✅ |
| ChatTencentHunyuan | ✅ | ❌ | ❌ | ❌ | ❌ |
| ChatTogetherAI | ✅ | ✅ | ❌ | ❌ | ❌ |
| ChatYandexGPT | ❌ | ❌ | ❌ | ❌ | ❌ |
| ChatZhipuAI | ❌ | ❌ | ❌ | ❌ | ❌ |