diff --git a/docs/core_docs/docs/concepts.mdx b/docs/core_docs/docs/concepts.mdx
index 56078f972c73..5a310bb33189 100644
--- a/docs/core_docs/docs/concepts.mdx
+++ b/docs/core_docs/docs/concepts.mdx
@@ -144,7 +144,7 @@ Chat models support the assignment of distinct roles to conversation messages, h
 Although the underlying models are messages in, message out, the LangChain wrappers also allow these models to take a string as input. This gives them the same interface as LLMs (and makes them simpler to use).
 
-When a string is passed in as input, it will be converted to a HumanMessage under the hood before being passed to the underlying model.
+When a string is passed in as input, it will be converted to a `HumanMessage` under the hood before being passed to the underlying model.
 
 LangChain does not host any Chat Models, rather we rely on third party integrations.
 
@@ -751,7 +751,105 @@ You can roughly think of it as an iterator over callback events (though the form
 See [this guide](/docs/how_to/streaming/#using-stream-events) for more detailed information on how to use `.streamEvents()`.
 
-### Function/tool calling
+### Structured output
+
+LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide
+range of inputs, but for some use-cases, it can be useful to constrain the LLM's output
+to a specific format or structure. This is referred to as **structured output**.
+
+For example, if the output is to be stored in a relational database,
+it is much easier if the model generates output that adheres to a defined schema or format.
+[Extracting specific information](/docs/tutorials/extraction/) from unstructured text is another
+case where this is particularly useful. Most commonly, the output format will be JSON,
+though other formats such as [XML](/docs/how_to/output_parser_xml/) can be useful too. Below, we'll discuss
+a few ways to get structured output from models in LangChain.
+
+#### `.withStructuredOutput()`
+
+For convenience, some LangChain chat models support a `.withStructuredOutput()` method.
+This method only requires a schema as input, and returns an object matching the requested schema.
+Generally, this method is only present on models that support one of the more advanced methods described below,
+and will use one of them under the hood. It takes care of importing a suitable output parser and
+formatting the schema correctly for the model.
+
+For more information, check out this [how-to guide](/docs/how_to/structured_output/#the-.withstructuredoutput-method).
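+
+Here's a minimal sketch of how this can look. The zod schema and the choice of `ChatOpenAI` below are illustrative, not required by the method:
+
+```ts
+import { z } from "zod";
+import { ChatOpenAI } from "@langchain/openai";
+
+const model = new ChatOpenAI({ model: "gpt-4o" });
+
+// An illustrative schema; any zod object (or JSON schema) works here.
+const joke = z.object({
+  setup: z.string().describe("The setup of the joke"),
+  punchline: z.string().describe("The punchline of the joke"),
+});
+
+// Returns a new runnable whose output is parsed into the schema's shape.
+const structuredModel = model.withStructuredOutput(joke);
+
+await structuredModel.invoke("Tell me a joke about cats");
+// { setup: "...", punchline: "..." }
+```
+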
+#### Raw prompting
+
+The most intuitive way to get a model to structure output is to ask nicely.
+In addition to your query, you can give instructions describing what kind of output you'd like, then
+parse the output using an [output parser](/docs/concepts/#output-parsers) to convert the raw
+model message or string output into something more easily manipulated (see the sketch at the end of this section).
+
+The biggest benefit to raw prompting is its flexibility:
+
+- Raw prompting does not require any special model features, only sufficient reasoning capability to understand
+  the passed schema.
+- You can prompt for any format you'd like, not just JSON. This can be useful if the model you
+  are using is more heavily trained on a certain type of data, such as XML or YAML.
+
+However, there are some drawbacks too:
+
+- LLMs are non-deterministic, and prompting an LLM to consistently output data in exactly the correct format
+  for smooth parsing can be surprisingly difficult and model-specific.
+- Individual models have quirks depending on the data they were trained on, and optimizing prompts can be quite difficult.
+  Some may be better at interpreting [JSON schema](https://json-schema.org/), others may be best with TypeScript definitions,
+  and still others may prefer XML.
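+
+As a rough sketch, raw prompting plus an output parser can look like the following. The `StructuredOutputParser` helper and the schema here are one possible approach (assuming a zod-based setup), not the only one:
+
+```ts
+import { StructuredOutputParser } from "langchain/output_parsers";
+import { ChatPromptTemplate } from "@langchain/core/prompts";
+import { ChatOpenAI } from "@langchain/openai";
+import { z } from "zod";
+
+// The parser generates format instructions to embed in the prompt and
+// parses the raw model output back into a typed object.
+const parser = StructuredOutputParser.fromZodSchema(
+  z.object({
+    answer: z.string().describe("answer to the user's question"),
+    source: z.string().describe("source used to answer the question"),
+  })
+);
+
+const prompt = ChatPromptTemplate.fromTemplate(
+  `Answer the user's question as best you can.
+{format_instructions}
+{question}`
+);
+
+const model = new ChatOpenAI({ model: "gpt-4o" });
+
+const chain = prompt.pipe(model).pipe(parser);
+
+await chain.invoke({
+  question: "What is the capital of France?",
+  format_instructions: parser.getFormatInstructions(),
+});
+// { answer: "Paris", source: "..." }
+```
+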
+While we'll next go over some ways that you can take advantage of features offered by
+model providers to increase reliability, prompting techniques remain important for tuning your
+results no matter what method you choose.
+
+#### JSON mode
+
+Some models, such as [Mistral](/docs/integrations/chat/mistral/), [OpenAI](/docs/integrations/chat/openai/),
+[Together AI](/docs/integrations/chat/togetherai/) and [Ollama](/docs/integrations/chat/ollama/),
+support a feature called **JSON mode**, usually enabled via config.
+
+When enabled, JSON mode will constrain the model's output to always be some sort of valid JSON.
+It often requires some custom prompting, but that's usually much less burdensome than a full schema description, often just along the lines of
+`"you must always return JSON"`, and the [output is easier to parse](/docs/how_to/output_parser_json/).
+
+It's also generally simpler and more commonly available than tool calling.
+
+Here's an example:
+
+```ts
+import { JsonOutputParser } from "@langchain/core/output_parsers";
+import { ChatPromptTemplate } from "@langchain/core/prompts";
+import { ChatOpenAI } from "@langchain/openai";
+
+const model = new ChatOpenAI({
+  model: "gpt-4o",
+  // Enables the provider's JSON mode
+  modelKwargs: {
+    response_format: { type: "json_object" },
+  },
+});
+
+const TEMPLATE = `Answer the user's question to the best of your ability.
+You must always output a JSON object with an "answer" key and a "followup_question" key.
+
+{question}`;
+
+const prompt = ChatPromptTemplate.fromTemplate(TEMPLATE);
+
+const chain = prompt.pipe(model).pipe(new JsonOutputParser());
+
+await chain.invoke({ question: "What is the powerhouse of the cell?" });
+```
+
+```
+{
+  answer: "The powerhouse of the cell is the mitochondrion.",
+  followup_question: "Would you like to learn more about the functions of mitochondria?"
+}
+```
+
+For a full list of model providers that support JSON mode, see [this table](/docs/integrations/chat/).
+
+#### Function/tool calling
 
 :::info
 We use the term tool calling interchangeably with function calling. Although
 
@@ -769,8 +867,10 @@ from unstructured text, you could give the model an "extraction" tool that takes
 parameters matching the desired schema, then treat the generated output as your
 final result.
 
-A tool call includes a name, arguments dict, and an optional identifier. The
-arguments dict is structured `{argument_name: argument_value}`.
+For models that support it, tool calling can be very convenient. It removes the
+guesswork around how best to describe schemas in prompts in favor of a built-in model feature. It can also
+more naturally support agentic flows, since you can just pass multiple tool schemas instead
+of fiddling with enums or unions.
 
 Many LLM providers, including [Anthropic](https://www.anthropic.com/),
 [Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai),
 
@@ -787,14 +887,16 @@ LangChain provides a standardized interface for tool calling that is consistent
 The standard interface consists of:
 
-- `ChatModel.bindTools()`: a method for specifying which tools are available for a model to call.
+- `ChatModel.bindTools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools).
 - `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model.
 
-There are two main use cases for function/tool calling:
+The following how-to guides (and the short sketch below) are good practical resources for using function/tool calling:
 
 - [How to return structured data from an LLM](/docs/how_to/structured_output/)
 - [How to use a model to call tools](/docs/how_to/tool_calling/)
 
+For a full list of model providers that support tool calling, [see this table](/docs/integrations/chat/).
+
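+As a minimal sketch, binding a tool and reading back the requested calls can look like the following. The `multiply` tool is illustrative, and the property names follow recent versions of `@langchain/core`:
+
+```ts
+import { DynamicStructuredTool } from "@langchain/core/tools";
+import { ChatOpenAI } from "@langchain/openai";
+import { z } from "zod";
+
+// An illustrative tool; any LangChain tool with a schema works here.
+const multiplyTool = new DynamicStructuredTool({
+  name: "multiply",
+  description: "Multiply two numbers together",
+  schema: z.object({
+    a: z.number().describe("the first number"),
+    b: z.number().describe("the second number"),
+  }),
+  func: async ({ a, b }) => (a * b).toString(),
+});
+
+const model = new ChatOpenAI({ model: "gpt-4o" });
+const modelWithTools = model.bindTools([multiplyTool]);
+
+const response = await modelWithTools.invoke("What is 3 * 12?");
+
+// The calls the model requested are exposed on the returned AIMessage:
+console.log(response.tool_calls);
+// [ { name: "multiply", args: { a: 3, b: 12 }, id: "..." } ]
+```
+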
 ### Retrieval
 
 LangChain provides several advanced retrieval types. A full list is below, along with the following information:
diff --git a/docs/core_docs/docs/how_to/structured_output.ipynb b/docs/core_docs/docs/how_to/structured_output.ipynb
index 56dfa24f315f..2173bb92ecfa 100644
--- a/docs/core_docs/docs/how_to/structured_output.ipynb
+++ b/docs/core_docs/docs/how_to/structured_output.ipynb
@@ -16,6 +16,9 @@
    "metadata": {},
    "source": [
     "# How to return structured data from a model\n",
+    "```{=mdx}\n",
+    "\n",
+    "```\n",
     "\n",
     "It is often useful to have a model return output that matches some specific schema. One common use-case is extracting data from arbitrary text to insert into a traditional database or use with some other downstream system. This guide will show you a few different strategies you can use to do this.\n",
diff --git a/docs/core_docs/docs/integrations/chat/index.mdx b/docs/core_docs/docs/integrations/chat/index.mdx
index c73f748e0bd0..8daf811e1dab 100644
--- a/docs/core_docs/docs/integrations/chat/index.mdx
+++ b/docs/core_docs/docs/integrations/chat/index.mdx
@@ -1,6 +1,7 @@
 ---
 sidebar_position: 1
 sidebar_class_name: hidden
+hide_table_of_contents: true
 ---
 
 # Chat models
@@ -11,36 +12,33 @@ All ChatModels implement the Runnable interface, which comes with default implem
 - _Streaming_ support defaults to returning an `AsyncIterator` of a single value, the final result returned by the underlying ChatModel provider. This obviously doesn't give you token-by-token streaming, which requires native support from the ChatModel provider, but ensures your code that expects an iterator of tokens can work for any of our ChatModel integrations.
 - _Batch_ support defaults to calling the underlying ChatModel in parallel for each input. The concurrency can be controlled with the `maxConcurrency` key in `RunnableConfig` (see the sketch after the table below).
-- _Map_ support defaults to calling `.invoke` across all instances of the array which it was called on.
 
 Each ChatModel integration can optionally provide native implementations to truly enable invoke, streaming or batching requests.
 
 Additionally, some chat models support additional ways of guaranteeing structure in their outputs by allowing you to pass in a defined schema.
-[Function calling and parallel function calling](/docs/how_to/tool_calling) (tool calling) are two common ones, and those capabilities allow you to use the chat model as the LLM in certain types of agents.
+[Tool calling](/docs/how_to/tool_calling) is one such capability, and allows you to use the chat model as the LLM in certain types of agents.
 Some models in LangChain have also implemented a `withStructuredOutput()` method that unifies many of these different ways of constraining output to a schema.
 
 The table shows, for each integration, which features have been implemented with native support. A yellow circle (🟡) indicates partial support - for example, if the model supports tool calling but not tool messages for agents.
 
-| Model                   | Invoke | Stream | Batch | Function Calling | Tool Calling                | `withStructuredOutput()` |
-| :---------------------- | :----: | :----: | :---: | :--------------: | :-------------------------: | :----------------------: |
-| BedrockChat             | ✅     | ✅     | ✅    | ❌               | 🟡 (Bedrock Anthropic only) | ❌                       |
-| ChatAlibabaTongyi       | ✅     | ❌     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatAnthropic           | ✅     | ✅     | ✅    | ❌               | ✅                          | ✅                       |
-| ChatBaiduWenxin         | ✅     | ❌     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatCloudflareWorkersAI | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatCohere              | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatFireworks           | ✅     | ✅     | ✅    | ✅               | ✅                          | ❌                       |
-| ChatGoogleGenerativeAI  | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatGoogleVertexAI      | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatVertexAI            | ✅     | ✅     | ✅    | ❌               | ✅                          | ✅                       |
-| ChatGooglePaLM          | ✅     | ❌     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatGroq                | ✅     | ✅     | ✅    | ❌               | 🟡                          | ✅                       |
-| ChatLlamaCpp            | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatMinimax             | ✅     | ❌     | ✅    | ✅               | ❌                          | ❌                       |
-| ChatMistralAI           | ✅     | ❌     | ✅    | ❌               | ✅                          | ✅                       |
-| ChatOllama              | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatOpenAI              | ✅     | ✅     | ✅    | ✅               | ✅                          | ✅                       |
-| ChatTencentHunyuan      | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatTogetherAI          | ✅     | ✅     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatYandexGPT           | ✅     | ❌     | ✅    | ❌               | ❌                          | ❌                       |
-| ChatZhipuAI             | ✅     | ❌     | ✅    | ❌               | ❌                          | ❌                       |
+| Model                   | Stream | JSON mode | [Tool Calling](/docs/how_to/tool_calling/) | [`withStructuredOutput()`](/docs/how_to/structured_output/#the-.withstructuredoutput-method) | [Multimodal](/docs/how_to/multimodal_inputs/) |
+| :---------------------- | :----: | :-------: | :----------------------------------------: | :-------------------------------------------------------------------------------------------: | :-------------------------------------------: |
+| BedrockChat             | ✅     | ❌        | 🟡 (Bedrock Anthropic only)                | ❌                                                                                             | ❌                                            |
+| ChatAlibabaTongyi       | ❌     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatAnthropic           | ✅     | ❌        | ✅                                         | ✅                                                                                             | ✅                                            |
+| ChatBaiduWenxin         | ❌     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatCloudflareWorkersAI | ✅     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatCohere              | ✅     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatFireworks           | ✅     | ✅        | ✅                                         | ❌                                                                                             | ❌                                            |
+| ChatGoogleGenerativeAI  | ✅     | ❌        | ❌                                         | ❌                                                                                             | ✅                                            |
+| ChatVertexAI            | ✅     | ❌        | ✅                                         | ✅                                                                                             | ✅                                            |
+| ChatGroq                | ✅     | ✅        | 🟡                                         | ✅                                                                                             | ❌                                            |
+| ChatLlamaCpp            | ✅     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatMinimax             | ❌     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatMistralAI           | ❌     | ✅        | ✅                                         | ✅                                                                                             | ❌                                            |
+| ChatOllama              | ✅     | ✅        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatOpenAI              | ✅     | ✅        | ✅                                         | ✅                                                                                             | ✅                                            |
+| ChatTencentHunyuan      | ✅     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatTogetherAI          | ✅     | ✅        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatYandexGPT           | ❌     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
+| ChatZhipuAI             | ❌     | ❌        | ❌                                         | ❌                                                                                             | ❌                                            |
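+
+As a closing note on the default batch behavior mentioned above, here's a minimal sketch (the model and inputs are illustrative):
+
+```ts
+import { ChatOpenAI } from "@langchain/openai";
+
+const model = new ChatOpenAI({ model: "gpt-4o" });
+
+// .batch() calls the model once per input, in parallel.
+// maxConcurrency caps how many requests are in flight at a time.
+const results = await model.batch(
+  ["Hello!", "How are you?", "Tell me a fact about otters."],
+  { maxConcurrency: 2 }
+);
+```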