Zod to JSONSchema conversion doesn't use openai custom logic for openai #6479

airhorns · 2024-08-09T15:16:34Z

Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain.js documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain.js rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

OpenAI's new structured output support requires compiling zod to JSONSchema, which is something that LangChain already knows how to do. However -- OpenAI's code client uses a specialized, forked version of zod-to-json-schema internally to do this conversion in order to improve the compatibility of the generated JSON schemas with what is supported server side in their API. Specifically, they don't use relative references, which LangChain does, because they aren't supported server side. OpenAI's client also handles nullability differently.

You can see their custom converter here: https://github.com/openai/openai-node/blob/cc13af9fa7e76e774d3132bba2427bb0176bd622/src/helpers/zod.ts#L11-L19

and some commits that added the custom logic: openai/openai-node@fc22483 and openai/openai-node@fc22483

As it stands now, LangChain's conversion of tool or structured output schemas doesn't use this custom logic, and breaks in many more cases. If you re-use a chunk of schema within a bigger, outer schema, LangChain triggers an OpenAI 400 error, and the OpenAI client doesn't.

It'd be great if LangChain could be a client of OpenAI's zodResponseFormat or zodFunction helpers instead of doing its own conversion, or something like that.
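To make the repeated-schema failure above concrete, here is a minimal, dependency-free sketch. It assumes the converter emits a JSON-pointer `$ref` (for example `#/properties/home`) the second time a sub-schema appears; the exact pointer format is an assumption, and `inlineRefs` is a hypothetical helper, not the logic OpenAI's fork actually uses. It only illustrates what "no relative references" means for the final schema. Recursive schemas are not handled.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve a JSON pointer such as "#/properties/home" against the root schema.
function resolvePointer(root: Json, pointer: string): Json {
  if (!pointer.startsWith("#")) throw new Error(`Unsupported ref: ${pointer}`);
  const segments = pointer.slice(1).split("/").filter((s) => s.length > 0);
  let node: Json = root;
  for (const raw of segments) {
    // Undo JSON-pointer escaping (~1 is "/", ~0 is "~").
    const key = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (node === null || typeof node !== "object") {
      throw new Error(`Cannot resolve ${pointer}`);
    }
    const next: Json | undefined = Array.isArray(node) ? node[Number(key)] : node[key];
    if (next === undefined) throw new Error(`Cannot resolve ${pointer}`);
    node = next;
  }
  return node;
}

// Return a copy of `node` in which every {"$ref": "..."} is replaced by the
// schema it points to. Hypothetical helper; cyclic references would recurse forever.
function inlineRefs(node: Json, root: Json): Json {
  if (Array.isArray(node)) return node.map((item) => inlineRefs(item, root));
  if (node !== null && typeof node === "object") {
    const ref = node["$ref"];
    if (typeof ref === "string") return inlineRefs(resolvePointer(root, ref), root);
    const out: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) out[key] = inlineRefs(value, root);
    return out;
  }
  return node;
}

// Assumed shape of a schema where the same address object is used twice.
const schemaWithRef: Json = {
  type: "object",
  properties: {
    home: { type: "object", properties: { city: { type: "string" } } },
    work: { $ref: "#/properties/home" },
  },
};

const inlined = inlineRefs(schemaWithRef, schemaWithRef);
console.log(JSON.stringify(inlined, null, 2));
```

After inlining, `work` carries its own copy of the address schema, so nothing in the payload depends on pointer resolution on the server side.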

Error Message and Stack Trace (if applicable)

No response

Description

I'm trying to use LangChain to call a tool with the same inner bit of Zod schema repeated more than once, and it fails with OpenAI's new strict: true parameter when OpenAI's client doesn't.
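As a rough aid for debugging this kind of 400 error, the sketch below walks a JSON schema and reports two constraints that, as far as I understand OpenAI's strict mode, apply to every object: `additionalProperties` must be `false`, and every property must be listed in `required` (optional fields are expressed as nullable instead). `findStrictViolations` is a hypothetical helper written for illustration. It is not part of LangChain or the OpenAI client, and it does not cover every rule the API enforces.

```typescript
type Schema = { [key: string]: unknown };

function isSchema(value: unknown): value is Schema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Collect human-readable problems for the two assumed strict-mode rules.
function findStrictViolations(schema: unknown, path = "#"): string[] {
  if (!isSchema(schema)) return [];
  const problems: string[] = [];
  const props = schema["properties"];
  const properties: Schema = isSchema(props) ? props : {};

  if (schema["type"] === "object") {
    if (schema["additionalProperties"] !== false) {
      problems.push(`${path}: additionalProperties must be false`);
    }
    const req = schema["required"];
    const required: unknown[] = Array.isArray(req) ? req : [];
    for (const key of Object.keys(properties)) {
      if (!required.includes(key)) {
        problems.push(`${path}/properties/${key}: not listed in required`);
      }
    }
  }

  // Recurse into nested object properties, array items, and anyOf branches.
  for (const [key, value] of Object.entries(properties)) {
    problems.push(...findStrictViolations(value, `${path}/properties/${key}`));
  }
  problems.push(...findStrictViolations(schema["items"], `${path}/items`));
  const anyOf = schema["anyOf"];
  if (Array.isArray(anyOf)) {
    anyOf.forEach((branch, index) => {
      problems.push(...findStrictViolations(branch, `${path}/anyOf/${index}`));
    });
  }
  return problems;
}

// An optional field left out of `required`, as a generic converter might emit it.
const looseSchema: Schema = {
  type: "object",
  properties: { name: { type: "string" }, nickname: { type: "string" } },
  required: ["name"],
  additionalProperties: false,
};

// The same field expressed as required but nullable.
const strictSchema: Schema = {
  type: "object",
  properties: { name: { type: "string" }, nickname: { type: ["string", "null"] } },
  required: ["name", "nickname"],
  additionalProperties: false,
};

const looseProblems = findStrictViolations(looseSchema);
const strictProblems = findStrictViolations(strictSchema);
console.log(looseProblems, strictProblems);
```

Running a check like this on the schema LangChain produces, and on the one produced by OpenAI's helpers, is one way to see where the two converters diverge.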

System Info

@langchain/core 0.2.21
openai 4.55.3
node v22.2.0
pnpm v8.12.0


dosubot · 2024-08-09T15:21:05Z

Hey there, @airhorns! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together.

To address the issue of LangChain's Zod to JSON Schema conversion not using OpenAI's custom logic, you can update LangChain to utilize OpenAI's custom Zod to JSON Schema converter. This will improve compatibility and reduce errors when using OpenAI's new structured output support.

Here's how you can modify the createStructuredOutputChainFromZod function to use the zodToJsonSchema function for converting Zod schemas to JSON schemas:

/** @deprecated Use {@link https://v02.api.js.langchain.com/functions/langchain_chains_openai_functions.createStructuredOutputRunnable.html | createStructuredOutputRunnable} instead */
export function createStructuredOutputChainFromZod<T extends z.AnyZodObject>(
  zodSchema: T,
  input: Omit<StructuredOutputChainInput<T>, "outputSchema">
) {
  return createStructuredOutputChain<T>({
    ...input,
    outputSchema: zodToJsonSchema(zodSchema),
    zodSchema,
  });
}

Additionally, you can refer to the createMetadataTaggerFromZod function in the langchain/src/document_transformers/openai_functions.ts file, which demonstrates another example of using the zodToJsonSchema function to convert Zod schemas to JSON schemas:

export function createMetadataTaggerFromZod(
  schema: z.AnyZodObject,
  options: TaggingChainOptions & { llm?: ChatOpenAI }
) {
  return createMetadataTagger(
    zodToJsonSchema(schema) as JsonSchema7ObjectType,
    options
  );
}

For a more comprehensive setup, you can look at the createStructuredOutputRunnable function in the langchain/src/chains/openai_functions/base.ts file, which supports both JSON Schema and Zod schema:

export function createStructuredOutputRunnable<
  RunInput extends Record<string, any> = Record<string, any>,
  RunOutput extends Record<string, any> = Record<string, any>
>(
  config: CreateStructuredOutputRunnableConfig<RunInput, RunOutput>
): Runnable<RunInput, RunOutput> {
  const { outputSchema, llm, prompt, outputParser } = config;
  const jsonSchema = isZodSchema(outputSchema)
    ? zodToJsonSchema(outputSchema)
    : outputSchema;
  const oaiFunction: FunctionDefinition = {
    name: "outputFormatter",
    description:
      "Output formatter. Should always be used to format your response to the user",
    parameters: jsonSchema,
  };
  return createOpenAIFnRunnable({
    functions: [oaiFunction],
    llm,
    prompt,
    enforceSingleFunctionUsage: true,
    outputParser,
  });
}

This function checks if the provided schema is a Zod schema and converts it to JSON schema if necessary, ensuring compatibility with OpenAI's structured output support ^[1]^[2]^[3].


bracesproul · 2024-08-09T16:17:53Z

Hey, we're implementing this in #6438

airhorns · 2024-08-09T17:49:54Z

Great! That looks like it addresses the response format for top-level structured output, but there's the same problem for function calling too for the function's input schema. Is that on y'alls radar as well?

bracesproul · 2024-08-09T20:02:26Z

Ah interesting, I wasn't aware they made other changes to how they handle schemas for normal tool calling. This is now on my radar and I'll get it in soon (likely sometime next week). Thank you!

airhorns · 2024-08-16T20:52:08Z

Can we keep this open until langchain tool calling supports the feature as well? For me at least that's the important one because it allows selection among multiple tools!

airhorns · 2024-08-23T15:57:25Z

ping @bracesproul sorry to bother but I do think there's a missing bit for proper support for strict OpenAI functions using their zod helpers for their special handling!

bracesproul · 2024-08-23T16:17:46Z

@airhorns what is the missing feature you're referring to?

airhorns · 2024-08-23T16:39:35Z

OpenAI uses a special zod-to-json-schema converter for both top-level model structured outputs as well as tool calling. AFAICT, Langchain now uses OpenAI's zodResponseFormat to do this conversion for top level .withStructuredOutput calls, which works great for that. But, Langchain doesn't use OpenAI's zodFunction for tool calls, like model.bindTools(...).invoke(...). That breaks for various tools in strict mode because langchain's internal zod-to-json-schema behaves differently and emits JSON schemas that aren't compatible with OpenAI.

I think the most reliable fix would be to make the same change and be a client of OpenAI's special zod-to-json-schema logic when converting zod schemas to tool calls, via zodFunction.

I've been using this utility function to convert LangChain tools to openai's internal function helper objects:

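import { zodFunction } from "openai/helpers/zod";
import { StructuredTool } from "@langchain/core/tools";
// Assumption: memoize comes from lodash; the original comment does not say.
import memoize from "lodash/memoize";

// Wrap a LangChain tool in OpenAI's helper so the schema goes through OpenAI's own converter.
export const openaiFunctionForLangchainTool = memoize((tool: StructuredTool<any>) => {
  return zodFunction({
    name: tool.name,
    parameters: tool.schema!,
    function: (args) => tool.invoke(args),
    description: tool.description,
  });
});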

and you folks could do the same thing and get their special schema with zodFunction({name: tool.name, parameters: tool.schema}).function.parameters

bracesproul · 2024-08-23T17:27:28Z

@airhorns we are using OpenAI's zodFunction util to convert tools passed in if they contain a Zod schema. See the _convertToOpenAITool function, which is called from the _convertChatOpenAIToolTypeToOpenAITool function.

Could you provide me with some code to reproduce the error you're referencing?

airhorns · 2024-08-23T20:05:49Z

Ah, sorry I missed that, I only saw the other PR! Never mind me then, thanks!

dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Aug 9, 2024

bracesproul mentioned this issue Aug 9, 2024

openai[minor]: Add support for json schema response format #6438

Merged

bracesproul closed this as completed in #6438 Aug 16, 2024
