Zod to JSONSchema conversion doesn't use openai custom logic for openai #6479

Closed
5 tasks done
airhorns opened this issue Aug 9, 2024 · 10 comments · Fixed by #6438
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@airhorns

airhorns commented Aug 9, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

OpenAI's new structured output support requires compiling Zod schemas to JSON Schema, which LangChain already knows how to do. However, OpenAI's Node client uses a specialized, forked version of zod-to-json-schema internally for this conversion, to improve the compatibility of the generated JSON schemas with what their API supports server side. Specifically, they don't use relative references, which LangChain does, because those aren't supported server side. OpenAI's client also handles nullability differently.

You can see their custom converter here: https://github.com/openai/openai-node/blob/cc13af9fa7e76e774d3132bba2427bb0176bd622/src/helpers/zod.ts#L11-L19

and some commits that added the custom logic: openai/openai-node@fc22483 and openai/openai-node@fc22483

As it stands now, LangChain's conversion of tool or structured output schemas doesn't use this fancy logic, and breaks in many more cases. If you re-use a chunk of schema within a bigger, outer schema, LangChain triggers an OpenAI 400 error, and the OpenAI client doesn't.
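To make the failure mode concrete, here is a minimal sketch of what a re-used chunk of schema can look like once converted. The schema below is a hand-written approximation, not output captured from LangChain or zod-to-json-schema, and `findNonDefinitionRefs` is a hypothetical helper. Which reference forms the API accepts is not stated in this issue, so treating only definitions-style references as safe is an assumption.

```typescript
// Minimal JSON Schema shape used for this illustration only.
type JsonSchema = { [key: string]: unknown };

// ASSUMPTION: hand-written approximation of what a generic converter may emit
// when the same object schema is used twice. The second use points back at
// the first with a `$ref` instead of repeating the definition.
const schemaWithReusedChunk: JsonSchema = {
  type: "object",
  properties: {
    billing: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
      additionalProperties: false,
    },
    shipping: { $ref: "#/properties/billing" },
  },
  required: ["billing", "shipping"],
  additionalProperties: false,
};

// Hypothetical helper: collect every `$ref` that does not point into a
// top-level definitions section (`#/$defs/...` or `#/definitions/...`).
function findNonDefinitionRefs(node: unknown, found: string[] = []): string[] {
  if (Array.isArray(node)) {
    for (const item of node) findNonDefinitionRefs(item, found);
  } else if (typeof node === "object" && node !== null) {
    for (const [key, value] of Object.entries(node)) {
      if (key === "$ref" && typeof value === "string") {
        const isDefinitionRef =
          value.startsWith("#/$defs/") || value.startsWith("#/definitions/");
        if (!isDefinitionRef) found.push(value);
      } else {
        findNonDefinitionRefs(value, found);
      }
    }
  }
  return found;
}

const suspiciousRefs = findNonDefinitionRefs(schemaWithReusedChunk);
console.log(suspiciousRefs); // logs [ '#/properties/billing' ]
```

Running a check like this over the JSON Schema that is about to be sent can help narrow down which part of a tool schema is responsible for a 400 before involving the API at all.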

It'd be great if LangChain could be a client of OpenAI's zodResponseFormat or zodFunction helpers instead of doing its own conversion, or something like that.

Error Message and Stack Trace (if applicable)

No response

Description

I'm trying to use LangChain to call a tool with the same inner bit of Zod schema repeated more than once, and it fails with OpenAI's new strict: true parameter when OpenAI's client doesn't.

System Info

@langchain/core 0.2.21
openai 4.55.3
node v22.2.0
pnpm v8.12.0

@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Aug 9, 2024

dosubot bot commented Aug 9, 2024

Hey there, @airhorns! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together.

To address the issue of LangChain's Zod to JSON Schema conversion not using OpenAI's custom logic, you can update LangChain to utilize OpenAI's custom Zod to JSON Schema converter. This will improve compatibility and reduce errors when using OpenAI's new structured output support.

Here's how you can modify the createStructuredOutputChainFromZod function to use the zodToJsonSchema function for converting Zod schemas to JSON schemas:

/** @deprecated Use {@link https://v02.api.js.langchain.com/functions/langchain_chains_openai_functions.createStructuredOutputRunnable.html | createStructuredOutputRunnable} instead */
export function createStructuredOutputChainFromZod<T extends z.AnyZodObject>(
  zodSchema: T,
  input: Omit<StructuredOutputChainInput<T>, "outputSchema">
) {
  return createStructuredOutputChain<T>({
    ...input,
    outputSchema: zodToJsonSchema(zodSchema),
    zodSchema,
  });
}

Additionally, you can refer to the createMetadataTaggerFromZod function in the langchain/src/document_transformers/openai_functions.ts file, which demonstrates another example of using the zodToJsonSchema function to convert Zod schemas to JSON schemas:

export function createMetadataTaggerFromZod(
  schema: z.AnyZodObject,
  options: TaggingChainOptions & { llm?: ChatOpenAI }
) {
  return createMetadataTagger(
    zodToJsonSchema(schema) as JsonSchema7ObjectType,
    options
  );
}

For a more comprehensive setup, you can look at the createStructuredOutputRunnable function in the langchain/src/chains/openai_functions/base.ts file, which supports both JSON Schema and Zod schema:

export function createStructuredOutputRunnable<
  RunInput extends Record<string, any> = Record<string, any>,
  RunOutput extends Record<string, any> = Record<string, any>
>(
  config: CreateStructuredOutputRunnableConfig<RunInput, RunOutput>
): Runnable<RunInput, RunOutput> {
  const { outputSchema, llm, prompt, outputParser } = config;
  const jsonSchema = isZodSchema(outputSchema)
    ? zodToJsonSchema(outputSchema)
    : outputSchema;
  const oaiFunction: FunctionDefinition = {
    name: "outputFormatter",
    description:
      "Output formatter. Should always be used to format your response to the user",
    parameters: jsonSchema,
  };
  return createOpenAIFnRunnable({
    functions: [oaiFunction],
    llm,
    prompt,
    enforceSingleFunctionUsage: true,
    outputParser,
  });
}

This function checks if the provided schema is a Zod schema and converts it to JSON schema if necessary, ensuring compatibility with OpenAI's structured output support [1][2][3].


@bracesproul
Collaborator

Hey, we're implementing this in #6438

@airhorns
Author

airhorns commented Aug 9, 2024

Great! That looks like it addresses the response format for top-level structured output, but there's the same problem for function calling too for the function's input schema. Is that on y'alls radar as well?

@bracesproul
Collaborator

bracesproul commented Aug 9, 2024

Ah interesting, I wasn't aware they made other changes to how they handle schemas for normal tool calling. This is now on my radar and I'll get it in soon (likely sometime next week). Thank you!

@airhorns
Author

airhorns commented Aug 16, 2024

Can we keep this open until langchain tool calling supports the feature as well? For me at least that's the important one because it allows selection among multiple tools!

@airhorns
Author

ping @bracesproul sorry to bother but I do think there's a missing bit for proper support for strict OpenAI functions using their zod helpers for their special handling!

@bracesproul
Collaborator

@airhorns what is the missing feature you're referring to?

@airhorns
Author

airhorns commented Aug 23, 2024

OpenAI uses a special zod-to-json-schema converter for both top-level model structured outputs as well as tool calling. AFAICT, Langchain now uses OpenAI's zodResponseFormat to do this conversion for top level .withStructuredOutput calls, which works great for that. But, Langchain doesn't use OpenAI's zodFunction for tool calls, like model.bindTools(...).invoke(...). That breaks for various tools in strict mode because langchain's internal zod-to-json-schema behaves differently and emits JSON schemas that aren't compatible with OpenAI.

I think the most reliable fix would be to make the same change and be a client of OpenAI's special zod-to-json-schema logic when converting zod schemas to tool calls, via zodFunction.

I've been using this utility function to convert LangChain tools to openai's internal function helper objects:

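import memoize from "lodash/memoize"; // assumption: the original snippet does not say where memoize comes from
import { zodFunction } from "openai/helpers/zod";
import type { StructuredTool } from "@langchain/core/tools";

// Convert a LangChain tool into OpenAI's function helper object, cached per tool instance.
export const openaiFunctionForLangchainTool = memoize((tool: StructuredTool<any>) => {
  return zodFunction({
    name: tool.name,
    parameters: tool.schema!,
    function: (args) => tool.invoke(args),
    description: tool.description,
  });
});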

and you folks could do the same thing and get their special schema with zodFunction({name: tool.name, parameters: tool.schema}).function.parameters

@bracesproul
Collaborator

@airhorns we are using OpenAI's zodFunction util to convert tools passed in if they contain a Zod schema. See the _convertToOpenAITool function which is called from the _convertChatOpenAIToolTypeToOpenAITool function.

Could you provide me with some code to reproduce the error you're referencing?
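For readers following along, the dispatch described in the comment above can be sketched roughly as follows. This is not LangChain's actual `_convertToOpenAITool` code, which is not shown in this thread. Every type and function name below is a hypothetical stand-in, and the Zod-specific converter is injected as a parameter so the sketch runs without `zod` or `openai` installed. In real code that parameter would presumably wrap OpenAI's `zodFunction`.

```typescript
// Hypothetical stand-in types for this sketch only.
type PlainJsonSchema = Record<string, unknown>;
type ZodLike = { parse: (input: unknown) => unknown };
type ToolLike = {
  name: string;
  description: string;
  schema: ZodLike | PlainJsonSchema;
};
type OpenAIToolDef = {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: PlainJsonSchema;
    strict?: boolean;
  };
};
type ZodConverter = (
  name: string,
  description: string,
  schema: ZodLike
) => OpenAIToolDef;

// ASSUMPTION: duck-typed check. A Zod schema exposes `parse`, plain JSON does not.
function looksLikeZodSchema(
  schema: ZodLike | PlainJsonSchema
): schema is ZodLike {
  return typeof (schema as ZodLike).parse === "function";
}

// Zod schemas go through the provider-specific converter; anything that is
// already JSON Schema is passed through unchanged.
function toOpenAITool(tool: ToolLike, convertZod: ZodConverter): OpenAIToolDef {
  const { name, description, schema } = tool;
  if (looksLikeZodSchema(schema)) {
    return convertZod(name, description, schema);
  }
  return { type: "function", function: { name, description, parameters: schema } };
}

// Fake converter and fake Zod-like schema, for demonstration only.
const fakeZodSchema: ZodLike = { parse: (input) => input };
const fakeConverter: ZodConverter = (name, description) => ({
  type: "function",
  function: {
    name,
    description,
    parameters: { type: "object", properties: {} },
    strict: true,
  },
});

const fromZod = toOpenAITool(
  { name: "search", description: "Search the web", schema: fakeZodSchema },
  fakeConverter
);
const fromJson = toOpenAITool(
  { name: "echo", description: "Echo input", schema: { type: "object", properties: {} } },
  fakeConverter
);
console.log(fromZod.function.strict, fromJson.function.strict); // logs: true undefined
```

The point of the shape is that only the Zod path needs the provider-specific conversion logic; a tool that already carries JSON Schema is forwarded as given.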

@airhorns
Author

Ah, sorry I missed that, I only saw the other PR! Never mind me then, thanks!
