openai[patch]: ChatOpenAI.with_structured_output json_schema support #25123

Merged: baskaryan merged 13 commits into master from bagatur/oai_json_schema on Aug 7, 2024

Conversation

baskaryan
Collaborator

No description provided.

@efriis efriis added the partner label Aug 7, 2024
@efriis efriis self-assigned this Aug 7, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. 🔌: openai Primarily related to OpenAI integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Aug 7, 2024
@baskaryan baskaryan changed the title WIP: ChatOpenAI.with_structured_output json_schema support openai[patch]: ChatOpenAI.with_structured_output json_schema support Aug 7, 2024
@@ -298,6 +302,8 @@ class _AllReturnType(TypedDict):
class BaseChatOpenAI(BaseChatModel):
    client: Any = Field(default=None, exclude=True)  #: :meta private:
    async_client: Any = Field(default=None, exclude=True)  #: :meta private:
    root_client: Any = Field(default=None, exclude=True)  #: :meta private:
Collaborator

curious why is this necessary?

Collaborator Author

to access the beta apis

Collaborator

@ccurme ccurme left a comment

Looks great. Is the change to root poetry.lock related?

"schema must be specified when method is not 'json_mode'. "
"Received None."
)
strict = strict if strict is not None else True
Collaborator

to confirm: does json_schema with strict=False work?

Collaborator Author

yep
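
For readers following this review thread, here is a minimal sketch of how the default in the diff line above resolves, and of the `json_schema` payload shape that `strict` ends up in. `resolve_strict` and `build_response_format` are hypothetical helpers written for illustration only; they are not functions from langchain-openai, and the payload shape is taken from the OpenAI request example quoted in the commit message further down this page.

```python
from typing import Any, Dict, Optional


def resolve_strict(strict: Optional[bool]) -> bool:
    """Mirror the diff line: an unset `strict` falls back to True."""
    return strict if strict is not None else True


def build_response_format(
    name: str, schema: Dict[str, Any], strict: Optional[bool] = None
) -> Dict[str, Any]:
    """Hypothetical helper: wrap a JSON schema in a `json_schema` response format."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": resolve_strict(strict),
            "schema": schema,
        },
    }


# Leaving `strict` unset yields strict mode; passing False opts out explicitly.
default_format = build_response_format("joke", {"type": "object"})
relaxed_format = build_response_format("joke", {"type": "object"}, strict=False)
```

The point of the `is not None` check is that an explicit `strict=False` is preserved rather than being overwritten by the default, which is the case the reviewer asked about.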

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 7, 2024
@baskaryan baskaryan merged commit 09fbce1 into master Aug 7, 2024
32 checks passed
@baskaryan baskaryan deleted the bagatur/oai_json_schema branch August 7, 2024 15:09
@ricnunespt

In previous versions, I forced the LLM's response to be formatted as JSON by using the bind method to set the response format.
Since the new update, I get an error saying that it cannot access beta.
Is there a way to solve this?

Here is the error:

File "/utils.py", line 114, in call_azure_openai
    res = llm.invoke(messages).content
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 5094, in invoke
    return self.bound.invoke(
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 284, in invoke
    self.generate_prompt(
  File "/usr/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 756, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 613, in generate
    raise e
  File "/usr/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 603, in generate
    self._generate_with_cache(
  File "/usr/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 825, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/langchain_openai/chat_models/base.py", line 629, in _generate
    response = self.root_client.beta.chat.completions.parse(**payload)
               ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'beta'

Here is the code example I am using:

import json
import os

import langchain_openai

LLM = langchain_openai.AzureChatOpenAI(
    openai_api_key=os.getenv("OPENAI_KEY"),
    azure_endpoint=os.getenv("OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("OPENAI_DEPLOYMENT"),
    api_version="2024-02-01",
    model="gpt-4o",
)


def call_azure_openai(messages, json_output=True):
    # Bind JSON mode only when requested; otherwise use the base model,
    # so that `llm` is always defined.
    llm = LLM.bind(response_format={"type": "json_object"}) if json_output else LLM
    res = llm.invoke(messages).content
    return json.loads(res) if json_output else res

@baskaryan
Collaborator Author

hm which version of the openai sdk do you have?

@ricnunespt

Checking the openai version, I get:

Name: openai
Version: 1.40.3

I assumed the SDK version is obtained the same way as any other library's version. Feel free to correct me if there is another way.
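
As a side note for anyone reproducing this, the installed versions can also be read from inside the same Python environment that raises the error, which avoids checking a different interpreter by accident. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package names listed are simply the ones that appear in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions relevant to this thread from the active environment.
    for name in ("openai", "langchain-openai", "langchain-core"):
        print(f"{name}: {installed_version(name)}")
```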

@Dahimi

Dahimi commented Aug 16, 2024

I got the same error

@PaLoic1

PaLoic1 commented Aug 19, 2024

Hi @baskaryan, thanks for adding JSON Schema support to OpenAI. Is it planned to add it to the AzureChat class as well?

@baskaryan
Collaborator Author

a fix for the above Azure OpenAI error and general Azure support should be included in langchain-openai==0.1.22, would love to know if that resolves the issues / works for you @Dahimi @PaLoic1 @ricnunespt

@PaLoic1

PaLoic1 commented Aug 20, 2024

Thanks @baskaryan for your feedback. I checked the code of with_structured_output, and it seems that it does not support json_schema the way the ChatOpenAI class does.

@baskaryan
Collaborator Author

you're right, fixing here #25591

baskaryan added a commit that referenced this pull request Aug 23, 2024
…strict (#25169)

Hello. 
First of all, thank you for maintaining such a great project.

## Description
In #25123, support for
structured_output is added. However, `"additionalProperties": false`
needs to be included at all levels when a nested object is generated.

error from current code:
https://gist.github.com/fufufukakaka/e9b475300e6934853d119428e390f204
```
BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'JokeWithEvaluation': In context=('properties', 'self_evaluation'), 'additionalProperties' is required to be supplied and to be false", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
```

Reference: [Introducing Structured Outputs in the
API](https://openai.com/index/introducing-structured-outputs-in-the-api/)

```json
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}
```

In the current code, `"additionalProperties": false` is only added at
the last level.
This PR introduces the `_add_additional_properties_key` function, which
recursively adds `"additionalProperties": false` to the entire JSON
schema for the request.
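
To illustrate the idea, here is a minimal sketch of such a recursive pass. It is not the PR's actual `_add_additional_properties_key` implementation, which is not shown in this thread; `add_additional_properties_false` is a hypothetical name, and the sketch assumes that every node with `"type": "object"` should receive the key.

```python
from copy import deepcopy
from typing import Any


def _walk(node: Any) -> None:
    """Visit every dict and list in the schema, marking object nodes in place."""
    if isinstance(node, dict):
        if node.get("type") == "object":
            node["additionalProperties"] = False
        for value in node.values():
            _walk(value)
    elif isinstance(node, list):
        for item in node:
            _walk(item)


def add_additional_properties_false(schema: dict) -> dict:
    """Return a copy of `schema` with additionalProperties=False at all object levels."""
    result = deepcopy(schema)  # leave the caller's schema untouched
    _walk(result)
    return result


nested = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"explanation": {"type": "string"}},
                "required": ["explanation"],
            },
        },
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
}
fixed = add_additional_properties_false(nested)
```

Walking all values (rather than only `properties`) means nested objects under `items` are covered too, as in the `steps` array of the reference request above.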

Twitter handle: `@fukkaa1225`

Thank you!

---------

Co-authored-by: Bagatur <[email protected]>