400 Invalid JSON payload while sending an image message to Google Gemini 1.5 model #103

Open
nimish-gupta opened this issue Oct 16, 2024 · 7 comments

Comments

@nimish-gupta

nimish-gupta commented Oct 16, 2024

Description

We encountered an issue when sending an image as part of a message to the Google Gemini 1.5 Flash model using the Braintrust Proxy. The request fails with the following error:

400 Invalid JSON payload received. 
Unknown name "text" at 'contents[1].parts[0]': Proto field is not repeating, cannot start list.

Steps to Reproduce:

  1. Set up the Braintrust Proxy with the Google Gemini 1.5 Flash API using the following code snippet:
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.braintrust.dev/v1/proxy",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [
    { role: "user", content: "What do you think of this image?" },
    { role: "user", content: [{ type: "image_url", image_url: { url: "https://example.com/image.jpg" } }] }
  ],
});
console.log(response);

  2. The request fails with the 400 error mentioned above.

Expected Behaviour:

  1. The message containing the image should be processed successfully by the Gemini 1.5 Flash model via the Braintrust Proxy.
  2. The API should return a valid response describing the image or engaging with the user prompt accordingly.

Impact:

  1. This issue blocks us from sending media (specifically images) along with text messages to the Gemini 1.5 model.
  2. It is affecting the functionality needed for applications involving image and text-based conversation flows.
@ankrgyl
Contributor

ankrgyl commented Oct 16, 2024

Looks like you might be missing an array around the second content block, i.e.

    { role: "user", content: { type: "image_url", image_url: { url: "https://example.com/image.jpg" } } }

should be

    { role: "user", content: [{ type: "image_url", image_url: { url: "https://example.com/image.jpg" } }] }
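The array requirement described above can be guarded against in client code. The following is a minimal sketch, assuming simplified stand-ins for the OpenAI content-part shapes; `normalizeContent` is a hypothetical helper, not part of the OpenAI SDK or the Braintrust proxy.

```typescript
// Simplified stand-ins for the chat content-part shapes used in this thread.
type ImagePart = { type: "image_url"; image_url: { url: string } };
type TextPart = { type: "text"; text: string };
type ContentPart = ImagePart | TextPart;

// Hypothetical helper: accepts a plain string, a single part, or an array of
// parts, and returns a value that is valid for a message's `content` field.
function normalizeContent(
  content: string | ContentPart | ContentPart[],
): string | ContentPart[] {
  if (typeof content === "string" || Array.isArray(content)) {
    return content; // already a valid shape
  }
  return [content]; // wrap a lone part in an array
}

const imageMessage = {
  role: "user" as const,
  content: normalizeContent({
    type: "image_url",
    image_url: { url: "https://example.com/image.jpg" },
  }),
};

// The lone image part is now wrapped in a one-element array.
console.log(JSON.stringify(imageMessage.content));
```

Strings and arrays pass through unchanged, so the helper only alters the one malformed case discussed here.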

@nimish-gupta
Author

nimish-gupta commented Oct 16, 2024

> Looks like you might be missing an array around the second content block, i.e.
>
>     { role: "user", content: { type: "image_url", image_url: { url: "https://example.com/image.jpg" } } }
>
> should be
>
>     { role: "user", content: [{ type: "image_url", image_url: { url: "https://example.com/image.jpg" } }] }

My bad, that was a mistake in the issue description (I have updated it). I am sending the array.

@ankrgyl
Contributor

ankrgyl commented Oct 16, 2024

Can you send an exact repro then? I just tried using the above, and it worked just fine:

In [18]: client = OpenAI(base_url='https://api.braintrust.dev/v1/proxy', api_key=os.environ['BRAINTRUST_API_KEY'])

In [19]: client.chat.completions.create(  model="gemini-1.5-flash-latest",
    ...:   messages=[
    ...:     { "role": "user", "content": "What do you think of this image?" },
    ...:     { "role": "user", "content": [{ "type": "image_url", "image_url": {"url": "https://cdn.prod.website-files.com/64949e4863d96e26a1da8386/64f5f56c78d05cf501922f99_64a2ef9774661044d9755e98_URL%2520-%2520Glossary.png" }}] }
    ...:   ],
    ...: )

Returned

Out[19]: ChatCompletion(id='dbc961c9-b492-4cda-beb1-630ac1b94a4c', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The image is a good visual representation of the different parts of a URL. It clearly labels the protocol, domain name, and extension, making it easy to understand how they work together. The example URL is also relevant and helpful. Overall, it is a simple and effective way to illustrate the concept of URLs.\n\nOne suggestion for improvement could be to use a more visually appealing font or color scheme.', refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1729060733, model='gemini-1.5-flash-latest', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=80, prompt_tokens=268, total_tokens=348))

@nimish-gupta
Author

This is the code that I am using, with the OpenAI Node package:

  const messages = [ {
    role: "user" as const,
    content: [
      {
        type: "image_url" as const,
        image_url: {
          url: "https://cdn.prod.website-files.com/64949e4863d96e26a1da8386/64f5f56c78d05cf501922f99_64a2ef9774661044d9755e98_URL%2520-%2520Glossary.png",
        },
      },
    ],
  }];

  const completion = await client.chat.completions.create({
    model: "gemini-1.5-flash-latest",
    messages,
    logprobs: true,
    temperature: 0,
  });

  console.log(completion);

After this I am getting the error:

[2024-10-16 12:20:39] ERROR: Unhandled rejection
    err: {
      "type": "BadRequestError",
      "message": "400 Invalid JSON payload received. Unknown name \"text\" at 'contents[0].parts[0]': Proto field is not repeating, cannot start list.",
      "stack":
          Error: 400 Invalid JSON payload received. Unknown name "text" at 'contents[0].parts[0]': Proto field is not repeating, cannot start list.
              at Function.generate (~/node_modules/.pnpm/[email protected][email protected]/node_modules/openai/error.js:45:20)
              at OpenAI.makeStatusError (~/node_modules/.pnpm/[email protected][email protected]/node_modules/openai/core.js:275:33)
              at OpenAI.makeRequest (~/node_modules/.pnpm/[email protected][email protected]/node_modules/openai/core.js:318:30)
              at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      "status": 400,
      "headers": {
        "connection": "keep-alive",
        "content-type": "application/json; charset=UTF-8",
        "date": "Wed, 16 Oct 2024 06:50:39 GMT",
        "server-timing": "gfet4t7; dur=59",
        "transfer-encoding": "chunked",
        "x-amz-cf-pop": "DEL54-P6",
        "x-amzn-requestid": "b5178772-ad22-4a3d-9379-a2d100e263f0",
        "x-amzn-trace-id": "Root=1-670f623e-7aacdbad3c9efa58021df12b;Parent=63eae876af5b71e1;Sampled=0;Lineage=1:a00f27e0:0",
        "x-bt-cached": "MISS",
        "x-cache": "Error from cloudfront",
        "x-cached": "false",
        "x-content-type-options": "nosniff",
        "x-frame-options": "SAMEORIGIN",
        "x-xss-protection": "0"
      },
      "error": {
        "type": "Object",
        "message": "Invalid JSON payload received. Unknown name \"text\" at 'contents[0].parts[0]': Proto field is not repeating, cannot start list.",
        "stack":

        "code": 400,
        "status": "INVALID_ARGUMENT",
        "details": [
          {
            "@type": "type.googleapis.com/google.rpc.BadRequest",
            "fieldViolations": [
              {
                "field": "contents[0].parts[0]",
                "description": "Invalid JSON payload received. Unknown name \"text\" at 'contents[0].parts[0]': Proto field is not repeating, cannot start list."
              }
            ]
          }
        ]
      },
      "code": 400
    }
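To find which part of the payload was rejected, the `fieldViolations` entries in the body above can be pulled out programmatically. This is a rough sketch: the interfaces are inferred from the logged error only (they are assumptions, not official SDK types), and `listFieldViolations` is a hypothetical helper.

```typescript
// Shapes inferred from the logged error body; not an official type definition.
interface FieldViolation {
  field: string;
  description: string;
}
interface ErrorDetail {
  "@type": string;
  fieldViolations?: FieldViolation[];
}
interface GoogleErrorBody {
  code: number;
  status: string;
  message: string;
  details?: ErrorDetail[];
}

const BAD_REQUEST_TYPE = "type.googleapis.com/google.rpc.BadRequest";

// Hypothetical helper: collects every field violation from BadRequest details.
function listFieldViolations(body: GoogleErrorBody): FieldViolation[] {
  const violations: FieldViolation[] = [];
  for (const detail of body.details || []) {
    if (detail["@type"] === BAD_REQUEST_TYPE) {
      violations.push(...(detail.fieldViolations || []));
    }
  }
  return violations;
}

// Sample body copied from the `error` field of the log in this comment.
const loggedMessage =
  "Invalid JSON payload received. Unknown name \"text\" at 'contents[0].parts[0]': Proto field is not repeating, cannot start list.";
const sampleBody: GoogleErrorBody = {
  code: 400,
  status: "INVALID_ARGUMENT",
  message: loggedMessage,
  details: [
    {
      "@type": BAD_REQUEST_TYPE,
      fieldViolations: [{ field: "contents[0].parts[0]", description: loggedMessage }],
    },
  ],
};

for (const violation of listFieldViolations(sampleBody)) {
  console.log(violation.field); // contents[0].parts[0]
}
```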

@ankrgyl
Contributor

ankrgyl commented Oct 16, 2024

Hmm it's unfortunately not reproducing for me. Do you have a support agent on the Google side? It may require their help (e.g. maybe your API key is going to a different model than mine):

In [7]: client.chat.completions.create(  model="gemini-1.5-flash-latest",
   ...:   messages=[
   ...:
   ...:     { "role": "user", "content": [{ "type": "image_url", "image_url": {"url": "https://cdn.prod.website-files.com/64949e4863d96e26a1da8386/64f5f56c78d05cf501922f99_64a2ef9774661044d9755e98_URL%2520-%2520Glossary.png" }}] }
   ...:   ],
   ...:   logprobs=True,
   ...:   temperature=0,
   ...: )
Out[7]: ChatCompletion(id='b48b36b2-2a30-420f-80a6-97d00bc44c6f', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Protocol: **http://**\nDomain name: **www.example.com**\nExtension: **blog/what-is-a-url**', refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1729061879, model='gemini-1.5-flash-latest', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=31, prompt_tokens=259, total_tokens=290))

@nimish-gupta
Author

> e.g. maybe your API key is going to a different model than mine

Thanks for pointing this out. I changed the model to gemini-1.5-flash and it started working.

@ankrgyl
Contributor

ankrgyl commented Oct 16, 2024

Chatted a bit with the Gemini team, and they suggest using gemini-1.5-pro and gemini-1.5-flash without the -latest suffix.
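One way to apply that suggestion in client code is to drop the alias before sending the request. A minimal sketch, assuming the only change needed is removing a trailing `-latest`; `stripLatestSuffix` is a hypothetical helper, not something the proxy or the OpenAI SDK provides.

```typescript
// Hypothetical helper: removes a trailing "-latest" alias from a model name
// and leaves every other name untouched.
function stripLatestSuffix(model: string): string {
  const suffix = "-latest";
  return model.endsWith(suffix) ? model.slice(0, -suffix.length) : model;
}

console.log(stripLatestSuffix("gemini-1.5-flash-latest")); // gemini-1.5-flash
console.log(stripLatestSuffix("gemini-1.5-pro")); // gemini-1.5-pro
```

The result can then be passed as the `model` value in `client.chat.completions.create`.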
