Knowledge - Not able to ingest pdf files with images using Azure OpenAI provider. #648

sangee2004 · 2024-11-20T23:07:17Z

Steps to reproduce the problem:

Started the otto server with the environment variables needed for the Azure OpenAI provider. (Note: OPENAI_API_KEY is set to an invalid key.)
Changed the provider for the existing gpt-4o and text-embedding-3-small models to the Azure OpenAI provider.
Created an agent with a local knowledge file, Holiday List_2024.pdf.
Ingestion of this file succeeded.
Added another local knowledge file that contains images, Reunion-Under-The-Stars.pdf.

Ingestion of this file fails with the following error, which shows OPENAI_API_KEY being used to load the documents:
failed to load documents from file "ws://Reunion-Under-The-Stars.pdf" using loader "": error sending image to OpenAI: OpenAI OCR error sending request(s): retry limit (5) exceeded or failed with non-retriable error(s): #1/5: 401 <{ "error": { "message": "Incorrect API key provided: sk-test. You can find your API key at https://platform.openai.com/account/api-keys.", "type": "invalid_request_error", "param": null, "code": "invalid_api_key" } } > (err: )
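The error above embeds an HTTP status and a JSON error body, which is what points at the OpenAI key rather than the Azure configuration. The following is a minimal sketch of pulling those fields out of the message for triage. The helper name `summarize_ocr_error` and the reading of `#1/5` as "attempt 1 of 5" are assumptions made for illustration, not part of otto or its tools.

```python
import json
import re

# Error text copied from the report above.
ERROR_MESSAGE = (
    'failed to load documents from file "ws://Reunion-Under-The-Stars.pdf" '
    'using loader "": error sending image to OpenAI: OpenAI OCR error sending '
    'request(s): retry limit (5) exceeded or failed with non-retriable error(s): '
    '#1/5: 401 <{ "error": { "message": "Incorrect API key provided: sk-test. '
    'You can find your API key at https://platform.openai.com/account/api-keys.", '
    '"type": "invalid_request_error", "param": null, "code": "invalid_api_key" } } > '
    '(err: )'
)

# Matches "#<attempt>/<limit>: <status> <{...json...} >" as it appears in the message.
ATTEMPT_PATTERN = re.compile(r"#(\d+)/(\d+): (\d{3}) <(\{.*\})\s*>", re.DOTALL)


def summarize_ocr_error(message: str) -> dict:
    """Hypothetical helper: extract the HTTP status and API error code."""
    match = ATTEMPT_PATTERN.search(message)
    if match is None:
        raise ValueError("no '#n/m: <status> <{json}>' section found")
    attempt, limit, status, body = match.groups()
    error = json.loads(body)["error"]
    return {
        "attempt": int(attempt),          # assumed meaning: attempt number
        "retry_limit": int(limit),        # assumed meaning: maximum attempts
        "status": int(status),
        "code": error["code"],
        # True when the body refers to the public OpenAI platform.
        "mentions_openai_platform": "platform.openai.com" in error["message"],
    }


if __name__ == "__main__":
    print(summarize_ocr_error(ERROR_MESSAGE))
```

A 401 with code `invalid_api_key` and a reference to platform.openai.com is consistent with the image OCR step calling OpenAI with the invalid OPENAI_API_KEY instead of going through the Azure OpenAI provider.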

Expected Behavior:
Ingestion of the file should succeed using the Azure OpenAI provider model.


cjellick · 2024-12-03T23:01:10Z

@thedadams This should be addressed by the revamp to model listing that you did the backend for and @ryanhopperlowe is doing the frontend for?

thedadams · 2024-12-03T23:18:48Z

Yes, that's correct.

ryanhopperlowe · 2024-12-03T23:18:51Z

UI portion should be fixed by #744

ryanhopperlowe · 2024-12-03T23:22:37Z

#744 is merged to main now.

@thedadams - can you move this to testing when the backend portion is finished? (same goes for #702)

cjellick · 2024-12-03T23:24:57Z

@ryanhopperlowe The backend is already in, right? You would not have been able to do the frontend without it.

sangee2004 · 2024-12-04T19:01:35Z

Tested with latest version

  "github.com/otto8-ai/tools": "4e06ab18c812fddc91fb8ccedfe22ade1d641118",
  "otto": "v0.0.0-dev-13e341a9-dirty"

This issue is still reproducible.
I have created models with azure-openai-model-provider for the text-embedding-3-small and gpt-4o-deployment models and set them as defaults.

{
  "id": "m164kmb",
  "created": "2024-12-04T18:16:44Z",
  "revision": "14",
  "type": "model",
  "name": "gpt-4o-deployment",
  "targetModel": "gpt-4o-deployment",
  "modelProvider": "azure-openai-model-provider",
  "active": true,
  "usage": "llm",
  "configured": true,
  "requiredConfigurationParameters": [
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_ENDPOINT",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_CLIENT_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_CLIENT_SECRET",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_TENANT_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_SUBSCRIPTION_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_RESOURCE_GROUP"
  ],
  "aliasAssigned": false
}

{
  "id": "m1pbdcc",
  "created": "2024-12-04T18:20:21Z",
  "revision": "20",
  "type": "model",
  "name": "text-embedding-3-small",
  "targetModel": "text-embedding-3-small",
  "modelProvider": "azure-openai-model-provider",
  "active": true,
  "usage": "text-embedding",
  "configured": true,
  "requiredConfigurationParameters": [
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_ENDPOINT",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_CLIENT_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_CLIENT_SECRET",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_TENANT_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_SUBSCRIPTION_ID",
    "OTTO8_AZURE_OPENAI_MODEL_PROVIDER_RESOURCE_GROUP"
  ],
  "aliasAssigned": false
}
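Both records above show the models as active and configured on `azure-openai-model-provider`, which suggests the failure lies outside the model configuration itself. As a rough sketch, that check can be automated against trimmed copies of the two records. The function name `models_off_provider` is hypothetical, and how the records are fetched from the server is left out because the report does not show it.

```python
import json

# Trimmed copies of the two model records from the report
# (only the fields checked here are kept).
MODELS_JSON = """
[
  {"name": "gpt-4o-deployment", "modelProvider": "azure-openai-model-provider",
   "active": true, "usage": "llm", "configured": true},
  {"name": "text-embedding-3-small", "modelProvider": "azure-openai-model-provider",
   "active": true, "usage": "text-embedding", "configured": true}
]
"""


def models_off_provider(models: list[dict], provider: str) -> list[str]:
    """Hypothetical check: names of models not active and configured on `provider`."""
    return [
        m["name"]
        for m in models
        if m.get("modelProvider") != provider
        or not m.get("active")
        or not m.get("configured")
    ]


if __name__ == "__main__":
    models = json.loads(MODELS_JSON)
    print(models_off_provider(models, "azure-openai-model-provider"))
```

For the two records in this report the function returns an empty list, so neither model is misassigned or unconfigured.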

I see this error when ingesting the knowledge file Reunion-Under-The-Stars.pdf:

failed to load documents from file "ws://Reunion-Under-The-Stars.pdf" using loader "": error sending image to OpenAI: OpenAI OCR error sending request(s): retry limit (5) exceeded or failed with non-retriable error(s): #1/5: 401 <{ "error": { "message": "Incorrect API key provided: sk-test. You can find your API key at https://platform.openai.com/account/api-keys.", "type": "invalid_request_error", "param": null, "code": "invalid_api_key" } } > (err: <nil>)

Note: I am starting the otto server with an invalid OpenAI API key in this case.

Ingesting Holiday List_2024.pdf succeeds.

thedadams · 2024-12-05T02:15:36Z

The final fix was: obot-platform/tools#261

sangee2004 · 2024-12-05T20:14:51Z

This issue is no longer seen when testing with the latest builds:

Default models were set to Azure OpenAI provider models for all model aliases.

Able to successfully ingest PDF files with images.

Was also able to ingest knowledge files from a website successfully.

sangee2004 added the bug and knowledge labels Nov 20, 2024

sangee2004 changed the title ~~Knowledge - Not able to ingest pdf files (with images) with Azure OpenAI provider.~~ Knowledge - Not able to ingest pdf files with images using Azure OpenAI provider. Nov 20, 2024

thedadams self-assigned this Nov 21, 2024

cjellick assigned ryanhopperlowe Dec 3, 2024

sangee2004 closed this as completed Dec 5, 2024
