DALL-E Tool not responding with image #2246

perzeuss · 2024-01-26T17:22:27Z

Self Checks

I have searched for existing issues, including closed ones.
I confirm that I am using English to file this report (I have read and agree to the Language Policy).

Dify version

0.5.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

1. Create an Agent Assistant.
2. Add the DALL-E 3 tool (set up auth if you have not already).
   - The same issue occurs with DALL-E 2, so you can also use that to reproduce the bug.
3. Define the system prompt "You are an Agent capable of generating images with the tool DALL-E. Generated images will be sent to the user as an answer and not via mail."
   - You can leave the prompt empty; I only set it for debugging.
4. Choose the model "gpt-4-turbo-preview" and select the "Precise" preset.
   - The same issue occurs with other models, e.g. gpt-3.5-turbo.
5. Send the message "generate an image with a desk".

✔️ Expected Behavior

Agent responds with an image generated by DALL-E 3

❌ Actual Behavior

Agent responds with an empty image.

Tool data:

Request to dalle3
{"dalle3": {"prompt": "desk"}}
Response from dalle3
image has been created and sent to user already, you should tell user to check it now.

Hint: I can see in https://platform.openai.com/usage that DALL-E api has been called and images have been generated.


dosubot · 2024-01-26T17:23:00Z

🤖

Hey @perzeuss, great to cross paths with you again! 🚀 Thanks for providing such a detailed report about the DALL-E issue you're experiencing. I'm diving into it now and will be back with you in a jiffy with some answers. 😺 Hang tight!

perzeuss · 2024-01-26T17:27:17Z

Another user faced this issue, see https://discord.com/channels/1082486657678311454/1107330913139970128/1200485873318383677

perzeuss · 2024-01-26T17:28:49Z

I tested it on cloud and the tool does respond with the same text, but the image is visible here:

perzeuss · 2024-01-26T17:44:35Z

I did more debugging.

API Logs:

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

It looks like the image has been generated. I also see that the frontend tries to fetch the image from /files/tools/76e861ee-45b4-42d0-b412-dee37dea5f51.png?timestamp=1706289633&nonce=...&sign=..., but the server responds with 404.
I do not see any files in the configured S3. Should the tool place DALL-E images there?
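
A 404 on a path like this can come either from the reverse proxy or from the API service behind it. A minimal Python sketch for telling the two apart is shown here. It assumes you can reach both the proxy and the API container directly; the helper names `fetch_status` and `diagnose` are hypothetical and not part of Dify, and the host names in the usage comment are placeholders.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is what we compare.
        return err.code


def diagnose(proxy_status: int, direct_status: int) -> str:
    """Compare the status seen through the proxy with the one seen at the API."""
    if proxy_status == 404 and direct_status != 404:
        return "proxy is not forwarding this path to the API"
    if proxy_status == 404 and direct_status == 404:
        return "the API itself cannot find the file"
    return "no routing problem detected for this path"


# Usage (placeholder hosts, run from a machine that can reach both):
#   path = "/files/tools/<file-id>.png?timestamp=...&nonce=...&sign=..."
#   print(diagnose(fetch_status("http://<nginx-host>" + path),
#                  fetch_status("http://<api-host>:<api-port>" + path)))
```

If the proxy returns 404 while the API returns anything else for the same path, the routing layer is the more likely culprit than the tool or the storage backend.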

dosubot · 2024-01-26T18:18:08Z

🤖

Based on the information you've provided, it seems like the issue might be related to how the DALL-E tool is handling the response from the OpenAI API. In the provided code, the response from the OpenAI API is expected to be in 'b64_json' format. However, the 'b64_json' response format is not a standard response format for the OpenAI API.

The OpenAI API typically returns the generated image in a binary format, which is then base64 encoded by the client. In the provided code, it seems like the response is expected to be a base64 encoded JSON object, which might not be the case.

Here's the problematic part of the code:

response = client.images.generate(
    prompt=prompt,
    model='dall-e-3',
    size=size,
    n=n,
    style=style,
    quality=quality,
    response_format='b64_json'  # This might not be a valid response format
)

result = []
for image in response.data:
    result.append(self.create_blob_message(blob=b64decode(image.b64_json),
                                           meta={'mime_type': 'image/png'},
                                           save_as=self.VARIABLE_KEY.IMAGE.value))

The 'b64_json' in the 'b64decode(image.b64_json)' line might not be the correct attribute to access the base64 encoded image data. You might need to check the actual structure of the response object to find the correct attribute.

Please note that this is just a hypothesis based on the provided code and the typical behavior of the OpenAI API. The actual issue might be different based on the specific implementation of the DALL-E tool and the OpenAI API client.

As for the expected format of the response from the DALL-E tool in the Dify codebase, it is a list of ToolInvokeMessage objects. Each object represents one image generated by the DALL-E tool. The image data is stored in the blob field as base64-decoded binary data, and the meta field contains a dictionary whose mime_type key has the value 'image/png', indicating that the image data is in PNG format.

Here is the relevant code that constructs the response:

result = []

for image in response.data:
    result.append(self.create_blob_message(blob=b64decode(image.b64_json),
                                           meta={'mime_type': 'image/png'},
                                           save_as=self.VARIABLE_KEY.IMAGE.value))

return result

In this code, response.data is the response from the OpenAI DALL-E tool. For each image in the response, a ToolInvokeMessage object is created with the base64 decoded image data and the mime type 'image/png'. These ToolInvokeMessage objects are then added to the result list which is returned as the final response.
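
For reference, the OpenAI Images API documents `url` and `b64_json` as values for `response_format`, so the decoding step can be checked in isolation. The sketch below is not Dify code: `decode_png` is a hypothetical helper, and the sample payload is a stand-in for `image.b64_json`. It decodes the base64 string and verifies that the result starts with the PNG signature.

```python
from base64 import b64decode, b64encode

# The fixed 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png(b64_payload: str) -> bytes:
    """Decode a base64 string and confirm the result looks like a PNG."""
    raw = b64decode(b64_payload, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return raw


# Stand-in for image.b64_json: the signature plus dummy bytes, not a real image.
sample = b64encode(PNG_SIGNATURE + b"dummy image body").decode("ascii")
blob = decode_png(sample)
print(f"decoded {len(blob)} bytes, PNG signature present")
```

If a check like this passes on the real payload, the blob handed to `create_blob_message` is probably fine and the problem lies further downstream.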

You can find more details in the Dify codebase, specifically in the dalle2.py and dalle3.py files.

I hope this helps in identifying the issue. If you need further assistance, please provide more details about the specific implementation of the DALL-E tool and the OpenAI API client in your setup.


perzeuss · 2024-01-26T18:43:06Z

Fixed for me by adding a missing entry in the nginx conf (etc/nginx/conf.d/default.conf).

    location /files {
      proxy_pass http://api:5001;
      include proxy.conf;
    }
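
To check whether an existing config already has this entry, a rough Python sketch like the following may help. It is an assumption-laden text scan, not a real nginx parser: `location_paths` and `has_files_route` are hypothetical helpers, they only understand simple `location <path> {` lines, and the sample config (including the `web:3000` upstream) is illustrative rather than copied from any repo.

```python
import re

# Matches "location [modifier] <path> {" at the start of a line.
LOCATION_RE = re.compile(
    r"^\s*location\s+(?:(?:=|\^~|~\*|~)\s+)?(\S+)\s*\{",
    re.MULTILINE,
)


def location_paths(conf_text: str) -> list[str]:
    """Return the path argument of every location block found in the text."""
    return [match.group(1) for match in LOCATION_RE.finditer(conf_text)]


def has_files_route(conf_text: str) -> bool:
    """True if some location block targets /files (with or without a trailing slash)."""
    return any(path.rstrip("/") == "/files" for path in location_paths(conf_text))


# Illustrative config only; a real default.conf will differ.
OLD_CONF = """
server {
    location /api {
      proxy_pass http://api:5001;
      include proxy.conf;
    }
    location / {
      proxy_pass http://web:3000;
      include proxy.conf;
    }
}
"""

NEW_CONF = OLD_CONF.replace(
    "    location / {",
    "    location /files {\n      proxy_pass http://api:5001;\n"
    "      include proxy.conf;\n    }\n    location / {",
)

print(has_files_route(OLD_CONF), has_files_route(NEW_CONF))
```

Without a `/files` location, a request for `/files/tools/...` falls through to whichever block matches `/`, which would explain a 404 even though the image was generated.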

I used https://github.com/BorisPolonsky/dify-helm to set up Dify, and the nginx setup there is out of date. I opened a PR to resolve this: BorisPolonsky/dify-helm#29

Kvieta1990 · 2024-01-26T18:54:58Z

Thank you so much for the quick fix and I can confirm that in my case it is working like a charm!

FYI, I was using Docker Compose to start the server. I added the missing section suggested by @perzeuss to the docker/nginx/conf.d/default.conf file (following the directory structure of the main repo) and reran docker compose.

crazywoola · 2024-01-27T03:55:11Z

Define the system prompt "You are an Agent capable of generating images with the tool DALL-E. Generated images will be sent to the user as an answer and not via mail."

How about removing this from the system prompt? It seems unnecessary to me.

perzeuss · 2024-01-29T15:09:10Z

Define the system prompt "You are an Agent capable of generating images with the tool DALL-E. Generated images will be sent to the user as an answer and not via mail."

How about removing this from the system prompt? It seems unnecessary to me.

I only did that for testing, because I initially thought the agent was doing the wrong thing because of the tool response "image has been created and sent to user already, you should tell user to check it now."

However, the issue can be closed. We just had to update the nginx config, and there was no migration hint about that.

perzeuss added the 🐞 bug Something isn't working label Jan 26, 2024

perzeuss closed this as completed Jan 29, 2024
