ERROR Image size, could be decompression bomb DOS attack #217

Open
undernightcore opened this issue Sep 20, 2023 · 3 comments
@undernightcore

I've apparently surpassed the image size limit that PIL has by default.
[Screenshot: error trace]

The limit is configurable, as indicated in the PIL docs.
[Screenshot: PIL docs]
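
For readers hitting the same error, here is a minimal sketch of raising the limit through Pillow's module-level `Image.MAX_IMAGE_PIXELS` setting. The `pixel_limit` helper and the 500,000,000 value are illustrative assumptions, not part of unstructured or Pillow, and the setting only takes effect in the process that actually opens the image.

```python
from contextlib import contextmanager

from PIL import Image


@contextmanager
def pixel_limit(limit):
    """Temporarily override Pillow's decompression-bomb threshold.

    `limit` is a pixel count, or None to disable the check entirely
    (only reasonable for files you trust).
    """
    previous = Image.MAX_IMAGE_PIXELS
    Image.MAX_IMAGE_PIXELS = limit
    try:
        yield
    finally:
        # Restore the earlier threshold so other code keeps its protection.
        Image.MAX_IMAGE_PIXELS = previous


# Hypothetical usage: raise the limit only while handling a trusted document.
with pixel_limit(500_000_000):
    pass  # open or partition the trusted file here
```

Scoping the change with a context manager keeps the default protection in place for any untrusted input processed elsewhere in the same program.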

@cragwolfe
Contributor

@undernightcore, thanks for reporting the trace and highlighting the issue. It looks like one of the PDF pages exceeded this limit in pixels. As this is a pretty high limit, can you share more about the PDF?
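
As a rough illustration of how a single page can cross the threshold, the sketch below estimates the pixel count of a rasterized page and compares it with Pillow's default limit. Pillow warns above the limit and raises the error above twice the limit. The function names and the DPI values are assumptions for illustration; the DPI that unstructured actually renders at is not stated in this thread.

```python
# Pillow's default threshold, as defined in PIL.Image.
DEFAULT_MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3


def rendered_pixels(width_pt, height_pt, dpi):
    """Pixel count of a page rasterized at `dpi`.

    Page dimensions are in PDF points (1 point = 1/72 inch).
    """
    width_px = round(width_pt / 72 * dpi)
    height_px = round(height_pt / 72 * dpi)
    return width_px * height_px


def classify(pixels, limit=DEFAULT_MAX_IMAGE_PIXELS):
    """Mirror Pillow's check: warning above the limit, error above twice the limit."""
    if pixels > 2 * limit:
        return "error"
    if pixels > limit:
        return "warning"
    return "ok"


# Hypothetical example: an A4 page versus a very large poster-sized page at 200 DPI.
print(classify(rendered_pixels(595, 842, 200)))
print(classify(rendered_pixels(5000, 5000, 200)))
```

Because the count grows with the square of the DPI, an oversized page or a high rendering resolution can cross the limit even when the PDF file itself is small.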

@undernightcore
Author

Sure, it does include pretty big images. This is the PDF it fails to parse - Mercadona Anual Report

@christinestraub christinestraub self-assigned this Oct 16, 2023
@christinestraub
Contributor

Hi @undernightcore, I tried to reproduce this error with the doc (Mercadona Anual Report) but couldn't; it works fine for me.

Local Testing Environment

OS: macOS intel x64
unstructured: 0.10.24
unstructured-inference: 0.7.5 

API Testing

import requests

url = "https://api.unstructured.io/general/v0/general"

# Content-Type is deliberately not set here: when `files=` is passed,
# requests builds the multipart/form-data header with the boundary itself.
headers = {
    "accept": "application/json",
    "unstructured-api-key": "<YOUR API KEY>",
}

data = {
    "coordinates": "true",
}

file_path = "/Path/To/File"

# The context manager closes the file even if the request raises.
with open(file_path, "rb") as f:
    response = requests.post(url, headers=headers, files={"files": f}, data=data)

json_response = response.json()

Can you please try again? If you still get the same error on the doc, can you please share your working environment so I can investigate this issue for you?
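
One way to collect the environment details requested above is a short script using only the standard library. This is a sketch: the `environment_report` helper and its default package list are assumptions chosen to match the versions quoted earlier in this comment.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("unstructured", "unstructured-inference", "Pillow")):
    """Return the interpreter, OS, and installed versions of the given packages."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Record missing packages instead of failing, so the report is always complete.
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```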
