Loading a pdf results in a StopIteration error #383

Open
charlescearl opened this issue Nov 19, 2024 · 5 comments
Labels
bug Something isn't working

@charlescearl

Bug

Running spacy-layout on an Apple M3 Pro with 36 GB of memory.
Python version 3.11.7

The following code is run in a Python Jupyter notebook:

from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("a4b3a1f45daf416a950584c918f0a007.pdf", max_file_size=10)

Here, a4b3a1f45daf416a950584c918f0a007.pdf is a 33-page, 1.6 MB PDF containing text, pictures, and tables.
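
One thing that may be worth ruling out: if `max_file_size` is compared against the file size in bytes (an assumption, not confirmed anywhere in this thread), then a limit of 10 would reject a 1.6 MB file before any conversion happens. The sketch below uses only the standard library; `exceeds_limit` is a hypothetical helper and the temporary file is a placeholder, not the real document.

```python
import os
import tempfile


def exceeds_limit(path: str, max_bytes: int) -> bool:
    """Return True when the file at `path` is larger than `max_bytes` bytes."""
    return os.path.getsize(path) > max_bytes


# Write a small placeholder file so the check can run anywhere.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4 placeholder bytes")
    sample_path = tmp.name

# A limit of 10 bytes rejects even this tiny placeholder.
too_big = exceeds_limit(sample_path, 10)
# A limit of 20 MiB would comfortably admit a 1.6 MB document.
fits = exceeds_limit(sample_path, 20 * 1024 * 1024)

os.remove(sample_path)
print(too_big, fits)
```

If the limit does turn out to be in bytes, passing something like `20 * 1024 * 1024` instead of `10` would be the thing to try.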

The following error occurs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv/lib/python3.11/site-packages/pydantic/validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/pydantic/_internal/_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/docling/document_converter.py", line 161, in convert
    return next(all_res)
           ^^^^^^^^^^^^^
StopIteration
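
For anyone reading the traceback: `next(all_res)` raises a bare `StopIteration` whenever the iterator it is given produces no items at all. The sketch below shows that mechanism in plain Python. `fake_results` and `first_or_none` are hypothetical stand-ins, not docling functions, and the sketch does not explain why the real iterator is empty for this PDF.

```python
from typing import Iterable, Iterator, Optional


def fake_results(accepted: Iterable[str]) -> Iterator[str]:
    """Hypothetical result generator: yields one item per accepted input."""
    for name in accepted:
        yield f"converted:{name}"


def first_or_none(results: Iterator[str]) -> Optional[str]:
    """Return the first result, or None when the iterator is empty."""
    return next(results, None)


# An empty generator reproduces the bare StopIteration from the traceback.
try:
    next(fake_results([]))
    raised = False
except StopIteration:
    raised = True

print(raised)
print(first_or_none(fake_results([])))
print(first_or_none(fake_results(["a.pdf"])))
```

Passing a default to `next()` turns the empty case into a value the caller can check, which would allow a more descriptive error message than `StopIteration`.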

Steps to reproduce

Start a Python 3.11.7 interpreter in an environment where docling is installed.
Run the following:

from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("a4b3a1f45daf416a950584c918f0a007.pdf", max_file_size=10)

Docling version

Docling version: 2.5.2
Docling Core version: 2.4.0
Docling IBM Models version: 2.0.3
Docling Parse version: 2.0.4

Python version

Python 3.11.7

@charlescearl charlescearl added the bug Something isn't working label Nov 19, 2024
@dolfim-ibm
Contributor

Thanks for the report. We are not able to reproduce the same error. However, while trying it out, we found another small bug, which is fixed in #388.

Could you please check again after the fix in #388? If the error persists, can you share the document?

@cau-git
Contributor

cau-git commented Nov 20, 2024

@charlescearl could you share the document you see trouble with, or is that private/confidential data?

@charlescearl
Author

Hi @cau-git. I am unfortunately seeing the same error with the new version:

Python 3.12.0 (main, Oct  2 2023, 20:56:14) [Clang 16.0.3 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from docling.document_converter import DocumentConverter
>>> converter = DocumentConverter()
>>> result = converter.convert("a4b3a1f45daf416a950584c918f0a007.pdf")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv/lib/python3.12/site-packages/pydantic/validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/docling/document_converter.py", line 170, in convert
    return next(all_res)
           ^^^^^^^^^^^^^
StopIteration
$ docling --version
Docling version: 2.6.0
Docling Core version: 2.4.0
Docling IBM Models version: 2.0.5
Docling Parse version: 2.0.4

Unfortunately, the document is proprietary -- checking today whether it can be released.
Thanks for the help though.

@PeterStaar-IBM
Contributor

@charlescearl If you want, you can share it via the email in the CONTRIBUTORS.md file.

@niderhoff

I am also receiving StopIteration, but I am using a text file:

alembic_autogen_error.txt

Traceback (most recent call last):
  File "/Users/A78751003/projects/genai/viper/src/viper/app/app.py", line 155, in upload_file
    result = converter.convert(source)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/pydantic/validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/docling/document_converter.py", line 170, in convert
    return next(all_res)
           ^^^^^^^^^^^^^
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/A78751003/projects/genai/viper/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: coroutine raised StopIteration
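
The final `RuntimeError` here is standard Python behaviour rather than a second bug: a `StopIteration` that escapes a coroutine is converted into `RuntimeError: coroutine raised StopIteration`. A minimal sketch of that conversion, and of one way to guard an async handler against it, follows. The handler names are hypothetical and the empty iterator stands in for whatever the converter returned.

```python
import asyncio
from typing import Iterator


def first_result(results: Iterator[str]) -> str:
    """Mimics `return next(all_res)`: raises StopIteration on an empty iterator."""
    return next(results)


async def upload_file() -> str:
    # StopIteration escapes the coroutine and is converted to RuntimeError.
    return first_result(iter([]))


async def upload_file_guarded() -> str:
    # Catch the empty case and raise an error that says what went wrong.
    try:
        return first_result(iter([]))
    except StopIteration:
        raise ValueError("conversion produced no result") from None


def run_and_describe(coro) -> str:
    """Run a coroutine and return either 'ok' or a description of the exception."""
    try:
        asyncio.run(coro)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


unguarded = run_and_describe(upload_file())
guarded = run_and_describe(upload_file_guarded())
print(unguarded)
print(guarded)
```

Translating the `StopIteration` at the call site keeps the web framework from reporting an opaque `RuntimeError`, although it does not address why no result was produced for the text file.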
