revert unstructured-client pin and make pip-compile #3298

Merged
merged 17 commits into from
Jul 2, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,8 @@
## 0.14.10-dev5

### Enhancements
* **Update unstructured-client dependency** Change the unstructured-client dependency pin back to
a minimum-version constraint (`>=0.15.1`) and update the tests that were failing after the update.

* **`.doc` files are now supported in the `arm64` image.** `libreoffice24` is added to the `arm64` image, meaning `.doc` files are now supported. We have follow-on work planned to investigate adding `.ppt` support for `arm64` as well.

40 changes: 35 additions & 5 deletions requirements/base.txt
@@ -4,13 +4,19 @@
#
# pip-compile ./base.in
#
anyio==3.7.1
# via
# -c ././deps/constraints.txt
# httpx
backoff==2.2.1
# via -r ./base.in
beautifulsoup4==4.12.3
# via -r ./base.in
certifi==2024.6.2
# via
# -c ././deps/constraints.txt
# httpcore
# httpx
# requests
# unstructured-client
chardet==5.2.0
@@ -22,15 +28,27 @@ charset-normalizer==3.3.2
click==8.1.7
# via nltk
dataclasses-json==0.6.7
# via -r ./base.in
dataclasses-json-speakeasy==0.5.11
# via
# -r ./base.in
# unstructured-client
deepdiff==7.0.1
# via unstructured-client
emoji==2.12.1
# via -r ./base.in
exceptiongroup==1.2.1
# via anyio
filetype==1.2.0
# via -r ./base.in
h11==0.14.0
# via httpcore
httpcore==1.0.5
# via httpx
httpx==0.27.0
# via unstructured-client
idna==3.7
# via
# anyio
# httpx
# requests
# unstructured-client
joblib==1.4.2
@@ -44,23 +62,28 @@ lxml==5.2.2
marshmallow==3.21.3
# via
# dataclasses-json
# dataclasses-json-speakeasy
# unstructured-client
mypy-extensions==1.0.0
# via
# typing-inspect
# unstructured-client
nest-asyncio==1.6.0
# via unstructured-client
nltk==3.8.1
# via -r ./base.in
numpy==1.26.4
# via -r ./base.in
ordered-set==4.1.0
# via deepdiff
packaging==23.2
# via
# -c ././deps/constraints.txt
# marshmallow
# unstructured-client
psutil==6.0.0
# via -r ./base.in
pypdf==4.2.0
# via unstructured-client
python-dateutil==2.9.0.post0
# via unstructured-client
python-iso639==2024.4.27
@@ -74,12 +97,19 @@ regex==2024.5.15
requests==2.32.3
# via
# -r ./base.in
# requests-toolbelt
# unstructured-client
requests-toolbelt==1.0.0
# via unstructured-client
six==1.16.0
# via
# langdetect
# python-dateutil
# unstructured-client
sniffio==1.3.1
# via
# anyio
# httpx
soupsieve==2.5
# via beautifulsoup4
tabulate==0.9.0
@@ -92,14 +122,14 @@ typing-extensions==4.12.2
# via
# -r ./base.in
# emoji
# pypdf
# typing-inspect
# unstructured-client
typing-inspect==0.9.0
# via
# dataclasses-json
# dataclasses-json-speakeasy
# unstructured-client
unstructured-client==0.18.0
unstructured-client==0.23.8
# via
# -c ././deps/constraints.txt
# -r ./base.in
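
One way to see which pins in a compiled file trace back to `unstructured-client` is to read the `# via` annotations that pip-compile writes under each pin. The following is a minimal sketch and not part of this PR: `parse_via`, `SAMPLE`, and `pulled_in` are hypothetical names, and the sample text is a shortened excerpt of the base.txt lines above.

```python
# Sketch only: maps each pinned package to the entries listed in its "# via" annotation.
# parse_via and SAMPLE are illustrative names, not part of the repository.

SAMPLE = """\
httpcore==1.0.5
    # via httpx
httpx==0.27.0
    # via unstructured-client
idna==3.7
    # via
    #   anyio
    #   httpx
    #   requests
    #   unstructured-client
"""


def parse_via(compiled_text: str) -> dict[str, list[str]]:
    """Return {package: [requirers]} from pip-compile style output."""
    requirers: dict[str, list[str]] = {}
    current = None
    for raw in compiled_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("#"):
            # A pin line such as "httpx==0.27.0" starts a new entry.
            current = line.split("==")[0]
            requirers[current] = []
        elif current is not None:
            # Comment lines before the first pin (the file header) are ignored.
            entry = line.lstrip("#").strip()
            if entry == "via":
                continue  # multi-line form: requirers follow on later lines
            if entry.startswith("via "):
                entry = entry[len("via "):]  # single-line form: "# via httpx"
            requirers[current].append(entry)
    return requirers


result = parse_via(SAMPLE)
# Packages whose annotation names unstructured-client directly.
pulled_in = sorted(name for name, via in result.items() if "unstructured-client" in via)
print(pulled_in)
```

Only direct requirers are reported, so `httpcore` (required by `httpx`) does not appear in `pulled_in` even though it enters the file transitively.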
2 changes: 1 addition & 1 deletion requirements/deps/constraints.txt
@@ -49,7 +49,7 @@ urllib3<1.27
botocore<1.34.52

# NOTE(jennings): pinned due to later versions not supporting api_key_auth in UnstructuredClient
unstructured-client<=0.18.0
unstructured-client>=0.15.1

fsspec==2024.5.0
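
The effect of swapping the upper bound `<=0.18.0` for the lower bound `>=0.15.1` can be illustrated with a small comparison. This is a simplified sketch under stated assumptions: it handles only plain dotted versions such as `0.23.8`, whereas pip applies the full PEP 440 rules, and the helper names `parse`, `satisfies_old_pin`, and `satisfies_new_pin` are hypothetical.

```python
# Sketch only: simplified comparison for plain dotted versions like "0.23.8".
# Real resolvers implement the full PEP 440 specification.


def parse(version: str) -> tuple[int, ...]:
    """Turn '0.23.8' into (0, 23, 8) so that tuples compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies_old_pin(version: str) -> bool:
    # Previous constraint in constraints.txt: unstructured-client<=0.18.0
    return parse(version) <= parse("0.18.0")


def satisfies_new_pin(version: str) -> bool:
    # New constraint in constraints.txt: unstructured-client>=0.15.1
    return parse(version) >= parse("0.15.1")


for candidate in ("0.15.0", "0.18.0", "0.23.8"):
    print(candidate, satisfies_old_pin(candidate), satisfies_new_pin(candidate))
```

Under these assumptions, `0.23.8` (the version now resolved in base.txt) fails the old constraint and passes the new one, while `0.18.0` satisfies both.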

21 changes: 16 additions & 5 deletions requirements/dev.txt
@@ -7,6 +7,7 @@
anyio==3.7.1
# via
# -c ././deps/constraints.txt
# -c ./base.txt
# httpx
# jupyter-server
appnope==0.1.4
@@ -76,6 +77,7 @@ distlib==0.3.8
# via virtualenv
exceptiongroup==1.2.1
# via
# -c ./base.txt
# -c ./test.txt
# anyio
executing==2.0.1
@@ -87,11 +89,17 @@ filelock==3.15.4
fqdn==1.5.1
# via jsonschema
h11==0.14.0
# via httpcore
# via
# -c ./base.txt
# httpcore
httpcore==1.0.5
# via httpx
# via
# -c ./base.txt
# httpx
httpx==0.27.0
# via jupyterlab
# via
# -c ./base.txt
# jupyterlab
identify==2.5.36
# via pre-commit
idna==3.7
@@ -111,7 +119,7 @@ importlib-metadata==7.1.0
# jupyterlab
# jupyterlab-server
# nbconvert
ipykernel==6.29.4
ipykernel==6.29.5
# via
# jupyter
# jupyter-console
@@ -218,7 +226,9 @@ nbformat==5.10.4
# nbclient
# nbconvert
nest-asyncio==1.6.0
# via ipykernel
# via
# -c ./base.txt
# ipykernel
nodeenv==1.9.1
# via pre-commit
notebook==7.2.1
@@ -348,6 +358,7 @@ six==1.16.0
# rfc3339-validator
sniffio==1.3.1
# via
# -c ./base.txt
# anyio
# httpx
soupsieve==2.5
8 changes: 5 additions & 3 deletions requirements/extra-pdf-image.txt
@@ -48,7 +48,7 @@ fsspec==2024.5.0
# torch
google-api-core[grpc]==2.19.1
# via google-cloud-vision
google-auth==2.30.0
google-auth==2.31.0
# via
# google-api-core
# google-cloud-vision
@@ -164,7 +164,7 @@ pillow==10.4.0
# pytesseract
# torchvision
# unstructured-pytesseract
pillow-heif==0.16.0
pillow-heif==0.17.0
# via -r ./extra-pdf-image.in
portalocker==2.10.0
# via iopath
@@ -199,7 +199,9 @@ pyparsing==3.0.9
# -c ././deps/constraints.txt
# matplotlib
pypdf==4.2.0
# via -r ./extra-pdf-image.in
# via
# -c ./base.txt
# -r ./extra-pdf-image.in
pypdfium2==4.30.0
# via pdfplumber
pytesseract==0.3.10
4 changes: 2 additions & 2 deletions requirements/ingest/airtable.txt
@@ -23,9 +23,9 @@ inflection==0.5.1
# via pyairtable
pyairtable==2.3.3
# via -r ./ingest/airtable.in
pydantic==2.7.4
pydantic==2.8.0
# via pyairtable
pydantic-core==2.18.4
pydantic-core==2.20.0
# via pydantic
requests==2.32.3
# via
18 changes: 14 additions & 4 deletions requirements/ingest/astra.txt
@@ -6,6 +6,7 @@
#
anyio==3.7.1
# via
# -c ./ingest/../base.txt
# -c ./ingest/../deps/constraints.txt
# httpx
astrapy==1.3.1
@@ -34,19 +35,27 @@ click==8.1.7
deprecation==2.1.0
# via astrapy
exceptiongroup==1.2.1
# via anyio
# via
# -c ./ingest/../base.txt
# anyio
geomet==0.2.1.post1
# via cassandra-driver
h11==0.14.0
# via httpcore
# via
# -c ./ingest/../base.txt
# httpcore
h2==4.1.0
# via httpx
hpack==4.0.0
# via h2
httpcore==1.0.5
# via httpx
# via
# -c ./ingest/../base.txt
# httpx
httpx[http2]==0.27.0
# via astrapy
# via
# -c ./ingest/../base.txt
# astrapy
hyperframe==6.0.1
# via h2
idna==3.7
@@ -80,6 +89,7 @@ six==1.16.0
# python-dateutil
sniffio==1.3.1
# via
# -c ./ingest/../base.txt
# anyio
# httpx
toml==0.10.2
4 changes: 3 additions & 1 deletion requirements/ingest/box.txt
@@ -46,7 +46,9 @@ requests==2.32.3
# boxsdk
# requests-toolbelt
requests-toolbelt==1.0.0
# via boxsdk
# via
# -c ./ingest/../base.txt
# boxsdk
six==1.16.0
# via
# -c ./ingest/../base.txt
21 changes: 15 additions & 6 deletions requirements/ingest/chroma.txt
@@ -8,6 +8,7 @@ annotated-types==0.7.0
# via pydantic
anyio==3.7.1
# via
# -c ./ingest/../base.txt
# -c ./ingest/../deps/constraints.txt
# httpx
# starlette
@@ -52,7 +53,9 @@ deprecated==1.2.14
# opentelemetry-api
# opentelemetry-exporter-otlp-proto-grpc
exceptiongroup==1.2.1
# via anyio
# via
# -c ./ingest/../base.txt
# anyio
fastapi==0.110.3
# via chromadb
filelock==3.15.4
@@ -63,7 +66,7 @@ fsspec==2024.5.0
# via
# -c ./ingest/../deps/constraints.txt
# huggingface-hub
google-auth==2.30.0
google-auth==2.31.0
# via kubernetes
googleapis-common-protos==1.63.2
# via opentelemetry-exporter-otlp-proto-grpc
@@ -73,14 +76,19 @@ grpcio==1.64.1
# opentelemetry-exporter-otlp-proto-grpc
h11==0.14.0
# via
# -c ./ingest/../base.txt
# httpcore
# uvicorn
httpcore==1.0.5
# via httpx
# via
# -c ./ingest/../base.txt
# httpx
httptools==0.6.1
# via uvicorn
httpx==0.27.0
# via chromadb
# via
# -c ./ingest/../base.txt
# chromadb
huggingface-hub==0.23.4
# via tokenizers
humanfriendly==10.0
@@ -182,11 +190,11 @@ pyasn1==0.6.0
# rsa
pyasn1-modules==0.4.0
# via google-auth
pydantic==2.7.4
pydantic==2.8.0
# via
# chromadb
# fastapi
pydantic-core==2.18.4
pydantic-core==2.20.0
# via pydantic
pypika==0.48.9
# via chromadb
@@ -225,6 +233,7 @@ six==1.16.0
# python-dateutil
sniffio==1.3.1
# via
# -c ./ingest/../base.txt
# anyio
# httpx
starlette==0.37.2
2 changes: 1 addition & 1 deletion requirements/ingest/databricks-volumes.txt
@@ -17,7 +17,7 @@ charset-normalizer==3.3.2
# requests
databricks-sdk==0.29.0
# via -r ./ingest/databricks-volumes.in
google-auth==2.30.0
google-auth==2.31.0
# via databricks-sdk
idna==3.7
# via