Describe the bug
When we run the ingest pipeline with chunking options and add another pipeline node after chunking (such as embeddings), the element data does not persist to the next pipeline node.
To Reproduce
Run unstructured/examples/ingest/s3-small-batch/ingest.sh with additional chunking and embedding CLI params.
Expected behavior
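Embedding outputs should contain the chunked elements together with their embeddings. Actual behavior: embedding outputs are empty, and when the elements output of chunking is re-read into memory (to pass it on to the embedding step), it is an empty list.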
Environment Info
Base requirements and s3 requirements
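To make the failure mode concrete, here is a minimal, stdlib-only sketch of two pipeline nodes that hand data to each other through a JSON file on disk. It is not the `unstructured` ingest code: `chunk_node`, `embed_node`, the `max_chars` parameter, and the placeholder embedding are all hypothetical names and behavior invented for illustration. It only shows the kind of guard that would surface this bug early, by failing loudly when the re-read chunk output is an empty list.

```python
import json
import tempfile
from pathlib import Path


def chunk_node(elements, out_path, max_chars=20):
    """Hypothetical chunking node: greedily merge element texts and persist the chunks as JSON."""
    chunks, current = [], ""
    for el in elements:
        text = el["text"]
        # Start a new chunk when adding this text would exceed the size limit.
        if current and len(current) + 1 + len(text) > max_chars:
            chunks.append({"type": "CompositeElement", "text": current})
            current = text
        else:
            current = f"{current} {text}".strip()
    if current:
        chunks.append({"type": "CompositeElement", "text": current})
    out_path.write_text(json.dumps(chunks))
    return out_path


def embed_node(in_path):
    """Hypothetical embedding node: re-read the persisted chunks and attach a placeholder vector."""
    elements = json.loads(in_path.read_text())
    if not elements:
        # This is the symptom described in the report: nothing survived the hand-off.
        raise ValueError(f"no elements found in {in_path}; upstream node persisted nothing")
    for el in elements:
        # Placeholder "embedding" (text length); a real node would call an embedding model.
        el["embeddings"] = [float(len(el["text"]))]
    return elements


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = chunk_node(
            [{"text": "Hello world"}, {"text": "Second sentence"}, {"text": "Third"}],
            Path(tmp) / "chunked.json",
        )
        for el in embed_node(path):
            print(el)
```

A check like the one in `embed_node` turns a silently empty embedding output into an explicit error that points at the node whose output was lost.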
ahmetmeleq changed the title from "bug/ingest-pipeline-with-chunking" to "bug/ingest pipeline with chunking and embedding does not persist data to the embedding step" on Oct 27, 2023
ahmetmeleq changed the title from "bug/ingest pipeline with chunking and embedding does not persist data to the embedding step" to "bug: ingest pipeline with chunking and embedding does not persist data to the embedding step" on Oct 27, 2023
Closes #1414
Closes #2039
This PR:
- Uses Pinecone python cli to implement a destination connector for Pinecone and provides the ingest readme requirements [(here)](https://github.com/Unstructured-IO/unstructured/tree/main/unstructured/ingest#the-checklist) for the connector
- Updates documentation for the s3 destination connector
- Alphabetically sorts setup.py contents
- Updates logs for the chunking node in ingest pipeline
- Adds a baseline session handle implementation for destination connectors, to be able to parallelize their operations
- For the [bug](#1892) related to persisting element data to ingest embedding nodes; this PR tests the [solution](#1893) with its ingest test
- Solves a bug on ingest chunking params with [bugfix on chunking params and implementing related test](69e1949)
---------
Co-authored-by: Roman Isecke <[email protected]>