From 3b4794a8fa4fa25bcc860d7c3250fe9a40e1b79e Mon Sep 17 00:00:00 2001
From: Roman Isecke <136338424+rbiseck3@users.noreply.github.com>
Date: Fri, 20 Oct 2023 13:21:06 -0400
Subject: [PATCH] bugfix/mapping source connectors in destination cli commands (#1788)

Due to the dynamic nature of how the source connector is called when a destination command is invoked, the configs need to be mapped and the fsspec config needs to be dynamically added based on the type of runner being used. This code was added to all currently supported destination commands.
---
 CHANGELOG.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index aed8fa1875..5170ce2607 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -32,11 +32,9 @@ ocr agent tesseract/paddle in environment variable `OCR_AGENT` for OCRing the en
 * **Fix metrics folder not discoverable** Fixes issue where unstructured/metrics folder is not discoverable on PyPI by adding an `__init__.py` file under the folder.
 * **Fix a bug when `parition_pdf` get `model_name=None`** In API usage the `model_name` value is `None` and the `cast` function in `partition_pdf` would return `None` and lead to attribution error. Now we use `str` function to explicit convert the content to string so it is garanteed to have `starts_with` and other string functions as attributes
 * **Fix html partition fail on tables without `tbody` tag** HTML tables may sometimes just contain headers without body (`tbody` tag)
-<<<<<<< HEAD
 * **Fix out-of-order sequencing of split chunks.** Fixes behavior where "split" chunks were inserted at the beginning of the chunk sequence. This would produce a chunk sequence like [5a, 5b, 3a, 3b, 1, 2, 4] when sections 3 and 5 exceeded `max_characters`.
-=======
 * **Deserialization of ingest docs fixed** When ingest docs are being deserialized as part of the ingest pipeline process (cli), there were certain fields that weren't getting persisted (metadata and date processed). 
 The from_dict method was updated to take these into account and a unit test added to check.
->>>>>>> 9c37e516 (update changelog)
+* **Map source cli command configs when destination set** Due to how the source connector is dynamically called when the destination connector is set via the CLI, the configs were being set incorrectly, causing the source connector to break. The configs were fixed and updated to take into account Fsspec-specific connectors.
 
 ## 0.10.24