fix: SQL stager and Snowflake uploader #341

Open
wants to merge 6 commits into
base: main

@mpolomdeepsense (Contributor) commented Jan 17, 2025

SQL stager error

When output_filename was passed without a suffix, the SQL stager raised an "Unsupported output format" error. The fix appends a .json suffix if the filename doesn't already have one.

NOTE: Originally discovered in the stager plugin, which passes the output filename without a suffix.

controller   | handle: <Handle JobInvoker.process_in_pool.<locals>._done_callback(<Task finishe...1cb266fc081')>) at /home/controller/invoker/job_invoker.py:480>
controller   | Traceback (most recent call last):
controller   |   File "/usr/lib/python3.12/asyncio/events.py", line 88, in _run
controller   |     self._context.run(self._callback, *self._args)
controller   |   File "/home/controller/invoker/job_invoker.py", line 484, in _done_callback
controller   |     fut.result()
controller   |   File "/home/controller/invoker/job_invoker.py", line 457, in process_record
controller   |     raise e
controller   |   File "/home/controller/invoker/job_invoker.py", line 451, in process_record
controller   |     await self._process_record(record=record)
controller   |   File "/home/controller/invoker/job_invoker.py", line 421, in _process_record
controller   |     res = await self.invoke_plugin(inputs=job_inputs, record_ids=[record_id])
controller   |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
controller   |   File "/home/controller/invoker/job_invoker.py", line 203, in invoke_plugin
controller   |     raise PluginError(
controller   | controller.invoker.base_invoker.PluginError: PluginError: failed to invoke plugin: [ValueError] Unsupported output format: /home/data/stager/71cb266fc081
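For reviewers, the suffix handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name ensure_json_suffix is hypothetical, and it assumes that any existing suffix should be left untouched.

```python
from pathlib import Path


def ensure_json_suffix(output_filename: str) -> str:
    """Hypothetical helper: append '.json' only when the name has no suffix."""
    path = Path(output_filename)
    if path.suffix:
        # A suffix is already present, so the name is returned unchanged.
        return output_filename
    # No suffix (as in the stager plugin case): default to JSON.
    return str(path.with_suffix(".json"))
```

With this sketch, a path such as the one in the traceback would get .json appended, while names like out.json or out.ndjson would pass through unchanged.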

Snowflake uploader error

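An unexpected columns keyword argument was passed to the _fit_to_schema method from inside the SnowflakeUploader upload_dataframe method. The inherited SQLUploader._fit_to_schema does not accept that argument, so the call raised a TypeError.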

2025-01-17 13:34:21,438 SpawnPoolWorker-11 ERROR    Exception raised while running upload
Traceback (most recent call last):
  File "/home/marek/unstructured/unstructured-ingest/unstructured_ingest/v2/pipeline/interfaces.py", line 171, in run_async
    return await self._run_async(fn=fn, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/marek/unstructured/unstructured-ingest/unstructured_ingest/v2/pipeline/steps/upload.py", line 53, in _run_async
    fn(**fn_kwargs)
  File "/home/marek/unstructured/unstructured-ingest/unstructured_ingest/v2/processes/connectors/sql/sql.py", line 451, in run
    self.upload_dataframe(df=df, file_data=file_data)
  File "/home/marek/unstructured/unstructured-ingest/unstructured_ingest/v2/processes/connectors/sql/snowflake.py", line 173, in upload_dataframe
    self._fit_to_schema(df=df, columns=self.get_table_columns())
TypeError: SQLUploader._fit_to_schema() got an unexpected keyword argument 'columns'
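The failure mode can be reproduced in isolation with a small sketch. The classes below are hypothetical stand-ins, not the real SQLUploader or SnowflakeUploader. The sketch assumes the base method looks up the table columns itself, which is one plausible reading of the traceback, and it uses plain lists of dicts instead of a DataFrame to stay self-contained.

```python
class BaseUploader:
    """Hypothetical stand-in for the base uploader; not the library's class."""

    def get_table_columns(self) -> list[str]:
        return ["id", "text"]

    def _fit_to_schema(self, rows: list[dict]) -> list[dict]:
        # Assumption: the base method fetches the columns on its own,
        # so callers pass only the data.
        columns = self.get_table_columns()
        return [{c: row.get(c) for c in columns} for row in rows]


class BrokenUploader(BaseUploader):
    def upload(self, rows: list[dict]) -> list[dict]:
        # Mirrors the failing call: 'columns' is not a parameter of the
        # inherited method, so Python raises TypeError.
        return self._fit_to_schema(rows=rows, columns=self.get_table_columns())


class FixedUploader(BaseUploader):
    def upload(self, rows: list[dict]) -> list[dict]:
        # One possible fix: drop the extra keyword argument.
        return self._fit_to_schema(rows=rows)
```

Calling BrokenUploader().upload(...) raises a TypeError about the unexpected columns keyword, while FixedUploader().upload(...) returns rows restricted to the table columns.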
