Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization error with S3Config's credential_provider #3322

Closed
desmondcheongzx opened this issue Nov 18, 2024 · 4 comments
Closed

Serialization error with S3Config's credential_provider #3322

desmondcheongzx opened this issue Nov 18, 2024 · 4 comments
Assignees
Labels
bug Something isn't working p1 Important to tackle soon, but preemptable by p0

Comments

@desmondcheongzx
Copy link
Contributor

Describe the bug

In https://dist-data.slack.com/archives/C041NA2RBFD/p1731789019058459, a user encountered a serialization error when using a callable credential provider for S3.

Code:

def get_credentials() -> daft.io.S3Credentials:
    session = boto3.Session()
    creds = session.get_credentials()
    return daft.io.S3Credentials(
        key_id=creds.access_key,
        access_key=creds.secret_key,
        session_token=creds.token,
        expiry=datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1),
    )


s3_config = S3Config(
    credentials_provider=get_credentials,
)

io_config = IOConfig(s3=s3_config)

source = daft.read_parquet(
    "s3://noetik-datalake-prod/branches/main/warehouse/curated__cosmx_cells/*",
    io_config=io_config,
)

io_config = IOConfig(s3=s3_config)

source = daft.read_parquet(
    "s3://noetik-datalake-prod/branches/main/warehouse/curated__cosmx_cells/*",
    io_config=io_config,
)

Error:

(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337) thread '<unnamed>' panicked at src/daft-scan/src/python.rs:480:5: [repeated 30x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337) pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Custom("AttributeError: type object 'S3Credentials' has no attribute 'expiry'") [repeated 90x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace [repeated 30x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337) Traceback (most recent call last): [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "python/ray/_raylet.pyx", line 2254, in ray._raylet.task_execution_handler [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "python/ray/_raylet.pyx", line 2150, in ray._raylet.execute_task_with_cancellation_handler [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "python/ray/_raylet.pyx", line 1839, in ray._raylet.execute_task [repeated 300x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects [repeated 120x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)     return context.deserialize_objects(data_metadata_pairs, object_refs) [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)     obj = self._deserialize_object(data, metadata, object_ref) [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 317, in _deserialize_object [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)     return self._deserialize_msgpack_data(data, metadata_fields) [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 272, in _deserialize_msgpack_data [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)     python_objects = self._deserialize_pickle5_data(pickle5_data) [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 262, in _deserialize_pickle5_data [repeated 60x across cluster]
(ScanWithTask-Project-FanoutHash [Stage:2] pid=3867337)     obj = pickle.loads(in_band) [repeated 60x across cluster]

S3Credentials seems to be missing the expiry attribute, which is optional for the user to provide. Either in some constructor, or in some serde path, we're dropping this.

To Reproduce

No response

Expected behavior

No response

Component(s)

Other

Additional context

No response

@desmondcheongzx desmondcheongzx added bug Something isn't working p1 Important to tackle soon, but preemptable by p0 labels Nov 18, 2024
@tvanhens
Copy link

If I remove the expiry key I get a new error:

(ScanWithTask-LocalLimit [Stage:2] pid=359147) thread '<unnamed>' panicked at src/daft-scan/src/python.rs:455:5:
(ScanWithTask-LocalLimit [Stage:2] pid=359147) called `Result::unwrap()` on an `Err` value: Custom("AttributeError: type object 'S3Credentials' has no attribute 'access_key'")
(ScanWithTask-LocalLimit [Stage:2] pid=359147) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(ScanWithTask-LocalLimit [Stage:2] pid=359147) Traceback (most recent call last):
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 2254, in ray._raylet.task_execution_handler
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 2150, in ray._raylet.execute_task_with_cancellation_handler
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1805, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1806, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1809, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1837, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1839, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 841, in deserialize_objects
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     return context.deserialize_objects(data_metadata_pairs, object_refs)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     obj = self._deserialize_object(data, metadata, object_ref)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 317, in _deserialize_object
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     return self._deserialize_msgpack_data(data, metadata_fields)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 272, in _deserialize_msgpack_data
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 262, in _deserialize_pickle5_data
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     obj = pickle.loads(in_band)
(ScanWithTask-LocalLimit [Stage:2] pid=359147) pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Custom("AttributeError: type object 'S3Credentials' has no attribute 'access_key'")
(ScanWithTask-LocalLimit [Stage:2] pid=359147) Exception ignored in: 'ray._raylet.task_execution_handler'
(ScanWithTask-LocalLimit [Stage:2] pid=359147) Traceback (most recent call last):
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 2254, in ray._raylet.task_execution_handler
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 2150, in ray._raylet.execute_task_with_cancellation_handler
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1805, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1806, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1809, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1837, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "python/ray/_raylet.pyx", line 1839, in ray._raylet.execute_task
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 841, in deserialize_objects
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     return context.deserialize_objects(data_metadata_pairs, object_refs)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     obj = self._deserialize_object(data, metadata, object_ref)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 317, in _deserialize_object
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     return self._deserialize_msgpack_data(data, metadata_fields)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 272, in _deserialize_msgpack_data
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(ScanWithTask-LocalLimit [Stage:2] pid=359147)   File "/home/noetik/twm/.venv/lib/python3.10/site-packages/ray/_private/serialization.py", line 262, in _deserialize_pickle5_data
(ScanWithTask-LocalLimit [Stage:2] pid=359147)     obj = pickle.loads(in_band)
(ScanWithTask-LocalLimit [Stage:2] pid=359147) pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Custom("AttributeError: type object 'S3Credentials' has no attribute 'access_key'")
(ScanWithTask-LocalLimit [Stage:2] pid=359147) [2024-11-18 23:00:04,425 C 359147 359147] task_receiver.cc:213:  Check failed: objects_valid

@tvanhens
Copy link

It seems like something might be amiss with the S3Credentials object itself. It looks like it has no session_token property. When I try printing the resolved values for access key and secret key, they print fine. When I try to access the session token like so:

def get_credentials() -> daft.io.S3Credentials:
    session = boto3.Session()
    creds = session.get_credentials()
    return daft.io.S3Credentials(
        key_id=creds.access_key,
        access_key=creds.secret_key,
        session_token=creds.token,
        expiry=(datetime.datetime.now() + datetime.timedelta(seconds=1)),
    )

credentials = get_credentials()
print(credentials.session_token)

I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File /home/noetik/twm/src/twm/data/ray/cosmx.py:71
     [65](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:65)     return daft.read_parquet(
     [66](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:66)         "s3://noetik-datalake-prod/branches/main/warehouse/curated__cosmx_cells/*",
     [67](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:67)         io_config=io_config,
     [68](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:68)     )
     [70](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:70) setup_ray()
---> [71](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:71) source = read_cosmx_cells()
     [73](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:73) source.show()

File /home/noetik/twm/src/twm/data/ray/cosmx.py:56
     [54](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:54) def read_cosmx_cells() -> daft.DataFrame:
     [55](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:55)     credentials = get_credentials()
---> [56](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:56)     print(credentials.session_token)
     [57](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:57)     s3_config = daft.io.S3Config(
     [58](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:58)         # Waiting on: https://github.com/Eventual-Inc/Daft/issues/3322
     [59](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:59)         credentials_provider=get_credentials,
   (...)
     [62](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:62)         # session_token=credentials.session_token,
     [63](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:63)     )
     [64](https://vscode-remote+ssh-002dremote-002bcoder-002dvscode-002ecoder-002enoetikdev-002ecom-002d-002dtvanhens-002d-002dohio-002dtest-002d-002ddev.vscode-resource.vscode-cdn.net/home/noetik/twm/src/twm/data/ray/cosmx.py:64)     io_config = daft.io.IOConfig(s3=s3_config)

AttributeError: 'builtins.S3Credentials' object has no attribute 'session_token'

@kevinzwang
Copy link
Member

Hey @tvanhens , thank you for your patience.

To follow up from Slack, I have a PR out that will fix the Ray serialization issue for writes (#3339). However, S3Config.credentials_provider will not work yet since it is not yet being used in our writers. Will make another PR to fix that. Once both are in, I'd expect your get_credentials() workflow to work for writing to S3.

@kevinzwang
Copy link
Member

Since the serialization fixes have been merged into main, I created a new issue to track S3 credentials provider for writes: #3367

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p1 Important to tackle soon, but preemptable by p0
Projects
None yet
Development

No branches or pull requests

3 participants