Cloud provider: GCS for storage, but running locally
OS: MacOS
Other:
Bug
What happened:
I have a public GCS bucket with a bunch of Delta Lake tables. The bucket grants viewer access to allUsers, meaning unauthenticated users can read it. You can easily test this with pandas or other libraries (I hit this with Ibis):
[ins] In [1]: import gcsfs
[ins] In [2]: import pandas as pd
[ins] In [3]: from deltalake import DeltaTable
[nav] In [4]: pd.read_parquet("gs://ibis-analytics/penguins.parquet", storage_options={"token": "anon"})
Out[4]:
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0       Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1       Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2       Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
3       Adelie  Torgersen             NaN            NaN                NaN          NaN    None  2007
4       Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007
..         ...        ...             ...            ...                ...          ...     ...   ...
339  Chinstrap      Dream            55.8           19.8              207.0       4000.0    male  2009
340  Chinstrap      Dream            43.5           18.1              202.0       3400.0  female  2009
341  Chinstrap      Dream            49.6           18.2              193.0       3775.0    male  2009
342  Chinstrap      Dream            50.8           19.0              210.0       4100.0    male  2009
343  Chinstrap      Dream            50.2           18.7              198.0       3775.0  female  2009

[344 rows x 8 columns]
[nav] In [5]: pd.read_csv("gs://ibis-analytics/penguins.csv", storage_options={"token": "anon"})
Out[5]:
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0       Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1       Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2       Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
3       Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN  2007
4       Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007
..         ...        ...             ...            ...                ...          ...     ...   ...
339  Chinstrap      Dream            55.8           19.8              207.0       4000.0    male  2009
340  Chinstrap      Dream            43.5           18.1              202.0       3400.0  female  2009
341  Chinstrap      Dream            49.6           18.2              193.0       3775.0    male  2009
342  Chinstrap      Dream            50.8           19.0              210.0       4100.0    male  2009
343  Chinstrap      Dream            50.2           18.7              198.0       3775.0  female  2009

[344 rows x 8 columns]
But trying to read a Delta Lake table from the same bucket, apparently when not authenticated with GCP, results in an error:
[ins] In [6]: DeltaTable("gs://ibis-analytics/penguins.delta", storage_options={"token": "anon"})
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[6], line 1
----> 1 DeltaTable("gs://ibis-analytics/penguins.delta", storage_options={"token": "anon"})

File ~/repos/ibis-analytics/.venv/lib/python3.12/site-packages/deltalake/table.py:380, in DeltaTable.__init__(self, table_uri, version, storage_options, without_files, log_buffer_size)
    360 """
    361 Create the Delta Table from a path with an optional version.
    362 Multiple StorageBackends are currently supported: AWS S3, Azure Data Lake Storage Gen2, Google Cloud Storage (GCS) and local URI.
   (...)
    377
    378 """
    379 self._storage_options = storage_options
--> 380 self._table = RawDeltaTable(
    381     str(table_uri),
    382     version=version,
    383     storage_options=storage_options,
    384     without_files=without_files,
    385     log_buffer_size=log_buffer_size,
    386 )

OSError: Generic GCS error: Error performing token request: Error after 10 retries in 8.200125916s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token?audience=https%3A%2F%2Fwww.googleapis.com%2Foauth2%2Fv4%2Ftoken)
There's not a lot in the stack trace to go on.
What you expected to happen:
The DeltaTable call above succeeds without credentials, just like the pandas reads.
How to reproduce it:
You can try it out on the bucket noted above: gs://ibis-analytics has penguins.csv, penguins.parquet, and penguins.delta in it
More details:
This was reproduced by others as well.
We simply use the object_store crate in Rust. If it's not working, it's because anon is not a supported config. You should try asking about this in arrow-rs, where object_store belongs.
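In the meantime, one possible workaround is to bypass deltalake for public tables and replay the transaction log yourself, since gcsfs and pandas already accept token="anon". The following is only a rough sketch, not what delta-rs does internally. It assumes every JSON commit is still present under _delta_log (no reliance on checkpoints), and it ignores deletion vectors, column mapping, and URL-encoded paths. The names active_files and read_public_delta are hypothetical helpers made up for illustration.

```python
import json
from typing import Iterable


def active_files(commits: Iterable[str]) -> list[str]:
    """Replay Delta commit files (newline-delimited JSON), oldest first,
    and return the relative paths of data files still part of the table."""
    live: dict[str, None] = {}  # a dict preserves insertion order
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live[action["add"]["path"]] = None
            elif "remove" in action:
                live.pop(action["remove"]["path"], None)
    return list(live)


def read_public_delta(table_uri: str):
    """Hypothetical helper: read a public Delta table on GCS anonymously."""
    # Imported lazily so active_files stays usable without these packages.
    import gcsfs
    import pandas as pd

    fs = gcsfs.GCSFileSystem(token="anon")
    # Commit files are zero-padded, so a lexical sort is also version order.
    log_files = sorted(
        p for p in fs.ls(f"{table_uri}/_delta_log", detail=False)
        if p.endswith(".json")
    )
    commits = [fs.cat(p).decode("utf-8") for p in log_files]
    frames = [
        pd.read_parquet(f"{table_uri}/{path}", storage_options={"token": "anon"})
        for path in active_files(commits)
    ]
    return pd.concat(frames, ignore_index=True)
```

With the bucket from this issue, the call would be read_public_delta("gs://ibis-analytics/penguins.delta"). It returns a pandas DataFrame rather than a DeltaTable, so it only helps for plain reads.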
Environment
Delta-rs version:
deltalake==0.19.2
Binding: Python
Environment: