[BUG] Manually specify region in tutorial read_json (#1608)
The change in #1592 introduced a bug: the inferred PyArrow
S3FileSystem doesn't correctly detect the region of the bucket.

To work around this, we can manually specify `region_name` in our
IOConfig. This problem should become less common as we move towards
all-native reads.
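
A minimal sketch of the workaround, assuming the same public bucket and `us-west-2` region used in the tutorial. The helper names `s3_config_kwargs` and `read_sample` are hypothetical and exist only for illustration; the `daft.io.IOConfig`, `daft.io.S3Config`, and `daft.read_json` calls mirror the ones in the changed notebook cell.

```python
from typing import Optional


def s3_config_kwargs(region_name: Optional[str] = None, anonymous: bool = True) -> dict:
    """Collect S3Config keyword arguments (hypothetical helper).

    region_name is only included when the caller pins it explicitly,
    so the default behaviour (region inference) is left untouched otherwise.
    """
    kwargs = {"anonymous": anonymous}
    if region_name is not None:
        kwargs["region_name"] = region_name
    return kwargs


def read_sample(path: str, region_name: Optional[str] = None):
    """Read a JSON Lines file from S3 with an optionally pinned region (hypothetical helper)."""
    import daft  # imported lazily so s3_config_kwargs can be used without daft installed

    io_config = daft.io.IOConfig(s3=daft.io.S3Config(**s3_config_kwargs(region_name)))
    return daft.read_json(path, io_config=io_config)


# Example call (needs daft installed and network access to the public bucket):
# df = read_sample(
#     "s3://daft-public-data/redpajama-1t-sample/stackexchange_sample.jsonl",
#     region_name="us-west-2",
# )
```

Passing the region explicitly sidesteps the inference step that the commit message identifies as the source of the bug; a bucket in another region would need its own `region_name` value.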

Co-authored-by: Jay Chia <[email protected]@users.noreply.github.com>
jaychia and Jay Chia authored Nov 15, 2023
1 parent 56cd1d6 commit c4b498a
Showing 1 changed file with 1 addition and 1 deletion.
@@ -103,7 +103,7 @@
 "import daft\n",
 "\n",
 "SAMPLE_DATA_PATH = \"s3://daft-public-data/redpajama-1t-sample/stackexchange_sample.jsonl\"\n",
-"IO_CONFIG = daft.io.IOConfig(s3=daft.io.S3Config(anonymous=True)) # Use anonymous-mode for accessing AWS S3\n",
+"IO_CONFIG = daft.io.IOConfig(s3=daft.io.S3Config(anonymous=True, region_name=\"us-west-2\")) # Use anonymous-mode for accessing AWS S3\n",
 "\n",
 "df = daft.read_json(SAMPLE_DATA_PATH, io_config=IO_CONFIG)\n",
 "\n",