GCP: fix single byte read in GCSInputStream #8071

Merged: danielcweeks merged 2 commits into apache:master on Jul 15, 2023

@bryanck bryanck commented Jul 15, 2023

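This PR fixes the byte-to-int conversion in the single-byte read() of GCSInputStream. The method returned the raw signed byte, so any byte value of 0x80 or above came back as a negative int instead of a value in the range 0 to 255. Decoding integers in Parquet was throwing an exception because of this, in this method.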
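To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use the real GCSInputStream; OneByteStream, firstByte, and the mask flag are hypothetical names made up for this example, and only the one-byte staging buffer mirrors the diff. It shows that returning a byte directly from read() sign-extends it, so 0xC8 is reported as -56, and 0xFF is reported as -1, which callers interpret as end of stream.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

class SingleByteReadSketch {
  // Hypothetical stand-in for a stream that stages each byte in a one-byte buffer.
  static class OneByteStream extends InputStream {
    private final byte[] data;
    private final boolean mask; // true = apply the fix from this PR
    private final ByteBuffer singleByteBuffer = ByteBuffer.wrap(new byte[1]);
    private int pos = 0;

    OneByteStream(byte[] data, boolean mask) {
      this.data = data;
      this.mask = mask;
    }

    @Override
    public int read() {
      if (pos >= data.length) {
        return -1; // end of stream, per the InputStream contract
      }
      singleByteBuffer.array()[0] = data[pos++];
      byte b = singleByteBuffer.array()[0];
      // Without the mask, the byte is widened to int with sign extension.
      return mask ? (b & 0xFF) : b;
    }
  }

  // Reads the first byte of a one-byte stream, with or without the mask.
  static int firstByte(byte value, boolean mask) {
    return new OneByteStream(new byte[] {value}, mask).read();
  }

  public static void main(String[] args) {
    System.out.println(firstByte((byte) 0xC8, false)); // -56 (buggy)
    System.out.println(firstByte((byte) 0xC8, true));  // 200 (fixed)
    System.out.println(firstByte((byte) 0xFF, false)); // -1, indistinguishable from EOF
    System.out.println(firstByte((byte) 0xFF, true));  // 255
  }
}
```

A decoder that assembles integers from individual read() calls would see either corrupted values or a premature end of stream with the unmasked version, which is consistent with the exception described above, though the exact Parquet code path is not reproduced here.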

@github-actions github-actions bot added the GCP label Jul 15, 2023
@@ -117,7 +117,7 @@ public int read() throws IOException {
     readBytes.increment();
     readOperations.increment();

-    return singleByteBuffer.array()[0];
+    return singleByteBuffer.array()[0] & 0xFF;
Should we use Byte.toUnsignedInt() here to be more clear?
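For reference, the two spellings should be interchangeable in behavior. The sketch below (class and method names are hypothetical, chosen for this example) checks that b & 0xFF and Byte.toUnsignedInt(b) agree for every possible byte value, so the choice between them is about readability rather than correctness.

```java
class UnsignedByteEquivalence {
  // Returns true if masking and Byte.toUnsignedInt agree for all 256 byte values.
  static boolean sameForAllBytes() {
    for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
      byte b = (byte) i;
      if ((b & 0xFF) != Byte.toUnsignedInt(b)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(sameForAllBytes());
    System.out.println(Byte.toUnsignedInt((byte) 0xFF));
  }
}
```

Byte.toUnsignedInt has been available since Java 8, so it is an option wherever that baseline is acceptable.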

+1

+1

@danielcweeks danielcweeks merged commit c27c1fb into apache:master Jul 15, 2023
nastra pushed a commit to nastra/iceberg that referenced this pull request Aug 15, 2023
* Spark: Update antlr4 to match Spark 3.4 (apache#7824)

* Parquet: Revert workaround for resource usage with zstd (apache#7834)

* GCP: fix single byte read in GCSInputStream (apache#8071)

* GCP: fix byte read in GCSInputStream

* add test

* Parquet: Cache codecs by name and level (apache#8182)

* GCP: Add prefix and bulk operations to GCSFileIO (apache#8168)

* AWS, GCS: Allow access to underlying storage client (apache#8208)

* spotless
@dchristle
@bryanck Thank you for fixing this! It stymied my efforts to try to use GCSFileIO for several weeks.
