bigquery: rowIterator.Next calls hangs forever possibly because of not having bigquery.readsessions.getData permission #8660

Closed
k-anshul opened this issue Oct 9, 2023 · 1 comment
Assignees
Labels
api: bigquery Issues related to the BigQuery API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

k-anshul commented Oct 9, 2023

Client
bigquery - main

Environment
macOS arm64

Go Environment
go version go1.21.0 darwin/arm64

Expected behavior
The client should return a permission-denied error (or similar).

Actual behavior
The client hangs forever.

Additional context
I used a debugger to find where it gets stuck. These are my findings.
The issue only happens when the SDK fetches the results using Arrow streams.
In storage_iterator.go, the processStream function first creates a session and then reads rows with the it.session.readRows call. It checks whether the error is eligible for retry, but oddly no error is returned at this point. It then consumes the stream with the it.consumeRowStream call. The error is returned here, but there is no retry classification logic at this step, so it retries on every error except context cancellation.
Here is the error message that I see:

error(*fmt.wrapError) *{msg: "failed to consume rows on stream xxx: rpc error: code = PermissionDenied desc = there was an error operating on 'xxxx': the user does not have 'bigquery.readsessions.getData' permission for 'xxxxx'", err: error(*google.golang.org/grpc/internal/status.Error) *{s: *(*"google.golang.org/grpc/internal/status.Status")(0x140011ba088)}}
@k-anshul k-anshul added the triage me I really want to be triaged. label Oct 9, 2023
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Oct 9, 2023
@k-anshul k-anshul changed the title bigquery: rowIterator.Next calls in processStream possible because of not having bigquery.readsessions.getData permission bigquery: rowIterator.Next calls hangs forever possibly because of not having bigquery.readsessions.getData permission Oct 9, 2023
@shollyman shollyman assigned alvarowolfx and unassigned shollyman Oct 9, 2023
@alvarowolfx alvarowolfx added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed triage me I really want to be triaged. labels Oct 9, 2023
alvarowolfx (Contributor) commented

I'm investigating this issue, @k-anshul. Oddly, the PermissionDenied error should have been returned by it.session.readRows, not by consumeRowStream > rowStream.Recv. I'll keep you updated as I learn more.

gcf-merge-on-green bot pushed a commit that referenced this issue Oct 12, 2023
The initial bug was found when the Storage Read API is called with a more restrictive IAM role, which can leave a user able to create a ReadSession but not read from it (missing the `bigquery.readsessions.getData` permission). This makes the process of reading the `read_streams` enter a retry loop, because errors coming from the `Recv` calls are not handled properly; only errors from the `ReadRows` call are. This PR fixes this behavior.

Reported in #8660 and tested locally by creating a custom role with the following configuration:

![image](https://togithub.com/googleapis/google-cloud-go/assets/1615543/b6dfdecf-5bb0-497f-8fcb-df8a8bdf1e3b)

Example of error:
```
failed to fetch via storage API: failed to read rows on stream projects/xxx/locations/us/sessions/yyy/streams/zzz: failed to consume rows on stream projects/xxx/locations/us/sessions/yyy/streams/zzz: rpc error: code = PermissionDenied desc = there was an error operating on 'projects/xxx/locations/us/sessions/yyy/streams/zzz': the user does not have 'bigquery.readsessions.getData' permission for 'projects/xxx/locations/us/sessions/yyy/streams/zzz
```

With the fix in this PR, processing of the stream now stops and errors can be returned (like the PERMISSION_DENIED error in this scenario).