error fetching chunks & failed to get s3 object #5221
Is this a gRPC timeout? The warning comes from vendor/github.com/grafana/dskit/grpcclient/grpcclient.go:98:

level=warn ts=2022-01-24T10:57:05.273457211Z caller=grpc_logging.go:55 method=/logproto.Querier/Query duration=16.524312361s err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled" msg="gRPC\n"
Same issue here ✋
Sadly I have the same issue. Any updates yet?
Make queries more cache-able by aligning them with their step intervals? Anything else?
Same here ✋
Same here
Hey guys, I have the same issue.
Same error with a GCS backend.
This error is an expected log. When the query limit is 1000, the context is automatically canceled once 1000 log lines have been fetched.
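For readers asking where that limit can be changed: below is a minimal sketch of the relevant limits_config section, assuming a recent Loki version where the per-query line limit is `max_entries_limit_per_query` (the key name and default should be verified against the docs for the version in use; the 1000 limit can also come from the client, e.g. Grafana's line limit):

```yaml
# Sketch only: raise the per-query line limit (verify the option name for your Loki version).
limits_config:
  max_entries_limit_per_query: 5000
```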
Same issue here! Any update?
Same issue here!
Do you have any reference for this? While it makes sense, I would expect not to get an error in this case when it is expected behaviour.
Same error with a GCS backend.
Same error with a GCS backend as well.
Same issue using S3.
Same issue with a GCS backend as well. Is there any setting we can set to override the default 1000 limit?
Same issue with the Azure blob store as well.
So, there are probably a few hundred reasons why this error can show up. TL;DR: my network config was such that my EC2 instance needed the PUT response hop limit increased, and unfortunately the HTTP request sent by the golang aws-sdk to the metadata service was failing silently.

Figuring this out was quite the chore. The logging here was so obtuse, even with debug enabled, that I wound up forking and customizing Loki to add a bunch more logging. Mostly this just helped me determine that my configuration was being correctly applied and the S3 operations were being attempted as expected. Additionally, I verified that my EC2 instance and the Kubernetes pods running on it could all access the S3 bucket in question (tested via curl and the aws CLI). I also verified that they could perform the relevant S3 operations.

The critical diagnostic step for me was to alter the aws-sdk's S3 configuration to have it output debug logs as well (see s3_storage_client.go):

s3Config = s3Config.WithLogLevel(*aws.LogLevel(aws.LogDebugWithHTTPBody))
s3Config = s3Config.WithCredentialsChainVerboseErrors(true)

This is what clued me in to the fact that the HTTP PUT to the EC2 metadata service was failing. In conclusion, this was a horrible debugging experience, but I think it is as much the fault of my environment and the AWS SDK as of Loki itself.
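For anyone hitting the same metadata-service symptom: the usual remedy is to raise the IMDSv2 hop limit on the instance so containerized workloads can reach the token endpoint. A minimal sketch, assuming a CloudFormation-style launch template (property names follow the AWS::EC2::LaunchTemplate schema; adapt to whatever provisioning tool you use):

```yaml
# Sketch: allow one extra network hop (e.g. a pod network) to reach IMDSv2.
LaunchTemplateData:
  MetadataOptions:
    HttpTokens: required        # the token request is the HTTP PUT mentioned above
    HttpPutResponseHopLimit: 2  # the default of 1 is too low for containers
```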
Where is this instruction documented?
Hi there @liguozhong. If this is an expected log, is there any discussion about downgrading the error to a warning or changing the message?
Good idea.
I changed replication_factor from 3 to 1, and the problem was fixed.
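For reference, here is a hedged sketch of where replication_factor typically lives in a single-binary Loki config (the exact section varies by version; newer releases also accept it under `common`). Note that replication_factor: 1 trades away redundancy, so this is a workaround rather than a general fix:

```yaml
# Sketch only: single-replica write path.
ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
```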
Hi, any updates?
Same issue using S3.
Same issue. Is there any way around this?
+1
…6360)

**What this PR does / why we need it**:

```
level=error ts=2022-01-24T10:57:05.273404635Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=error ts=2022-01-24T10:57:05.273434964Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=warn ts=2022-01-24T10:57:05.273457211Z caller=grpc_logging.go:55 method=/logproto.Querier/Query duration=16.524312361s err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled" msg="gRPC\n"
level=error ts=2022-01-24T10:57:05.273486368Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=error ts=2022-01-24T10:57:05.273786914Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled"
level=error ts=2022-01-24T10:57:05.274246531Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled"
level=warn ts=2022-01-24T10:57:05.274385302Z caller=grpc_logging.go:55 method=/logproto.Querier/Query duration=29.542213769s err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled" msg="gRPC\n"
```

When a LogQL query times out, do not print these logs. Since we retry, we don't want to show intermittent timeouts; we only care about the last error.

**Which issue(s) this PR fixes**:
Fixes [#5221](#5221)
Same here ✋
This error seems to occur when there is no chunks bucket in S3.
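Related to that, and to the MinIO setup mentioned further down, it is worth double-checking that the bucket referenced in storage_config actually exists. Here is a rough sketch of a MinIO/S3-style section with placeholder endpoint, credentials, and bucket name (not taken from this thread):

```yaml
# Sketch only: S3-compatible (MinIO) object store; the bucket must already exist.
storage_config:
  aws:
    s3: http://ACCESS_KEY:SECRET_KEY@minio:9000   # hypothetical endpoint
    bucketnames: loki-chunks                      # hypothetical bucket name
    s3forcepathstyle: true
```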
I found an error in the logs. When I run a range query (QueryRange) over more than 1 hour, the error happens. Does anyone have any ideas?
Here is my query instance config:
Here is my query-frontend instance config:
```yaml
query_range:
  # make queries more cache-able by aligning them with their step intervals
  align_queries_with_step: true
  max_retries: 5
  parallelise_shardable_queries: true
  cache_results: true
  split_queries_by_interval: 5m
  results_cache:
    cache:
      enable_fifocache: true
      fifocache:
        size: 2048
        validity: 24h

frontend:
  log_queries_longer_than: 5s
  compress_responses: true
  max_outstanding_per_tenant: 2048
```
My S3 config:
The MinIO object storage system works, and the stored data can be searched.
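Since the failures above only appear on range queries longer than an hour, one thing worth checking (an assumption, not something confirmed in this thread) is the query timeout settings; the keys below match Loki 2.x documentation and should be verified against the running version:

```yaml
# Sketch: timeouts that commonly surface as "context canceled" on long range queries.
server:
  http_server_read_timeout: 5m
  http_server_write_timeout: 5m

querier:
  query_timeout: 5m   # per-query timeout (moved under limits_config in newer releases)
  engine:
    timeout: 5m       # LogQL engine timeout
```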