compact: --wait results in syncing blocks every minute for global view #2642

Closed
chrischdi opened this issue May 22, 2020 · 3 comments · Fixed by #2752

@chrischdi
Contributor

chrischdi commented May 22, 2020

Thanos, Prometheus and Golang version used:

thanos, version 0.12.2 (branch: HEAD, revision: 52e10c6e0f644ea98fd057e7fbece828d8dd07c7)
  build user:       circleci@241aa351893e
  build date:       20200430-16:37:24
  go version:       go1.13.1

Object Storage Provider:
Cloudian S3

What happened:
We set --wait and --wait-interval to reduce load on S3, which partially worked.
However, thanos compact still syncs the blocks every minute. That interval is a hardcoded value for a UI component that we do not use.

What you expected to happen:
Setting these flags should reduce requests to the object storage.

How to reproduce it (as minimally and precisely as possible):

Run thanos compact with, for example, the following flags set:

thanos compact \
--log.level=debug \
--data-dir=/var/thanos/compact/data \
--http-address=0.0.0.0:10902 \
--retention.resolution-raw=40d \
--retention.resolution-5m=40d \
--objstore.config-file=/etc/s3/s3.yaml \
--wait \
--wait-interval=2h

Full logs to relevant components:

Logs

level=debug ts=2020-05-22T11:24:08.359852209Z caller=main.go:103 msg="maxprocs: Updating GOMAXPROCS=[1]: using minimum allowed GOMAXPROCS"
level=info ts=2020-05-22T11:24:08.360095871Z caller=main.go:152 msg="Tracing will be disabled"
level=info ts=2020-05-22T11:24:08.360267028Z caller=factory.go:46 msg="loading bucket configuration"
level=info ts=2020-05-22T11:24:08.361027639Z caller=compact.go:375 msg="retention policy of raw samples is enabled" duration=960h0m0s
level=info ts=2020-05-22T11:24:08.361068276Z caller=compact.go:378 msg="retention policy of 5 min aggregated samples is enabled" duration=960h0m0s
level=info ts=2020-05-22T11:24:08.361594198Z caller=compact.go:522 msg="starting compact node"
level=info ts=2020-05-22T11:24:08.361624948Z caller=intrumentation.go:48 msg="changing probe status" status=ready
level=info ts=2020-05-22T11:24:08.361859114Z caller=intrumentation.go:60 msg="changing probe status" status=healthy
level=info ts=2020-05-22T11:24:08.361883217Z caller=http.go:56 service=http/server component=compact msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2020-05-22T11:24:08.36294357Z caller=compact.go:887 msg="start sync of metas"
level=debug ts=2020-05-22T11:24:12.078020143Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MBZFHGXHXFVEXP5XHZVP
level=debug ts=2020-05-22T11:24:12.078078836Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MC57Z6ZRAAAG2RGFYYQR
level=debug ts=2020-05-22T11:24:12.078112804Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCFSSSAG0PBKT79JRFQM
level=debug ts=2020-05-22T11:24:12.078138526Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCZ6RF16XNM44V69RRPG
level=debug ts=2020-05-22T11:24:12.07816003Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCTR5JDYBXRT1PZGTVW5
level=debug ts=2020-05-22T11:24:12.078179902Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MFM48G08ZEZ2SNBPRTC7
level=debug ts=2020-05-22T11:24:12.078243552Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MD5RRAR3V2TMYKJKV6PP
level=info ts=2020-05-22T11:24:12.079405898Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=3.715378449s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:24:13.860116293Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.692043901s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:24:14.581962846Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=6.218959216s cached=345 returned=338 partial=0
level=info ts=2020-05-22T11:24:14.642707539Z caller=compact.go:892 msg="start of GC"
level=info ts=2020-05-22T11:24:14.645527262Z caller=compact.go:904 msg="start of compactions"
level=info ts=2020-05-22T11:24:15.104986882Z caller=compact.go:936 msg="compaction iterations done"
level=info ts=2020-05-22T11:24:15.105218642Z caller=compact.go:393 msg="start first pass of downsampling"
level=debug ts=2020-05-22T11:24:17.076504696Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCTR5JDYBXRT1PZGTVW5
level=debug ts=2020-05-22T11:24:17.076711868Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MFM48G08ZEZ2SNBPRTC7
level=debug ts=2020-05-22T11:24:17.076785125Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MBZFHGXHXFVEXP5XHZVP
level=debug ts=2020-05-22T11:24:17.076853528Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCFSSSAG0PBKT79JRFQM
level=debug ts=2020-05-22T11:24:17.076889906Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MC57Z6ZRAAAG2RGFYYQR
level=debug ts=2020-05-22T11:24:17.076932943Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MD5RRAR3V2TMYKJKV6PP
level=debug ts=2020-05-22T11:24:17.0769714Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCZ6RF16XNM44V69RRPG
level=info ts=2020-05-22T11:24:18.285598546Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=3.180278823s cached=345 returned=338 partial=0
level=info ts=2020-05-22T11:24:18.369832727Z caller=compact.go:401 msg="start second pass of downsampling"
level=debug ts=2020-05-22T11:24:19.643735192Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCFSSSAG0PBKT79JRFQM
level=debug ts=2020-05-22T11:24:19.643792354Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MBZFHGXHXFVEXP5XHZVP
level=debug ts=2020-05-22T11:24:19.643805779Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCZ6RF16XNM44V69RRPG
level=debug ts=2020-05-22T11:24:19.643821679Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MD5RRAR3V2TMYKJKV6PP
level=debug ts=2020-05-22T11:24:19.643831824Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MFM48G08ZEZ2SNBPRTC7
level=debug ts=2020-05-22T11:24:19.643843739Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MC57Z6ZRAAAG2RGFYYQR
level=debug ts=2020-05-22T11:24:19.643855839Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCTR5JDYBXRT1PZGTVW5
level=info ts=2020-05-22T11:24:20.879823948Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.509924964s cached=345 returned=338 partial=0
level=info ts=2020-05-22T11:24:20.962328903Z caller=compact.go:408 msg="downsampling iterations done"
level=debug ts=2020-05-22T11:24:21.888403852Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MC57Z6ZRAAAG2RGFYYQR
level=debug ts=2020-05-22T11:24:21.888456352Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCFSSSAG0PBKT79JRFQM
level=debug ts=2020-05-22T11:24:21.888472592Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCTR5JDYBXRT1PZGTVW5
level=debug ts=2020-05-22T11:24:21.888488223Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MBZFHGXHXFVEXP5XHZVP
level=debug ts=2020-05-22T11:24:21.888496286Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MFM48G08ZEZ2SNBPRTC7
level=debug ts=2020-05-22T11:24:21.888510181Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MD5RRAR3V2TMYKJKV6PP
level=debug ts=2020-05-22T11:24:21.888518179Z caller=fetcher.go:734 msg="block is too fresh for now" block=01E8Y0MCZ6RF16XNM44V69RRPG
level=info ts=2020-05-22T11:24:22.96864975Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.006245183s cached=345 returned=338 partial=0
level=info ts=2020-05-22T11:24:23.052706848Z caller=retention.go:30 msg="start optional retention"
level=info ts=2020-05-22T11:24:23.05281422Z caller=retention.go:45 msg="optional retention apply done"
level=info ts=2020-05-22T11:24:23.052831903Z caller=clean.go:33 msg="started cleaning of aborted partial uploads"
level=info ts=2020-05-22T11:24:23.052840076Z caller=clean.go:60 msg="cleaning of aborted partial uploads done"
level=info ts=2020-05-22T11:24:23.052848325Z caller=blocks_cleaner.go:43 msg="started cleaning of blocks marked for deletion"
level=info ts=2020-05-22T11:24:23.05285568Z caller=blocks_cleaner.go:57 msg="cleaning of blocks marked for deletion done"
level=info ts=2020-05-22T11:25:13.570968977Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.402753122s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:26:14.07052159Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.902280788s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:27:13.486830073Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.314137134s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:28:13.683931001Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.515684702s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:29:13.640454637Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.472208192s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:30:14.15778152Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.989528136s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:31:13.640950099Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.472621424s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:32:14.386058969Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.217820533s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:33:13.591251311Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.422966633s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:34:13.587023804Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.418797035s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:35:13.840610449Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.672367039s cached=345 returned=345 partial=0
level=info ts=2020-05-22T11:36:13.487101612Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.318880498s cached=345 returned=345 partial=0


Anything else we need to know:

I would propose adding a flag similar to --sync-block-duration of thanos store and using it in https://github.com/thanos-io/thanos/blob/master/cmd/thanos/compact.go#L420 instead of the hardcoded time.Minute.
A possible name would be --wait-sync-block-duration.

If this would be OK, I'd be happy to open a PR for it.

@bwplotka
Member

Thanks! Sorry for the lag. Have you checked the recent caching? It requires running Memcached, though.

I think we are OK to have this flag as well 👍 Looking at your PR now.

@chrischdi
Contributor Author

Hi, yes, we are already using the caching, but if I understand correctly it is not an option for the compact CLI (yet?).

@bwplotka
Member

bwplotka commented Jun 18, 2020 via email
