Compactor doesn't compact data as configured #6313

Open
amewayne opened this issue Apr 24, 2023 · 6 comments

amewayne commented Apr 24, 2023

Thanos, Prometheus and Golang version used:
Thanos: v0.30.2
Prometheus: 2.42.0
Go: 1.19.5

Object Storage Provider:

Minio/S3

What happened:

Thanos Compact stopped compacting data from Minio after a period of time.
I configured retention.resolution-raw=2d when starting the compactor, but I can see raw data older than 2 days.
I tried restarting the compactor; it worked for a while, but soon stopped compacting again.

I have no clue what is causing this, because everything looks fine and there are no errors in the logs.

This is the command that I used to start the compactor:

docker run --name="thanos-compactor" -d --restart=always \
  -v /opt/prometheus-config:/etc/prometheus \
  -v /data0/compact-data:/data0/compact-data \
  --net=host thanos:v0.30.2 compact \
  --http-address=0.0.0.0:10902 \
  --data-dir=/data0/compact-data \
  --objstore.config-file=/etc/prometheus/minio/prod.yml \
  --compact.cleanup-interval=5m \
  --wait --wait-interval=5m \
  --retention.resolution-raw=2d \
  --retention.resolution-5m=10d \
  --retention.resolution-1h=60d \
  --log.level=info

What you expected to happen:

The compactor keeps compacting data, and no raw data older than 2 days remains on Minio.

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components:

level=info ts=2023-04-24T07:52:16.002244007Z caller=clean.go:34 msg="started cleaning of aborted partial uploads"
level=info ts=2023-04-24T07:52:16.002258707Z caller=clean.go:61 msg="cleaning of aborted partial uploads done"
level=info ts=2023-04-24T07:52:16.002268737Z caller=blocks_cleaner.go:44 msg="started cleaning of blocks marked for deletion"
level=info ts=2023-04-24T07:52:16.002307288Z caller=blocks_cleaner.go:58 msg="cleaning of blocks marked for deletion done"
level=info ts=2023-04-24T07:52:18.262869543Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.260546335s duration_ms=2260 cached=1973 returned=1878 partial=0
level=info ts=2023-04-24T07:52:18.263340051Z caller=compact.go:1296 msg="start of GC"
level=info ts=2023-04-24T07:52:18.270631636Z caller=compact.go:1319 msg="start of compactions"
level=info ts=2023-04-24T07:52:18.271490099Z caller=compact.go:1005 group="0@{monitor=\"kubernetes\", replica=\"1\"}" groupKey=0@10753051500942971680 msg="compaction available and planned; downloading blocks" plan="[01GYG0KACQXWFAXK6K7727TZ2S (min time: 1682006400000, max time: 1682013600000) 01GYG7F1MVKRFJDP3JKGP1S4N2 (min time: 1682013600000, max time: 1682020800000) 01GYGEARWWHF48ZQF6PH69D1XS (min time: 1682020800000, max time: 1682028000000) 01GYGN6G5CZYV7QDFVP6W5NJEK (min time: 1682028000000, max time: 1682035200000)]"
level=info ts=2023-04-24T07:53:08.08875303Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.069151008s duration_ms=2069 cached=1973 returned=1973 partial=0

Anything else we need to know:

(screenshot attached)

amewayne (Author) commented

I finally got some clues about this issue. I cleaned up the compactor's data directory and restarted it with the --compact.concurrency flag, and the compactor seems to be working well this time.

If my understanding is correct, thanos_compact_todo_compaction_blocks represents the current backlog of blocks that still need to be compacted. It showed over 2k blocks waiting to be handled, which is why the compactor appeared to have stopped working. After adding --compact.concurrency, thanos_compact_todo_compaction_blocks keeps going down, so I guess it's related.
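
A quick way to check this backlog, and whether the compactor has halted on an error, is to scrape its metrics endpoint directly. A minimal sketch, assuming the --http-address=0.0.0.0:10902 from the command above and that the compactor host is reachable as localhost:

# thanos_compact_todo_compaction_blocks is the backlog metric discussed above;
# thanos_compact_halted is 1 if the compactor stopped because of an error.
curl -s http://localhost:10902/metrics | grep -E 'thanos_compact_todo_compaction_blocks|thanos_compact_halted'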

beramod commented Jul 27, 2023

Hi @amewayne
I have a similar problem.
Could you tell me how you set compact.concurrency?

amewayne (Author) commented

Hi @amewayne I have a similar problem. Could you tell me how you set compact.concurrency?

I set --compact.concurrency=15 on a 32-core server.
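
For illustration, a sketch of how the original command above changes with this flag, assuming the same volumes and object store config; the value 15 comes from this comment and should be tuned to your own CPU count:

docker run --name="thanos-compactor" -d --restart=always \
  -v /opt/prometheus-config:/etc/prometheus \
  -v /data0/compact-data:/data0/compact-data \
  --net=host thanos:v0.30.2 compact \
  --http-address=0.0.0.0:10902 \
  --data-dir=/data0/compact-data \
  --objstore.config-file=/etc/prometheus/minio/prod.yml \
  --compact.cleanup-interval=5m \
  --wait --wait-interval=5m \
  --retention.resolution-raw=2d \
  --retention.resolution-5m=10d \
  --retention.resolution-1h=60d \
  --compact.concurrency=15 \
  --log.level=info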

Kiara0107 commented

I have the same issue, see #6866. I increased the concurrency from 1 to 6, but I see no difference. Do you have any other tips?

yeya24 (Contributor) commented Nov 2, 2023

@Kiara0107 Please take a look at this doc: https://thanos.io/tip/operating/compactor-backlog.md/#troubleshoot-compactor-backlog

It might take some time to see a difference after increasing concurrency, since compaction itself takes time.
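
As a rough way to confirm the compactor is actually catching up after raising concurrency, you can watch the backlog alongside the rate of completed group compactions. A minimal sketch using promtool, assuming Prometheus scrapes the compactor and is reachable at the placeholder URL below:

# backlog of blocks still to compact; should trend down over hours or days, not minutes
promtool query instant http://prometheus.example:9090 'thanos_compact_todo_compaction_blocks'
# completed group compactions per second over the last hour; should be non-zero while catching up
promtool query instant http://prometheus.example:9090 'rate(thanos_compact_group_compactions_total[1h])'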

Kiara0107 commented

Thanks, I've read the doc; that's why I increased the concurrency. Unfortunately, there has been no change after 24 hours. I would have expected to see some results by now.
