thanos compactor crashes with "write compaction: chunk 8 not found: reference sequence 0 out of range" #1300
Sounds like Thanos Compact crashed midway through the upload and you ended up with inconsistent data. Nowadays Thanos Compact removes old, malformed data that is older than the maximum of the consistency delay and the minimum age for removal (30 minutes). Does that not work for you for some reason, @bjakubski?
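(For reference, a minimal sketch of where that delay is configured on the compactor; flag names are taken from `thanos compact --help`, the paths are placeholders, and on older releases the delay flag was called `--sync-delay`.)

```sh
# A hedged sketch, not taken from this cluster: the data dir and bucket config path are placeholders.
# Blocks younger than the delay are skipped from compaction, which is what is supposed to keep
# half-uploaded blocks out of the way; malformed blocks older than it get removed.
thanos compact \
  --data-dir=/var/thanos/compact \
  --objstore.config-file=/etc/thanos/objstore.yml \
  --consistency-delay=30m \
  --wait
```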
Also, Thanos Compact now has metrics about failed compactions (in 0.6.0, which is soon to be released), so you can use that :P
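(A rough way to use those, assuming the compactor's default HTTP port 10902; the metric names below are assumptions based on recent versions, so double-check them against your build's /metrics page.)

```sh
# Watch for a halted compactor or repeatedly failing compaction groups.
# thanos_compact_halted and thanos_compact_group_compactions_failures_total are assumed
# names; verify them against your compactor's /metrics output before alerting on them.
curl -s http://localhost:10902/metrics \
  | grep -E 'thanos_compact_halted|thanos_compact_group_compactions_failures_total'
```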
I've just encountered the same problem on another object in the bucket, but this time it was created by the sidecar. I do not see any message in the logs about the sidecar ever uploading it. The object is older (more than a week old), but according to GCP Storage it was created just yesterday; I see related messages in the Prometheus logs. The object in the bucket seems to have been created at 2019-07-10 21:25, which is just after one of the thanos sidecars successfully connected to Prometheus (after a pod restart). In its logs I can see it uploading various objects, but nothing about this particular ULID.

Note: I'm running an unusual setup with thanos-sidecar: Prometheus compaction is disabled, but retention is quite long (15d IIRC). It generally seems to work fine, although I wonder if it might cause problems like the one I have here. What is the consistency delay? Should I have expected the compactor to ignore/remove this object from the bucket?
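(For reference, a sidecar setup with local compaction disabled usually looks roughly like the sketch below; the flag values are illustrative, not copied from this cluster.)

```sh
# Pinning min/max block duration to the same value effectively disables local compaction,
# so the sidecar uploads the raw 2h blocks; the long retention only controls local disk usage.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.min-block-duration=2h \
  --storage.tsdb.max-block-duration=2h \
  --storage.tsdb.retention.time=15d
```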
I also hit the same issue; the error log is:
I'm seeing the same error for blocks uploaded with thanos-compactor 0.6.0 (and then processed by 0.6.0) myself. Backend storage is a Ceph cluster via the Swift API.
thanos-compactor has uploaded a compacted block yesterday:
and now it's choking on that block:
First, I'd expect it to survive broken blocks, but what's more concerning is that the block had been uploaded successfully before (unless those warning messages are not just cosmetic and there is indeed something wrong). What's uploaded:
and meta.json:

```json
{
"ulid": "01DGZ0PDS2MFX25P9CKG1C1TA3",
"minTime": 1561622400000,
"maxTime": 1561651200000,
"stats": {
"numSamples": 271799367,
"numSeries": 141915,
"numChunks": 2265057
},
"compaction": {
"level": 2,
"sources": [
"01DEC9F6B0BVQHFG7BZDS75N0V",
"01DECGAXK0CM9HC2QPZ7MMH8M0",
"01DECQ6MV0K0FR00HHY9906178",
"01DECY2C2ZX4F6N9GKHKM2PC61"
],
"parents": [
{
"ulid": "01DEC9F6B0BVQHFG7BZDS75N0V",
"minTime": 1561622400000,
"maxTime": 1561629600000
},
{
"ulid": "01DECGAXK0CM9HC2QPZ7MMH8M0",
"minTime": 1561629600000,
"maxTime": 1561636800000
},
{
"ulid": "01DECQ6MV0K0FR00HHY9906178",
"minTime": 1561636800000,
"maxTime": 1561644000000
},
{
"ulid": "01DECY2C2ZX4F6N9GKHKM2PC61",
"minTime": 1561644000000,
"maxTime": 1561651200000
}
]
},
"version": 1,
"thanos": {
"labels": {
"datacenter": "p24",
"environment": "internal",
"replica": "02"
},
"downsample": {
"resolution": 0
},
"source": "compactor"
}
}
```
This also breaks when running with … I was expecting it to fail, but not quit; yet it did crashloop with that error. After some time I deleted the pod and it seems to be running fine now... maybe because of that 30m mentioned above?
OK, that was quick: it just crashlooped again.
Are you maybe hitting this? #1331 Essentially a partial upload and empty or missing chunk files in the block.
You have … Thanks all for reporting; let's investigate this upload path with the minio client. Essentially this might come down to some assumption: #1331 (comment)
Let's continue on one of the issues; this looks like a duplicate of #1331. EDIT: actually this one was first, but I think it does not matter much; let's continue on one of them (: Hope this is ok with you guys!
not sure, I'm using GCS... that issue seems to be specific to S3...
hm, maybe we are wrong and GCS is affected as well. What about the other question about …?
sorry, didn't see that question... So, the log says:
It seems the problem is …
Is there anything I can do as a workaround so compact works again?
You should delete the blocks which have duplicated data and leave only one copy. It's up to you to decide which one it is (: It sounds like you need to delete the one you've mentioned, but please double-check.
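(Since this bucket is on GCS, a sketch of that workaround with gsutil; the bucket name and ULID below are placeholders, and it is worth copying the block somewhere first.)

```sh
# Keep a local copy of the suspect block, then remove it from the bucket so the
# compactor stops tripping over it. Replace the bucket name and ULID with yours.
gsutil -m cp -r gs://my-thanos-bucket/01ARZ3NDEKTSV4RRFFQ69G5FAV ./block-backup/
gsutil -m rm -r gs://my-thanos-bucket/01ARZ3NDEKTSV4RRFFQ69G5FAV
```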
yeah, did that and it seems to be running now
Hi, I am having this issue using thanos 0.8.1. I have tried moving the directories for the blocks it complains about out of the bucket, but then every time I run the compactor it just finds some more to be sad about :-( This is crashing the compactor on 0.8.1 even with … Any help to further debug this would be appreciated! (@bwplotka maybe?)
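(One way to find all of the offenders up front instead of one crash at a time: a sketch assuming a GCS bucket and gsutil, as earlier in the thread; the bucket name is a placeholder and the check is intentionally crude.)

```sh
# List block ULID prefixes and flag any that have no first chunk file, i.e. a
# meta.json/index without chunks. Adapt the listing command to your object store.
for block in $(gsutil ls gs://my-thanos-bucket/ | grep -E '/[0-9A-Z]{26}/$'); do
  if ! gsutil -q stat "${block}chunks/000001"; then
    echo "no chunk files under ${block}"
  fi
done
```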
Hi, we started a doc where we discussed another potential cause of this, which is a false assumption around partial block uploads: https://docs.google.com/document/d/1QvHt9NXRvmdzy51s4_G21UW00QwfL9BfqBDegcqNvg0/edit#
Reopening as it still occurs. The doc mentioned above describes a potential (though quite unlikely) issue of the compactor removing a block while it is still being uploaded for some reason (long upload / old block upload). We are working on a fix for that particular case, but more info on why the chunk file is missing from the block would be useful.
To complement the last comment:
Compactor logs:
Meanwhile, in the bucket (meta.json is indeed the only file there for this block):
So it seems the compactor itself failed to upload this block, rather than merely crashing against a block uploaded by something else.
This replaces the 4 inconsistent meta.json sync places in other components. Fixes: #1335 Fixes: #1919 Fixes: #1300
* One place for the sync logic for both compactor and store.
* Corrupted disk cache for meta.json is handled gracefully.
* Blocks without meta.json are handled properly in all compactor phases.
* Synchronize was not taking into account deletion by removing meta.json.
* Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/
* Better observability for the synchronize process.
* More logs for the store startup process.
* Remove Compactor Syncer.
* Added metric for partialUploadAttempt deletions.
* More tests.
TODO in a separate PR:
* More observability for index-cache loading / adding time.
Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300
* Clean-up of meta files now starts only if the block being uploaded is older than 2 days (only a mitigation).
* Blocks without meta.json are handled properly in all compactor phases.
* Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/
* Added metric for partialUploadAttempt deletions and delayed it.
* More tests.
Signed-off-by: Bartlomiej Plotka <[email protected]>
We merged a fix for the case of #1919, which could potentially cause such a chunk file to be missing.
Thanos, Prometheus and Golang version used
thanos, version 0.5.0 (branch: HEAD, revision: 72820b3)
build user: circleci@eeac5eb36061
build date: 20190606-10:53:12
go version: go1.12.5
What happened
thanos compactor crashes with "write compaction: chunk 8 not found: reference sequence 0 out of range"
What you expected to happen
Should work fine :-)
How to reproduce it (as minimally and precisely as possible):
Not sure :-/
Full logs to relevant components
Out of the list of objects dumped along with the error message, I've found one without chunks.
meta.json contents:
It is apparently an object created by the compactor.
We'd been running the compactor for some time, but after a while (due to lack of local disk storage) it was crashing constantly. After an extended period of crashing, storage was added and the compactor was able to go further, until it encountered the problem described here.
I guess the important part is how such an object ended up in the bucket, although I wonder if it is possible for thanos to ignore such objects and keep processing the rest of the data (exposing data about bad objects in metrics)?
I'd guess that it was somehow created during the constant crashes we had earlier, but I have nothing to support that.
Anything else we need to know
#688 describes a similar issue, although it is about a much older thanos version than we use here.
We've been running 0.5 and 0.4 before that. I'm not sure, but it is possible that 0.3.2 (compactor) was used at the beginning.