
Shipper doesn't upload OOO compacted blocks #6462

Closed
yeya24 opened this issue Jun 21, 2023 · 12 comments

Comments

@yeya24
Contributor

yeya24 commented Jun 21, 2023

What happened:

If OOO is enabled, the shipper won't upload the OOO compacted blocks: their compaction level is > 1, and by default the shipper skips any block with a compaction level above 1.

https://github.com/thanos-io/thanos/blob/main/pkg/shipper/shipper.go#L295

See also cortexproject/cortex#5402.
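For illustration, a minimal sketch of that check (not the actual shipper code; blockMeta is a hypothetical stand-in for the relevant parts of meta.json):

package main

import "fmt"

type blockMeta struct {
	ULID            string
	CompactionLevel int
}

// uploadable returns the blocks the shipper would pick up for upload.
// With uploadCompacted=false (the shipper default), any block at
// compaction level > 1 is silently skipped.
func uploadable(metas []blockMeta, uploadCompacted bool) []blockMeta {
	var out []blockMeta
	for _, m := range metas {
		if m.CompactionLevel > 1 && !uploadCompacted {
			continue // OOO-compacted blocks land here and never get shipped
		}
		out = append(out, m)
	}
	return out
}

func main() {
	metas := []blockMeta{
		{ULID: "01H3BPAV0PM2039C54HDP3BG1Z", CompactionLevel: 1},
		{ULID: "01H3BPCJCK1VR0PP9C9BQ6F35V", CompactionLevel: 2}, // compacted from OOO overlap
	}
	fmt.Println(uploadable(metas, false)) // only the level-1 block remains
}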

What you expected to happen:

If the out-of-order samples feature is enabled, the shipper should still be able to upload those compacted blocks to the object store.

How to reproduce it (as minimally and precisely as possible):

Run two Prometheus instances with the same external labels and have both remote-write to a single receiver that has an object store configured and OOO enabled.

Alternatives:

I am not sure about Prometheus' logic for OOO block compaction, but it seems odd: local compaction is supposed to be turned off, so even if OOO produces overlapping blocks we shouldn't compact them locally and should instead let the compactor take care of it. It seems TSDB always tries to compact overlapping blocks, and we have no way to disable that.

@yeya24
Contributor Author

yeya24 commented Jun 21, 2023

I can still see blocks being compacted locally because of the overlap; the block below was generated from OOO blocks.

{
	"ulid": "01H3BPCJCK1VR0PP9C9BQ6F35V",
	"minTime": 1687240800000,
	"maxTime": 1687241100000,
	"stats": {
		"numSamples": 1050,
		"numSeries": 35,
		"numChunks": 85
	},
	"compaction": {
		"level": 2,
		"sources": [
			"01H3BPAV0PM2039C54HDP3BG1Z",
			"01H3BPCHXCQPQYSG477FJY3RX7"
		],
		"parents": [
			{
				"ulid": "01H3BPAV0PM2039C54HDP3BG1Z",
				"minTime": 1687240800000,
				"maxTime": 1687241100000
			},
			{
				"ulid": "01H3BPCHXCQPQYSG477FJY3RX7",
				"minTime": 1687240800000,
				"maxTime": 1687241100000
			}
		]
	},
	"version": 1
}

@fpetkovski
Contributor

Is there a way to distinguish an OOO block from a regular one?

@yeya24
Contributor Author

yeya24 commented Jun 22, 2023

@fpetkovski, there is a method that sets the compaction hint to out-of-order: https://github.com/prometheus/prometheus/blob/main/tsdb/block.go#L196.

But this hint seems to be present only on the level-1 block created from the OOO head; after one compaction pass the hint is removed.
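
A minimal sketch of checking that hint against the compaction section of meta.json (CompactionMeta here is a trimmed-down stand-in, not the actual Prometheus struct):

package hintcheck

// CompactionMeta mirrors the "compaction" section of a block's meta.json.
type CompactionMeta struct {
	Level int      `json:"level"`
	Hints []string `json:"hints,omitempty"`
}

// FromOutOfOrder reports whether the block carries the OOO compaction hint.
// This only returns true for the level-1 block written from the OOO head;
// the hint is dropped once the block is compacted again.
func FromOutOfOrder(c CompactionMeta) bool {
	for _, h := range c.Hints {
		if h == "from-out-of-order" {
			return true
		}
	}
	return false
}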

@fpetkovski
Contributor

Hm, maybe the shipper could inspect the source block metas and figure out that way whether a block was created from OOO chunks.
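
Roughly something like this sketch (an assumption, not existing shipper code): for a compacted block, read each source block's own meta.json from the local TSDB directory and check for the from-out-of-order hint. As noted below, this only works while the source blocks are still on disk.

package oooorigin

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// sourceMeta is a trimmed-down stand-in for the parts of meta.json we need.
type sourceMeta struct {
	Compaction struct {
		Hints []string `json:"hints"`
	} `json:"compaction"`
}

// hasOOOSource reports whether any source ULID of a compacted block was
// itself created from the OOO head.
func hasOOOSource(tsdbDir string, sourceULIDs []string) (bool, error) {
	for _, ulid := range sourceULIDs {
		raw, err := os.ReadFile(filepath.Join(tsdbDir, ulid, "meta.json"))
		if err != nil {
			// The source block may already have been deleted by TSDB.
			return false, err
		}
		var m sourceMeta
		if err := json.Unmarshal(raw, &m); err != nil {
			return false, err
		}
		for _, h := range m.Compaction.Hints {
			if h == "from-out-of-order" {
				return true, nil
			}
		}
	}
	return false, nil
}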

@yeya24
Contributor Author

yeya24 commented Jun 22, 2023

@fpetkovski I'm not sure that works, because the source blocks are deleted after compaction, so the shipper might miss them. I guess an fsnotify watcher might work, but that overcomplicates things.

What's the downside of simply enabling the upload of compacted blocks when OOO is enabled?

@ahurtaud
Contributor

I have a different angle on the same issue:
my Thanos Receive has some out-of-order blocks:

ts=2023-07-25T11:00:17.509047296Z caller=compact.go:523 level=info component=receive component=multi-tsdb tenant=rbox msg="write block" mint=1690272000000 maxt=1690279200000 ulid=01H66B9DW1T2V781EZQ4FN5CEX duration=16.483687516s
ts=2023-07-25T11:00:17.582559703Z caller=db.go:1552 level=info component=receive component=multi-tsdb tenant=rbox msg="Deleting obsolete block" block=01H3RWG46AX39WXYM4EGQXHZ5R
ts=2023-07-25T11:00:18.133169984Z caller=head.go:1286 level=info component=receive component=multi-tsdb tenant=rbox msg="Head GC completed" caller=truncateMemory duration=550.324171ms
ts=2023-07-25T11:00:18.227512302Z caller=checkpoint.go:100 level=info component=receive component=multi-tsdb tenant=rbox msg="Creating checkpoint" from_segment=1979 to_segment=1983 mint=1690279200000
ts=2023-07-25T11:00:26.976879201Z caller=head.go:1254 level=info component=receive component=multi-tsdb tenant=rbox msg="WAL checkpoint complete" first=1979 last=1983 duration=8.749608806s
ts=2023-07-25T11:00:36.848874675Z caller=compact.go:523 level=info component=receive component=multi-tsdb tenant=rbox msg="write block" mint=1690272000000 maxt=1690279200000 ulid=01H66BABNW48B74PW4YRAM31TQ duration=5.300701905s
ts=2023-07-25T11:00:36.859603042Z caller=db.go:1220 level=info component=receive component=multi-tsdb tenant=rbox msg="out-of-order compaction completed" duration=5.311420471s ulids=[01H66BABNW48B74PW4YRAM31TQ]
ts=2023-07-25T11:00:36.880953971Z caller=db.go:1396 level=warn component=receive component=multi-tsdb tenant=rbox msg="Overlapping blocks found during reloadBlocks" detail="[mint: 1690272000000, maxt: 1690279200000, range: 2h0m0s, blocks: 2]: <ulid: 01H66B9DW1T2V781EZQ4FN5CEX, mint: 1690272000000, maxt: 1690279200000, range: 2h0m0s>, <ulid: 01H66BABNW48B74PW4YRAM31TQ, mint: 1690272000000, maxt: 1690279200000, range: 2h0m0s>"
ts=2023-07-25T11:00:37.250954605Z caller=head.go:1286 level=info component=receive component=multi-tsdb tenant=rbox msg="Head GC completed" caller=truncateOOO duration=369.385013ms
ts=2023-07-25T11:00:37.296653766Z caller=compact.go:708 level=info component=receive component=multi-tsdb tenant=rbox msg="Found overlapping blocks during compaction" ulid=01H66BAH90X25EJN5S5BJ4JEF2
ts=2023-07-25T11:00:38.505274618Z caller=shipper.go:334 level=info component=receive component=multi-tsdb tenant=rbox msg="upload new block" id=01H66B9DW1T2V781EZQ4FN5CEX
ts=2023-07-25T11:00:40.200875882Z caller=shipper.go:334 level=info component=receive component=multi-tsdb tenant=rbox msg="upload new block" id=01H66BABNW48B74PW4YRAM31TQ
ts=2023-07-25T11:00:54.876861119Z caller=compact.go:464 level=info component=receive component=multi-tsdb tenant=rbox msg="compact blocks" count=2 mint=1690272000000 maxt=1690279200000 ulid=01H66BAH90X25EJN5S5BJ4JEF2 sources="[01H66B9DW1T2V781EZQ4FN5CEX 01H66BABNW48B74PW4YRAM31TQ]" duration=17.59654791s
ts=2023-07-25T11:00:54.947662343Z caller=db.go:1552 level=info component=receive component=multi-tsdb tenant=rbox msg="Deleting obsolete block" block=01H66BABNW48B74PW4YRAM31TQ
ts=2023-07-25T11:00:55.042155878Z caller=db.go:1552 level=info component=receive component=multi-tsdb tenant=rbox msg="Deleting obsolete block" block=01H66B9DW1T2V781EZQ4FN5CEX

So Receive's TSDB actually writes a new block with the hint from-out-of-order.

However, both source blocks had already been uploaded by the shipper, which halts my compactor because of overlapping blocks.

meta.json 01H66BABNW48B74PW4YRAM31TQ
{
  "ulid": "01H66BABNW48B74PW4YRAM31TQ",
  "minTime": 1690272000000,
  "maxTime": 1690279200000,
  "stats": {
  	"numSamples": 135016,
  	"numSeries": 91424,
  	"numChunks": 91424
  },
  "compaction": {
  	"level": 1,
  	"sources": [
  		"01H66BABNW48B74PW4YRAM31TQ"
  	],
  	"hints": [
  		"from-out-of-order"
  	]
  },
  "version": 1,
  "thanos": {
  	"labels": {
  		"prometheus": "rbox",
  		"receive": "true",
  		"receive_replica": "thanos-receive-rbox-ingester-0",
  		"receiver": "rbox",
  		"stack": "ccp-ne-ogob01a",
  		"tenant_id": "rbox"
  	},
  	"downsample": {
  		"resolution": 0
  	},
  	"source": "receive",
  	"segment_files": [
  		"000001"
  	],
  	"files": [
  		{
  			"rel_path": "chunks/000001",
  			"size_bytes": 2271083
  		},
  		{
  			"rel_path": "index",
  			"size_bytes": 16322685
  		},
  		{
  			"rel_path": "meta.json"
  		}
  	]
  }
}
meta.json 01H66BAH90X25EJN5S5BJ4JEF2 (not uploaded, that's the issue)
{
  "ulid": "01H66BAH90X25EJN5S5BJ4JEF2",
  "minTime": 1690272000000,
  "maxTime": 1690279200000,
  "stats": {
  	"numSamples": 63400665,
  	"numSeries": 255020,
  	"numChunks": 532703
  },
  "compaction": {
  	"level": 2,
  	"sources": [
  		"01H66B9DW1T2V781EZQ4FN5CEX",
  		"01H66BABNW48B74PW4YRAM31TQ"
  	],
  	"parents": [
  		{
  			"ulid": "01H66B9DW1T2V781EZQ4FN5CEX",
  			"minTime": 1690272000000,
  			"maxTime": 1690279200000
  		},
  		{
  			"ulid": "01H66BABNW48B74PW4YRAM31TQ",
  			"minTime": 1690272000000,
  			"maxTime": 1690279200000
  		}
  	]
  },
  "version": 1
}

(I deleted 01H66B9DW1T2V781EZQ4FN5CEX to fix my compactor, but it was a regular block with no hint.)

Here the source blocks were uploaded (which halted my compactor because of overlapping blocks) but the level-2 compacted block was not.

I think this level-2 block should be the only block uploaded to long-term storage, maybe by adding an upload delay to the shipper (the source blocks are removed by TSDB almost instantly, but only right after the shipper has uploaded them, see the logs above). WDYT?
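
A sketch of that upload-delay idea (an assumption, not existing shipper behaviour): only ship blocks that have been on disk longer than a configurable delay, so TSDB gets a chance to finish the OOO overlap compaction and delete the overlapping level-1 blocks first.

package uploaddelay

import "time"

// block is a hypothetical stand-in carrying just what the filter needs;
// CreatedAt could be derived from the timestamp encoded in the block ULID.
type block struct {
	ULID      string
	CreatedAt time.Time
}

// shippable keeps only blocks older than delay, so short-lived level-1
// blocks that TSDB is about to compact away are never uploaded.
func shippable(blocks []block, delay time.Duration, now time.Time) []block {
	var out []block
	for _, b := range blocks {
		if now.Sub(b.CreatedAt) >= delay {
			out = append(out, b)
		}
	}
	return out
}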

@yeya24
Contributor Author

yeya24 commented Jul 27, 2023

@ahurtaud I think it is an interesting idea, but it is hard to guarantee that the OOO block compaction happens within that delay so that we end up with the L2 block.
One option is to skip L1 blocks that carry the from-out-of-order hint, but then we still need to support uploading compacted blocks so the L2 block gets shipped. It would be great if the L2 block could carry some hint, tbh.
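
A sketch of that option (an assumption, not existing shipper code): skip level-1 blocks carrying the from-out-of-order hint and rely on shipping their level-2 parent, which still requires the shipper to accept blocks with compaction level > 1.

package oooskip

// compaction is a trimmed-down stand-in for the "compaction" section of meta.json.
type compaction struct {
	Level int
	Hints []string
}

// shouldUpload skips the level-1 OOO block and accepts everything else,
// including the level-2 block produced by the local overlap compaction.
func shouldUpload(c compaction) bool {
	if c.Level == 1 {
		for _, h := range c.Hints {
			if h == "from-out-of-order" {
				return false // wait for the compacted (level-2) block instead
			}
		}
	}
	return true
}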

@fpetkovski
Contributor

fpetkovski commented Jul 28, 2023

I wonder if we can modify the duplicate filter to filter out source OOO blocks once a compacted version is available. I am not familiar with OOO so I don't exactly know how that detection would work.

@yeya24
Contributor Author

yeya24 commented Jul 28, 2023

@fpetkovski I was thinking about the same, but it requires the upload delay, and I don't think it can guarantee we never upload the L1 blocks.
To me, the simplest option is to just upload both blocks and enable vertical compaction.

@yeya24
Contributor Author

yeya24 commented Oct 29, 2023

I think what we will do in Cortex is to always enable uploading compacted blocks in the shipper. The shipper is used in the ingester, which is almost the same as the receiver.
Other than the OOO samples scenario, I found no other way to generate compacted blocks at the ingester, so it should be safe to enable.

@GiedriusS
Member

We now disable overlapping compaction in the TSDB/Receiver, so this doesn't happen anymore. I'll close the ticket.

@bmiguel-teixeira

Hey,

Is there an MR or release that contains this change/fix/improvement?

CharlieTLe added a commit to CharlieTLe/cortex that referenced this issue May 18, 2024
… parameterize uploading compacted blocks

In v1.15.2, ingesters configured with OOO samples ingestion enabled
could hit this bug (cortexproject#5402)
where ingesters would not upload compacted blocks
(thanos-io/thanos#6462).

In v1.16.1, ingesters are configured to always upload compacted blocks
(cortexproject#5625).

In v1.17, ingesters stopped uploading compacted blocks
(cortexproject#5735).

This can cause problems for users upgrading from v1.15.2 with OOO
ingestion enabled to v1.17 because both versions are hard coded to
disable uploading compacted blocks from the ingesters.

The workaround was to downgrade from v1.17 to v1.16 to allow those
compacted blocks to be uploaded (and eventually deleted).

The new flag is set to true by default which reverts the behavior of the
ingester uploading compacted blocks back to v1.16.

Signed-off-by: Charlie Le <[email protected]>
yeya24 pushed a commit to cortexproject/cortex that referenced this issue May 19, 2024
… parameterize uploading compacted blocks (#5959)

In v1.15.2, ingesters configured with OOO samples ingestion enabled
could hit this bug (#5402)
where ingesters would not upload compacted blocks
(thanos-io/thanos#6462).

In v1.16.1, ingesters are configured to always upload compacted blocks
(#5625).

In v1.17, ingesters stopped uploading compacted blocks
(#5735).

This can cause problems for users upgrading from v1.15.2 with OOO
ingestion enabled to v1.17 because both versions are hard coded to
disable uploading compacted blocks from the ingesters.

The workaround was to downgrade from v1.17 to v1.16 to allow those
compacted blocks to be uploaded (and eventually deleted).

The new flag is set to true by default which reverts the behavior of the
ingester uploading compacted blocks back to v1.16.

Signed-off-by: Charlie Le <[email protected]>