
compact: could "blocks marked for deletions" be deleted at the beginning? #2605

Closed · ulikl opened this issue May 14, 2020 · 11 comments

@ulikl

ulikl commented May 14, 2020

Thanos, Prometheus and Golang version used:
# thanos --version
thanos, version 0.12.2 (branch: HEAD, revision: 52e10c6)
build user: root@c1a6cf60f03d
build date: 20200430-16:24:03
go version: go1.13.6

Object Storage Provider:
S3 compatible

What happened:

I have a problem with the S3 storage quota caused by the new feature #2136 from v0.12.0.

Generally this is a great feature.
But the blocks marked for deletion are only deleted at the very end, after the retention actions.
This requires more object storage than necessary, because the preceding steps create new files.

What you expected to happen:

Could the blocks that are ready to be deleted be removed at the beginning of a compaction iteration?
This should not interfere with the other compaction actions, as these blocks are no longer used by compaction.

How to reproduce it (as minimally and precisely as possible):
With every compact execution.

Full logs to relevant components:

You can see the order of the actions in the log file:

level=info ts=2020-05-13T11:27:32.088281638Z caller=compact.go:892 msg="start of GC"
level=info ts=2020-05-13T11:27:32.088452538Z caller=compact.go:904 msg="start of compactions"
level=info ts=2020-05-13T11:27:32.170084918Z caller=compact.go:633 group="0@{store=\"emea\"}" groupKey=0@2207242157792197222 msg="compaction available and planned; downloading block
...
level=info ts=2020-05-13T11:32:11.373968898Z caller=compact.go:887 msg="start sync of metas"
level=info ts=2020-05-13T11:33:25.185776942Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1m13.811763883s cached=689 returned=59 partial=0
level=info ts=2020-05-13T11:33:25.187312386Z caller=compact.go:892 msg="start of GC"
level=info ts=2020-05-13T11:33:25.187461328Z caller=compact.go:904 msg="start of compactions"
level=info ts=2020-05-13T11:33:25.330177556Z caller=compact.go:936 msg="compaction iterations done"
level=info ts=2020-05-13T11:33:25.330237655Z caller=compact.go:410 msg="downsampling was explicitly disabled"
level=info ts=2020-05-13T11:34:25.790819672Z caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1m0.46055915s cached=689 returned=59 partial=0
level=info ts=2020-05-13T11:34:25.792727418Z caller=retention.go:30 msg="start optional retention"
level=info ts=2020-05-13T11:34:25.792957546Z caller=retention.go:45 msg="optional retention apply done"
level=info ts=2020-05-13T11:34:25.792978207Z caller=clean.go:33 msg="started cleaning of aborted partial uploads"
level=info ts=2020-05-13T11:34:25.792989017Z caller=clean.go:60 msg="cleaning of aborted partial uploads done"
level=info ts=2020-05-13T11:34:25.793001202Z caller=blocks_cleaner.go:43 msg="started cleaning of blocks marked for deletion"
level=info ts=2020-05-13T11:34:26.257318243Z caller=blocks_cleaner.go:53 msg="deleted block marked for deletion" block=01E7N100XFABK55ZBB38PQPGS7
...
@bwplotka
Member

I guess we could, but still, how would that be helpful for you? It's a matter of a few iterations really. It should not matter over a longer period of time (:

@ulikl
Author

ulikl commented May 19, 2020

You are right, usually it should not make much of a difference.
I was just uploading a few pending days at once, as there were problems accessing my storage.
So the uncompacted blocks marked for deletion were adding up quite a bit.

@GiedriusS
Member

I started thinking about this too. It could be painful: you might run into quota problems if the Thanos Compactor doesn't finish its iteration quickly enough. Maybe it would be nice to have some kind of subcommand under tools that would go ahead and wipe all of the blocks that have the deletion mark?
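
For illustration only, a rough Go sketch of such a one-shot wipe could look like the snippet below. The Bucket interface and its methods are simplified, hypothetical stand-ins (not the real Thanos objstore.Bucket API); deletion-mark.json is the marker file written when a block is flagged for deletion.

// Rough sketch of a one-shot "wipe blocks marked for deletion" helper.
package wipe

import (
	"context"
	"fmt"
	"path"
)

// Bucket is a hypothetical, simplified object-store client.
type Bucket interface {
	// ListBlockDirs returns the top-level block (ULID) directories.
	ListBlockDirs(ctx context.Context) ([]string, error)
	// Exists reports whether the named object exists.
	Exists(ctx context.Context, name string) (bool, error)
	// DeletePrefix removes every object under the given prefix.
	DeletePrefix(ctx context.Context, prefix string) error
}

// WipeMarkedBlocks deletes every block that carries a deletion-mark.json,
// without doing any compaction, downsampling or retention work first.
func WipeMarkedBlocks(ctx context.Context, bkt Bucket) error {
	dirs, err := bkt.ListBlockDirs(ctx)
	if err != nil {
		return err
	}
	for _, dir := range dirs {
		marked, err := bkt.Exists(ctx, path.Join(dir, "deletion-mark.json"))
		if err != nil {
			return err
		}
		if !marked {
			continue // block is still live, leave it alone
		}
		if err := bkt.DeletePrefix(ctx, dir); err != nil {
			return err
		}
		fmt.Println("deleted block marked for deletion:", dir)
	}
	return nil
}

In practice such a command would presumably also have to respect the same deletion delay as the compactor, so it never removes blocks that readers might still be using.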

@pracucci
Contributor

We're currently running into a similar problem. We have a compactor which is continuously looping within BucketCompactor.Compact() because there's always some work to do, so we never hit blocksCleaner.DeleteMarkedBlocks().

Have you considered decoupling the block deletion from the compactor main loop and running it in the background with its own periodic ticker? It would require another full bucket scan to find blocks marked for deletion (using a dedicated metadata fetcher), but it wouldn't be blocked by the compactor.
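
Not the actual Thanos implementation, just a rough sketch of that idea: the cleanup runs on its own ticker, with a cleaner interface standing in for whatever the real blocksCleaner provides.

// Sketch: run marked-block cleanup on its own ticker so a busy compaction
// loop can never starve it.
package cleanup

import (
	"context"
	"log"
	"time"
)

// cleaner is a stand-in for the real block cleaner.
type cleaner interface {
	// DeleteMarkedBlocks re-scans for deletion marks and removes the
	// corresponding blocks.
	DeleteMarkedBlocks(ctx context.Context) error
}

// RunBackgroundCleanup deletes marked blocks every interval, independently
// of how long the compaction iterations take. It returns when ctx is done.
func RunBackgroundCleanup(ctx context.Context, c cleaner, interval time.Duration) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			if err := c.DeleteMarkedBlocks(ctx); err != nil {
				log.Printf("background block cleanup failed: %v", err)
			}
		}
	}
}

As noted above, the cleaner would need its own metadata fetcher (and bucket scan) so it does not depend on the state of the compaction loop.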

@Jakob3xD
Contributor

Jakob3xD commented Jun 2, 2020

Just to visualize the issue, here is a picture of my graph. Due to the delayed deletion it creates a "peak" of ~2 TB of data, which in my opinion is quite heavy.
[attached image: object storage usage graph showing the ~2 TB peak]

@zygiss

zygiss commented Jun 17, 2020

As a workaround, would running another instance of Compactor as a cron job with --debug.max-compaction-level=1 be a terrible idea here?

If I'm reading it right, this instance of Compactor will not try to compact any blocks, so it can safely run alongside existing Compactors and, instead of compacting, skip straight to removing blocks marked for deletion.

@sevagh
Contributor

sevagh commented Jul 1, 2020

Based on @zygiss's idea above to disable compaction, but it being 1 am and me feeling fuzzy in the brain, I wasn't certain about modifying the appropriate hidden compaction level, so I compiled myself a special thanos binary with the following modifications:

diff --git a/cmd/thanos/compact.go b/cmd/thanos/compact.go
index 627010f1..67dcea3e 100644
--- a/cmd/thanos/compact.go
+++ b/cmd/thanos/compact.go
@@ -121,6 +121,9 @@ func registerCompact(m map[string]setupFunc, app *kingpin.Application) {
                "as querying long time ranges without non-downsampled data is not efficient and useful e.g it is not possible to render all samples for a human eye anyway").
                Default("false").Bool()

+       disableCompacting := cmd.Flag("compacting.disable", "Disables compacting to only delete blocks. Use at your own risk").
+               Default("false").Bool()
+
        maxCompactionLevel := cmd.Flag("debug.max-compaction-level", fmt.Sprintf("Maximum compaction level, default is %d: %s", compactions.maxLevel(), compactions.String())).
                Hidden().Default(strconv.Itoa(compactions.maxLevel())).Int()

@@ -168,6 +171,7 @@ func registerCompact(m map[string]setupFunc, app *kingpin.Application) {
                        },
                        component.Compact,
                        *disableDownsampling,
+                       *disableCompacting,
                        *maxCompactionLevel,
                        *blockSyncConcurrency,
                        *compactionConcurrency,
@@ -195,6 +199,7 @@ func runCompact(
        retentionByResolution map[compact.ResolutionLevel]time.Duration,
        component component.Component,
        disableDownsampling bool,
+       disableCompacting bool,
        maxCompactionLevel, blockSyncConcurrency int,
        concurrency int,
        dedupReplicaLabels []string,
@@ -382,8 +387,10 @@ func runCompact(
        }

        compactMainFn := func() error {
-               if err := compactor.Compact(ctx); err != nil {
-                       return errors.Wrap(err, "compaction")
+               if !disableCompacting {
+                       if err := compactor.Compact(ctx); err != nil {
+                               return errors.Wrap(err, "compaction")
+                       }
                }

                if !disableDownsampling {

It ran pretty well, proceeding immediately to deletion. For safety, I shut off the normal compactor (to avoid any risk of race conditions, even though there may not be any in the deletion function).

Logs:

Jun 30 21:56:31 sv3-thanos2 thanos-custom-compact[29759]: level=info ts=2020-07-01T04:56:31.87090466Z caller=blocks_cleaner.go:53 msg="deleted block marked for deletion" block=01EBJRWBN8J8GF703ZVHDJZ29C
Jun 30 21:56:31 sv3-thanos2 thanos-custom-compact[29759]: level=info ts=2020-07-01T04:56:31.908009471Z caller=blocks_cleaner.go:53 msg="deleted block marked for deletion" block=01EBZ08SDN8GFYKWS535GPDSR3
Jun 30 21:56:32 sv3-thanos2 thanos-custom-compact[29759]: level=info ts=2020-07-01T04:56:32.36013667Z caller=blocks_cleaner.go:53 msg="deleted block marked for deletion" block=01EBK21JXGCZ3ECE2BMBMKWTAQ
Jun 30 21:56:32 sv3-thanos2 thanos-custom-compact[29759]: level=info ts=2020-07-01T04:56:32.455291751Z caller=blocks_cleaner.go:53 msg="deleted block marked for deletion" block=01EBR4HGXVSPJZYSM2MF76W3XR

It ran fine, exited, and now I'm back to my normal compactor.

@stale

stale bot commented Aug 6, 2020

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this issue or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need to!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

@stale stale bot added the stale label Aug 6, 2020
@GiedriusS GiedriusS removed the stale label Aug 6, 2020
@Jakob3xD
Contributor

Jakob3xD commented Aug 13, 2020

Would it be possible to add the --wait option to the bucket downsample tool, to split the workload of the compactor into two parts: the compactor itself compacting the raw blocks and deleting files, and the downsampler just creating downsampled blocks?
My main issue with the current compactor is that the downsampling takes so much time. As there is no option to boost or tune the downsampling, it would be nice to split it, so that the compactor cleans the bucket more often.

@stale

stale bot commented Nov 8, 2020

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this issue or push a commit. Thanks! 🤗
If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need to!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

@stale stale bot added the stale label Nov 8, 2020
@GiedriusS
Member

Would it be possible to add the --wait option to the bucket downsample tool, to split the workload of the compactor into two parts: the compactor itself compacting the raw blocks and deleting files, and the downsampler just creating downsampled blocks?
My main issue with the current compactor is that the downsampling takes so much time. As there is no option to boost or tune the downsampling, it would be nice to split it, so that the compactor cleans the bucket more often.

Maybe let's open a separate issue for this? As for the OP, it has been implemented in #3115 and will be in the next version. Closing as fixed.
