bucket verify: repair out of order labels #964
Running this code gives me the following error, which I think is from Prometheus or the TSDB. Perhaps I need to make sure the requirements have the recent patch to fix this issue upstream. More digging next week. Errors:
|
When we have label sets that are not in the correct order, fixing that changes the order of the series in the index. So the index must be rewritten in that new order. This makes this repair tool take up a bunch more memory, but produces blocks that verify correctly.
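To illustrate why the index has to be rewritten, here is a minimal sketch of the effect described above. `Label`, `Series`, `compareLabels` and `repairOrder` are hypothetical stand-ins written for this example, not the TSDB types or the code in this PR; the sketch only assumes that series are ordered by comparing their label sets pairwise.

```go
package main

import (
	"fmt"
	"sort"
)

// Label and Series are hypothetical stand-ins for the TSDB types.
type Label struct{ Name, Value string }

type Series struct {
	Labels []Label
}

// compareLabels orders two label sets pairwise by name, then by value.
// If one set is a prefix of the other, the shorter set sorts first.
func compareLabels(a, b []Label) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i].Name != b[i].Name {
			if a[i].Name < b[i].Name {
				return -1
			}
			return 1
		}
		if a[i].Value != b[i].Value {
			if a[i].Value < b[i].Value {
				return -1
			}
			return 1
		}
	}
	return len(a) - len(b)
}

// repairOrder sorts the labels inside every series, then re-sorts the
// series themselves, because fixing a label set can change its position.
func repairOrder(series []Series) {
	for i := range series {
		ls := series[i].Labels
		sort.Slice(ls, func(a, b int) bool { return ls[a].Name < ls[b].Name })
	}
	sort.SliceStable(series, func(a, b int) bool {
		return compareLabels(series[a].Labels, series[b].Labels) < 0
	})
}

// firstLabels returns the first label of each series as "name=value".
func firstLabels(series []Series) []string {
	out := make([]string, 0, len(series))
	for _, s := range series {
		out = append(out, s.Labels[0].Name+"="+s.Labels[0].Value)
	}
	return out
}

func main() {
	// As written, "job..." sorts before "zone...", so this order looks valid.
	series := []Series{
		{Labels: []Label{{"job", "api"}, {"zone", "eu"}}},
		{Labels: []Label{{"zone", "eu"}, {"__name__", "up"}}}, // labels out of order
	}
	repairOrder(series)
	// After the repair the second series starts with "__name__" and moves first.
	fmt.Println(firstLabels(series))
}
```

Holding every series in memory before sorting is what makes this approach memory hungry, which matches the trade-off noted above.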
Force-pushed from cb51d71 to f09d3f5.
Latest patches correct the There seem to be other issues, the code to upload to a backup bucket seems broken. is asked to parse strings that look like |
The directory name must be the block ID name exactly to verify. A temp directory or random name will not work here.
Ready for review. This correctly identifies the out of order labels, repairs the TSDB block, and does the safe-delete operation correctly.
Pointer/reference logic error was eliminating all chunks for a series in a given TSDB block that wasn't the first chunk. Chunks are now referenced correctly via pointers.
The repaired TSDB blocks didn't seem to match the originals. They didn't have the correct number of samples and chunks. Digging through the ignore chunks code I found some pointer referencing issues that were causing the repair process to compare the same chunk to itself and to ignore it as a duplicate.
Could you add tests? It's a bit hard to understand what kind of impact this will have.
@@ -559,9 +559,9 @@ func sanitizeChunkSequence(chks []chunks.Meta, mint int64, maxt int64, ignoreChk
 	var last *chunks.Meta

 OUTER:
-	for _, c := range chks {
+	for i := range chks {
why the change?
Very subtle. We remember a pointer to the chunk `c` from the last iteration so we can compare it against the chunk in the current iteration. However, when we use `for index, value := range slice`, the `value` is not a pointer into the slice. In fact, it is a new variable that the current item of the slice is copied into. That means our pointer-based comparisons are broken: they always compare the current chunk to itself, because the address of the variable `c` doesn't change throughout the loop.
Using just a slice index here allows us to correctly store a pointer to the item of the slice from the last iteration and compare that to the chunk in the current iteration. Otherwise, this code was removing all chunks in the series other than the first one.
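A minimal sketch of the difference, assuming a hypothetical `Meta` type in place of `chunks.Meta`. It checks whether the pointers collected in a loop actually point at elements of the slice. Note that since Go 1.22 each iteration gets its own copy of the range variable, so the address does change between iterations there, but it is still a copy and never an element of the slice.

```go
package main

import "fmt"

// Meta is a hypothetical stand-in for chunks.Meta with only the time bounds.
type Meta struct{ MinTime, MaxTime int64 }

// byValue takes the address of the range variable, which holds a copy
// of the current element rather than the element itself.
func byValue(chks []Meta) []*Meta {
	var ptrs []*Meta
	for _, c := range chks {
		ptrs = append(ptrs, &c)
	}
	return ptrs
}

// byIndex takes the address of the slice element itself.
func byIndex(chks []Meta) []*Meta {
	var ptrs []*Meta
	for i := range chks {
		ptrs = append(ptrs, &chks[i])
	}
	return ptrs
}

// countAliases reports how many pointers in ptrs point at an element of chks.
func countAliases(chks []Meta, ptrs []*Meta) int {
	n := 0
	for _, p := range ptrs {
		for i := range chks {
			if p == &chks[i] {
				n++
			}
		}
	}
	return n
}

func main() {
	chks := []Meta{{0, 10}, {10, 20}, {20, 30}}
	// No pointer taken from the range variable refers to the slice,
	// while every pointer taken by index does.
	fmt.Println(countAliases(chks, byValue(chks)), countAliases(chks, byIndex(chks)))
}
```

Storing `&chks[i]` as the "last" chunk therefore keeps a stable reference to the previous element, which is what a pointer-based comparison between consecutive chunks needs.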
Let's document this :)
Where is the right place to do so? Glad to do it.
Added comments in the code. If that's not the best place, let me know.
amazing, nice catch!
More like, why did the repair just lose all the data in my blocks?
Very nice catch, the comment makes it clear when we read this in 3 months again :) 👍
cmd/thanos/bucket.go
Outdated
@@ -85,10 +85,11 @@ func registerBucketVerify(m map[string]setupFunc, root *kingpin.CmdClause, name
 	var backupBkt objstore.Bucket
 	if len(backupconfContentYaml) == 0 {
 		if *repair {
-			return errors.Wrap(err, "repair is specified, so backup client is required")
+			return errors.Errorf("repair is specified, so backup client is required")
nit: some linters throw errors on things like this; I prefer `errors.New` here.
Fixed. Thanks.
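A minimal sketch of the nit, assuming the standard library's `errors` and `fmt` packages as stand-ins for the `github.com/pkg/errors` calls used in the PR. The helper names `requireBackup` and `nonSequential` are hypothetical. The idea is only that a constant message needs no formatting function, while a message with values does.

```go
package main

import (
	"errors"
	"fmt"
)

// requireBackup is a hypothetical helper mirroring the check in the diff:
// repairing without a backup bucket is rejected.
func requireBackup(repair, backupConfigured bool) error {
	if repair && !backupConfigured {
		// Constant message with no format verbs: errors.New is enough.
		return errors.New("repair is specified, so backup client is required")
	}
	return nil
}

// nonSequential builds a message from values, so a formatting call is justified.
func nonSequential(lastMin, lastMax, currMin, currMax int64) error {
	return fmt.Errorf("non-sequential chunks not equal: [%d, %d] and [%d, %d]",
		lastMin, lastMax, currMin, currMax)
}

func main() {
	fmt.Println(requireBackup(true, false))
	fmt.Println(nonSequential(1, 2, 3, 4))
}
```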
Some linters catch errors.Errorf() as it's not really part of the errors package.
We're comparing items by pointers. Using Go's range variables is misleading here, and we need not fall into the same trap.
CircleCI is failing on
|
@jjneely I've rerun CI and it passes. We are planning to get rid of gossip soon, so those flaky tests will go away.
Pretty dense PR, but looks good after what's a bit of a nit, but we don't want to duplicate.
pkg/block/index.go
Outdated
	id := all.At()

	if err := indexr.Series(id, &lset, &chks); err != nil {
		return err
	}
	// Make sure labels are in sorted order
	sort.Slice(lset, func(i, j int) bool {
		return lset[i].Name < lset[j].Name
labels.Labels already implements the sort.Interface https://github.com/prometheus/tsdb/blob/4b3a5ac5d36e5262d2656c8d149e137c2d1fab12/labels/labels.go#L39-L41
Excellent. Thanks for that. I've updated the code to use sort.Sort()
This prevents us from having to re-implement label sorting.
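A minimal sketch of the suggestion, using hypothetical local `Label` and `Labels` types rather than the real `labels.Labels`. The assumption is only that the slice type implements `sort.Interface` by comparing label names, so callers can use `sort.Sort` instead of passing their own comparison closure to `sort.Slice`.

```go
package main

import (
	"fmt"
	"sort"
)

// Label and Labels are hypothetical stand-ins for the TSDB label types.
type Label struct{ Name, Value string }

// Labels implements sort.Interface, ordering entries by label name.
type Labels []Label

func (ls Labels) Len() int           { return len(ls) }
func (ls Labels) Swap(i, j int)      { ls[i], ls[j] = ls[j], ls[i] }
func (ls Labels) Less(i, j int) bool { return ls[i].Name < ls[j].Name }

// sortedNames sorts the label set in place and returns the names in order.
func sortedNames(ls Labels) []string {
	sort.Sort(ls) // no closure needed: the type carries its own ordering
	names := make([]string, 0, len(ls))
	for _, l := range ls {
		names = append(names, l.Name)
	}
	return names
}

func main() {
	lset := Labels{{"job", "api"}, {"__name__", "up"}, {"instance", "a"}}
	fmt.Println(sortedNames(lset))
}
```

Keeping the ordering on the type means every caller sorts labels the same way, which is the duplication the review comment wanted to avoid.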
lgtm 👍
Awesome, thanks!
I think I like it but some small minor issues.
We are releasing rc.0 in a couple of minutes, but don't worry, this should be fine to get into 0.4.0. Good work!
Thanks!
		}
	} else {
		backupBkt, err = client.NewBucket(logger, backupconfContentYaml, reg, name)
		// nil Prometheus registerer: don't create conflicting metrics
This is not a good solution. It's essentially as easy as `prometheus.WrapRegistererWithPrefix("backup_..., reg)` (:
But also, not sure if it matters as it is only batch jobs, no one looks at metrics ;p
Yeah, otherwise we register the same metrics twice.
No, you don't: It's essentially as easy as `prometheus.WrapRegistererWithPrefix("backup_..., reg)` (:
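The idea behind the prefix suggestion can be sketched with a toy registry. This is not the Prometheus client API: `registerer`, `registry`, `prefixed`, `newBucketClient`, the metric name, and the `"backup_"` prefix are all hypothetical and exist only to show why a second client collides on the same registry and why a name prefix avoids the collision without passing a nil registerer.

```go
package main

import (
	"errors"
	"fmt"
)

// registerer is a toy stand-in for a metrics registry: names must be unique.
type registerer interface {
	Register(name string) error
}

type registry struct{ names map[string]bool }

func newRegistry() *registry { return &registry{names: map[string]bool{}} }

// Register rejects a name that has already been registered.
func (r *registry) Register(name string) error {
	if r.names[name] {
		return errors.New("duplicate metric: " + name)
	}
	r.names[name] = true
	return nil
}

// prefixed forwards registrations to an inner registerer under a name prefix.
type prefixed struct {
	prefix string
	inner  registerer
}

func (p prefixed) Register(name string) error {
	return p.inner.Register(p.prefix + name)
}

// newBucketClient is hypothetical: it registers the one metric a client owns.
func newBucketClient(reg registerer) error {
	return reg.Register("objstore_operations_total")
}

func main() {
	reg := newRegistry()
	fmt.Println(newBucketClient(reg))                      // first client registers fine
	fmt.Println(newBucketClient(reg))                      // second client on the same registry collides
	fmt.Println(newBucketClient(prefixed{"backup_", reg})) // prefixed names do not collide
}
```

With the prefix, the backup client's metrics stay visible under their own names instead of being dropped.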
@@ -531,7 +531,7 @@ func IgnoreDuplicateOutsideChunk(_ int64, _ int64, last *chunks.Meta, curr *chun
 	// the current one.
 	if curr.MinTime != last.MinTime || curr.MaxTime != last.MaxTime {
 		return false, errors.Errorf("non-sequential chunks not equal: [%d, %d] and [%d, %d]",
-			last.MaxTime, last.MaxTime, curr.MinTime, curr.MaxTime)
+			last.MinTime, last.MaxTime, curr.MinTime, curr.MaxTime)
wow! that was super confusing indeed, thanks for spotting!
pkg/block/index.go
Outdated
@@ -559,9 +559,14 @@ func sanitizeChunkSequence(chks []chunks.Meta, mint int64, maxt int64, ignoreChk
 	var last *chunks.Meta

 OUTER:
-	for _, c := range chks {
+	// This compares the current chunk to the chunk from the last iteration
+	// by pointers. If we use "i, c := range cks" the variable c is a new
-	// by pointers. If we use "i, c := range cks" the variable c is a new
+	// by pointers. If we use "i, c := range chks" the variable c is a new
Fixed
pkg/block/index.go
Outdated
	// This compares the current chunk to the chunk from the last iteration
	// by pointers. If we use "i, c := range cks" the variable c is a new
	// variable who's address doesn't change through the entire loop.
	// The current element of the chks slice is copied into it. We must take
	// The current element of the chks slice is copied into it. We must take
	// The current element of the chks slice is copied into it. We must take
Fixed.
	}

	return repl, nil
}

type seriesRepair struct {
Why not just use the `series` type, or name it `series`?
pkg/block/index.go
Outdated
	id := all.At()

	if err := indexr.Series(id, &lset, &chks); err != nil {
		return err
	}
	// Make sure labels are in sorted order
Missing trailing period for comment.
Fixed.
pkg/block/index.go
Outdated
		return errors.Wrap(all.Err(), "iterate series")
	}

	// sort the series -- if labels moved around the ordering will be different
Let's always keep comments a full sentence.
-	// sort the series -- if labels moved around the ordering will be different
+	// Sort the series. If labels moved around the ordering will be different.
Fixed.
pkg/block/index.go
Outdated
		return labels.Compare(series[i].lset, series[j].lset) < 0
	})

	// build new TSDB block
wrong comment again (full sentence, please)
Fixed.
Retrying CI
The out-of-order label problem was detected and worked around in #953. This PR adds code to the repair function that should repair affected TSDB blocks.
Changes