Cannot access thanos metrics #688
Comments
Why wrong idea? (:
Well, compaction is done after some time, and it seems like that block was only partially uploaded. Something you can do is to move that block out of the bucket. Can you copy here its meta.json and an overview of the block's contents? Looking into the error message:
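For example, something along these lines can pull a block's meta.json and list its objects (a minimal sketch assuming boto3; the endpoint, bucket name and block ULID are placeholders, not values from this issue):

```python
# Sketch: fetch a block's meta.json and list its objects in S3.
import json
import boto3

s3 = boto3.client("s3", endpoint_url="https://radosgw.example.com")  # hypothetical endpoint
bucket = "thanos"                          # hypothetical bucket name
block = "01CXXXXXXXXXXXXXXXXXXXXXXXXX"     # replace with the failing block's ULID

# Print the block's meta.json
meta = s3.get_object(Bucket=bucket, Key=f"{block}/meta.json")
print(json.dumps(json.loads(meta["Body"].read()), indent=2))

# Overview of the block's layout (meta.json, index and chunks/ are expected)
resp = s3.list_objects_v2(Bucket=bucket, Prefix=f"{block}/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```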
Also please update to Thanos v0.2.0 or master, because there were fixes to partial upload issues.
Related to #377
I thought that maybe the temporary files that the compactor stores locally were important for the compaction process. I did not copy them to the new host, but I guess that this is not a problem, right?
Yes, the directory seems empty in S3, no chunks folder.
This is the content of the meta.json file:
So I've just moved the folder and now I can access old metrics again ;). However, the store still complains about another block (it's in my first post). I've checked it on S3 and it seems to have data, but the index file is missing:
and the meta.json:
And the compactor is complaining too:
Is it possible to re-create that index file, or should I get rid of the block?
Yes, that's one of my new year's wishes :) Thank you a lot for your great work and support! Roberto
Tmp files can be removed; they are leftovers from the compactor being restarted in the middle of some job. Please upgrade to v0.2.0, as some of those bugs are fixed. In terms of the index, you cannot recreate it, because essentially you lost all the label names and the locations of the encoded samples. I think you are hitting an old issue that is already fixed, where we were not correctly checking an error (#403), so please update (:
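If you decide to drop the block, a sketch like this would remove its objects from the bucket (again assuming boto3; endpoint, bucket name and ULID are placeholders, so double-check the prefix before deleting anything):

```python
# Sketch: delete all objects under an unrecoverable block's prefix.
import boto3

s3 = boto3.client("s3", endpoint_url="https://radosgw.example.com")  # hypothetical endpoint
bucket = "thanos"                       # hypothetical bucket name
block = "01CXXXXXXXXXXXXXXXXXXXXXXXXX"  # ULID of the block with the missing index

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=f"{block}/"):
    for obj in page.get("Contents", []):
        print("deleting", obj["Key"])
        s3.delete_object(Bucket=bucket, Key=obj["Key"])
```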
Deleted the block and now all seems fine. I will update to 0.2.0 ASAP. |
Versions
thanos, version 0.1.0 (branch: HEAD, revision: ebb58c2)
build user: root@393679b8c49c
build date: 20180918-17:02:30
go version: go1.10.3
prometheus, version 2.5.0 (branch: HEAD, revision: 67dc912ac8b24f94a1fc478f352d25179c94ab9b)
build user: root@578ab108d0b9
build date: 20181106-11:40:44
go version: go1.11.1
Our setup
1 x prometheus server + thanos sidecar (8c, 16G RAM)
2 x thanos queriers (4c, 8G RAM)
1 x thanos-store + thanos compactor (8c, 16G RAM)
backend: S3 ceph+radosgw
OS: CentOS 7.6
Issue
While debugging other issues, I've realized that lately I cannot access metrics older than 15d (the prometheus local retention), even though we have a few months (~140G) of metrics in S3 and it worked fine before. From the thanos dashboard I get the following message if I set a time range of 1w or older (the metrics are shown, but there is no data older than 15d):
The thanos store is looping with the following warning:
And the thanos compactor seems broken:
Possible cause
Lately we had problems with the compactor crashing due to OOM, and eventually I ran it from a different (bigger) machine, which was probably not a good idea :)
I've also upgraded to prometheus v2.5.0, but everything was working fine after that, so I don't think that is related. I don't have any specific configuration for thanos or prometheus, just default values.
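For reference, a quick way to spot incomplete blocks (missing index or chunks) in the bucket would be something like the sketch below (assuming boto3; the endpoint and bucket name are placeholders for our setup):

```python
# Sketch: list every block in the bucket and flag the ones missing
# meta.json, index or a chunks/ directory.
from collections import defaultdict
import boto3

s3 = boto3.client("s3", endpoint_url="https://radosgw.example.com")  # hypothetical endpoint
bucket = "thanos"                                                     # hypothetical bucket name

files_per_block = defaultdict(set)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        block, _, rest = obj["Key"].partition("/")
        files_per_block[block].add(rest)

for block, files in sorted(files_per_block.items()):
    has_index = "index" in files
    has_chunks = any(f.startswith("chunks/") for f in files)
    if not (has_index and has_chunks and "meta.json" in files):
        print(f"incomplete block {block}: {sorted(files)}")
```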
Any help/comments about how to unblock the situation? Thank you in advance!!
Roberto