Upgrade to Cortex 1.15.x fails #482
Can you share the error?
cortex-distributor-5478b6758f-cl457 1/1 Running 0 10m
cortex-distributor-5478b6758f-lgkjc 1/1 Running 0 10m
cortex-ingester-66488dd769-krjmz 1/1 Running 0 2d23h
cortex-ingester-66488dd769-twt2g 1/1 Running 0 2d23h
cortex-nginx-5cf69b45b7-qq9ph 1/1 Running 0 16d
cortex-nginx-5cf69b45b7-tqkzb 1/1 Running 0 16d
cortex-querier-57f85ccccb-v4lsv 1/1 Running 0 2d23h
cortex-query-frontend-6cc65c85c5-bwczv 1/1 Running 0 10m
cortex-query-frontend-6cc65c85c5-rg2xm 1/1 Running 0 10m
Failing pods after upgrade from 1.14.1 to 1.15.3:
cortex-alertmanager-8647857b79-v9qdm 0/1 Running 3 (87s ago) 6m29s
cortex-compactor-0 0/1 Running 0 6m21s
cortex-ingester-64d6b76cc8-bp5d5 0/1 Running 0 6m29s
cortex-querier-74f7f589-2f9js 0/1 Running 3 (87s ago) 6m28s
cortex-ruler-7746ccfd6f-xnjr7 0/1 Running 3 (85s ago) 6m27s
cortex-store-gateway-0 0/1 Running 0 6m23s
k logs -n cortex cortex-querier-74f7f589-2f9js
level=info ts=2023-09-28T10:30:59.065683442Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:30:59.066064447Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2023-09-28T10:30:59.067243862Z caller=memberlist_client.go:399 msg="Using memberlist cluster node name" name=cortex-querier-74f7f589-2f9js-36456d1f
level=info ts=2023-09-28T10:30:59.077885501Z caller=memberlist_client.go:575 msg="joined memberlist cluster" reached_nodes=4
level=info ts=2023-09-28T10:30:59.085044494Z caller=memberlist_client.go:536 msg="joined memberlist cluster" reached_nodes=4
ts=2023-09-28T10:31:00.178013763Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node 'cortex-querier-74f7f589-2f9js-808f43ae' from=100.77.7.114:7946"
ts=2023-09-28T10:31:02.179318929Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node 'cortex-querier-74f7f589-2f9js-808f43ae' from=100.77.7.50:7946"
ts=2023-09-28T10:31:02.179673135Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node 'cortex-querier-74f7f589-2f9js-808f43ae' from=100.77.7.42:7946"
ts=2023-09-28T10:31:02.179716035Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node 'cortex-querier-74f7f589-2f9js-808f43ae' from=100.77.7.109:7946"
ts=2023-09-28T10:31:02.179881038Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node cortex-querier-74f7f589-2f9js-808f43ae from=100.77.7.114:42498"
ts=2023-09-28T10:31:04.069733066Z caller=memberlist_logger.go:74 level=warn msg="Got ping for unexpected node 'cortex-querier-74f7f589-2f9js-808f43ae' from=100.77.7.109:7946"
k logs cortex-ruler-7746ccfd6f-xnjr7 -n cortex
level=info ts=2023-09-28T10:31:00.404790924Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:31:00.405091428Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
k logs -n cortex cortex-store-gateway-0
level=info ts=2023-09-28T10:22:49.386296971Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:22:49.386591376Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
k logs cortex-alertmanager-8647857b79-v9qdm -n cortex
level=info ts=2023-09-28T10:27:38.465985397Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:27:38.466976213Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
k logs -n cortex cortex-compactor-0
level=info ts=2023-09-28T10:22:48.991038111Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:22:48.991531119Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2023-09-28T10:22:48.992992641Z caller=module_service.go:64 msg=initialising module=server
level=info ts=2023-09-28T10:22:48.993143843Z caller=module_service.go:64 msg=initialising module=memberlist-kv
level=info ts=2023-09-28T10:22:48.993146343Z caller=module_service.go:64 msg=initialising module=runtime-config
level=info ts=2023-09-28T10:22:48.993390847Z caller=module_service.go:64 msg=initialising module=compactor
k logs -n cortex cortex-ingester-64d6b76cc8-bp5d5
level=info ts=2023-09-28T10:22:37.989674144Z caller=main.go:194 msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"
level=info ts=2023-09-28T10:22:37.989985548Z caller=server.go:323 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
There is not a single error in your logs.
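One quick way to check that claim against a pasted log dump is to count lines per level. This is only a minimal sketch, assuming the logs are logfmt-style `key=value` pairs like the ones pasted above; the helper names `parse_logfmt` and `count_levels` are made up for illustration, and lines without a `level` key (such as the `k logs` commands) are skipped:

```python
import shlex
from collections import Counter


def parse_logfmt(line):
    """Parse one logfmt-style line into a dict; tokens without '=' are ignored."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def count_levels(lines):
    """Count log lines per 'level' value, skipping lines that carry no level."""
    counts = Counter()
    for line in lines:
        try:
            fields = parse_logfmt(line)
        except ValueError:
            # Unbalanced quotes: not a well-formed logfmt line, so skip it.
            continue
        if "level" in fields:
            counts[fields["level"]] += 1
    return counts


# Three lines copied from the querier output in this thread.
sample = [
    'k logs -n cortex cortex-querier-74f7f589-2f9js',
    'level=info ts=2023-09-28T10:30:59.065683442Z caller=main.go:194 '
    'msg="Starting Cortex" version="(version=1.15.3, branch=HEAD, revision=21e8366)"',
    'ts=2023-09-28T10:31:00.178013763Z caller=memberlist_logger.go:74 level=warn '
    'msg="Got ping for unexpected node \'cortex-querier-74f7f589-2f9js-808f43ae\' '
    'from=100.77.7.114:7946"',
]
counts = count_levels(sample)
print(dict(counts))
```

The pasted logs only contain `level=info` and `level=warn` lines, so a count like this would show nothing at error level, which is consistent with the pods failing their probes without logging a failure.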
Yep - the readiness checks are not healthy (as below) so pods never get to a ready state and are eventually killed.
1s Warning Unhealthy pod/cortex-store-gateway-0 Startup probe failed: Get "http://100.77.7.153:8080/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
25s Warning Unhealthy pod/cortex-compactor-0 Startup probe failed: HTTP probe failed with statuscode: 503
NAME ENDPOINTS AGE
cortex-alertmanager 237d
cortex-compactor 237d
cortex-distributor 100.77.7.148:8080,100.77.7.154:8080 237d
cortex-distributor-headless 100.77.7.148:9095,100.77.7.154:9095 237d
cortex-ingester 100.77.7.42:8080,100.77.7.50:8080 237d
cortex-ingester-headless 100.77.7.42:9095,100.77.7.50:9095 237d
cortex-memberlist 100.77.7.148:7946,100.77.7.154:7946,100.77.7.42:7946 + 1 more... 237d
cortex-nginx 100.77.7.16:80,100.77.7.23:80 237d
cortex-querier 100.77.7.47:8080 237d
cortex-query-frontend 100.77.7.151:8080,100.77.7.155:8080 237d
cortex-query-frontend-headless 100.77.7.151:9095,100.77.7.155:9095 237d
cortex-ruler 237d
cortex-store-gateway 237d
cortex-store-gateway-headless 237d
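The services with an empty ENDPOINTS column in the listing above are the ones whose pods never became ready. A rough sketch for picking them out of pasted output, assuming whitespace-separated text where NAME comes first and AGE last; the function name `services_without_endpoints` is hypothetical, and the `<none>` case is included because that is how kubectl normally prints an empty column:

```python
def services_without_endpoints(listing):
    """Return service names whose ENDPOINTS column is empty.

    Assumes whitespace-separated `kubectl get endpoints` text output:
    NAME first, AGE last, anything in between is the ENDPOINTS column.
    """
    empty = []
    for line in listing.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "NAME":
            continue  # skip blank lines and the header row
        if len(tokens) == 2 or tokens[1] == "<none>":
            # Only NAME and AGE present (or an explicit <none>): no endpoints.
            empty.append(tokens[0])
    return empty


# A few rows copied from the listing in this thread.
listing = """\
NAME ENDPOINTS AGE
cortex-alertmanager 237d
cortex-compactor 237d
cortex-distributor 100.77.7.148:8080,100.77.7.154:8080 237d
cortex-memberlist 100.77.7.148:7946,100.77.7.154:7946,100.77.7.42:7946 + 1 more... 237d
cortex-querier 100.77.7.47:8080 237d
cortex-ruler 237d
cortex-store-gateway 237d
"""
missing = services_without_endpoints(listing)
print(missing)
```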
Can't reproduce. For me everything works fine. Try running
before upgrading/installing to get the latest memcached. I tested with:
Have you tried an upgrade from 1.14.1 to 1.15.3? Looking at the CHANGELOG (https://github.com/cortexproject/cortex/blob/master/CHANGELOG.md?plain=1), it doesn't appear that you need to do anything specific to upgrade on the application side. If you can confirm the upgrade works via the helm chart, then I'll close this issue and raise one against the Cortex application. Thanks.
Found the reason why my pods were not starting: cortexproject/cortex#5449. I'm using Azure and needed to set endpoint_suffix: blob.core.windows.net. Thanks for confirming the chart was okay.
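For anyone hitting the same thing, a small pre-upgrade check can flag Azure storage sections that do not set `endpoint_suffix`. This is only a sketch under assumptions: it expects the Cortex config already loaded into a plain dict, it treats any mapping stored under a key named `azure` as an Azure storage section, and the function name and the example values (account and container names) are hypothetical:

```python
def azure_sections_missing_suffix(config, path=()):
    """Yield dotted paths of 'azure' sections that lack 'endpoint_suffix'."""
    if not isinstance(config, dict):
        return
    for key, value in config.items():
        here = path + (key,)
        if key == "azure" and isinstance(value, dict):
            # An empty or absent endpoint_suffix is reported as missing.
            if not value.get("endpoint_suffix"):
                yield ".".join(here)
        else:
            yield from azure_sections_missing_suffix(value, here)


# Hypothetical config layout, for illustration only.
config = {
    "blocks_storage": {
        "backend": "azure",
        "azure": {"account_name": "example", "container_name": "blocks"},
    },
    "ruler_storage": {
        "backend": "azure",
        "azure": {
            "account_name": "example",
            "container_name": "rules",
            "endpoint_suffix": "blob.core.windows.net",
        },
    },
}
missing_suffix = list(azure_sections_missing_suffix(config))
print(missing_suffix)
```

Walking the whole dict rather than checking one fixed section keeps the sketch independent of which components use Azure in a given deployment.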
When is the helm chart going to support 1.15.x of Cortex? Thanks.
A quick test of upgrading the Cortex image from v1.14.1 to v1.15.x causes pods to crashloop and fail to start.