Workload reports “EstablishDisaggTask Failed4: Deadline Exceeded” when one TiKV node is killed #7691

Closed
Lily2025 opened this issue Jun 25, 2023 · 3 comments
Labels
affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. component/storage severity/major type/bug The issue is confirmed as a bug.

Comments

@Lily2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

1. Run the ch workload
2. Kill one TiKV node

2. What did you expect to see? (Required)

The workload reports no unexpected errors.

3. What did you see instead (Required)

The workload reports an error:
[2023-06-21 13:52:08] execute run failed, err execute query q11 failed Error 1105: other error for mpp stream: From MPP<query:<query_ts:1687355518255892155, local_query_id:246, server_id:2528259, start_ts:442330124976652289>,task_id:3>: Poco::Exception. Code: 1000, e.code() = 2, e.displayText() = Exception: EstablishDisaggTask Failed4: Deadline Exceeded, e.what() = Exception

4. What is your TiFlash version? (Required)

githash:5a36917b396e394ee551661eb5190ae3231854d6

@yibin87
Contributor

yibin87 commented Jul 3, 2023

It seems a gRPC call failed. Could you provide a reproduction environment or more TiFlash logs to help track down the issue? @Lily2025

@yibin87 yibin87 removed their assignment Jul 3, 2023
@yibin87
Contributor

yibin87 commented Jul 3, 2023

Adding some TiFlash logs here:
[2023/06/21 15:26:05.957 +00:00] [DEBUG] [StorageDisaggregated.cpp:179] ["batch cop tasks(nums: 1) build finish for tiflash_storage node"] [source="MPP<query:<query_ts:1687361165753836680, local_query_id:63669, server_id:417640, start_ts:442331605408678020>,task_id:2>"] [thread_id=379]
[2023/06/21 15:26:05.957 +00:00] [DEBUG] [StorageDisaggregated.cpp:179] ["batch cop tasks(nums: 1) build finish for tiflash_storage node"] [source="MPP<query:<query_ts:1687361165679618637, local_query_id:63664, server_id:417640, start_ts:442331605395570757>,task_id:3>"] [thread_id=127]
[2023/06/21 15:26:05.957 +00:00] [DEBUG] [StorageDisaggregated.cpp:179] ["batch cop tasks(nums: 1) build finish for tiflash_storage node"] [source="MPP<query:<query_ts:1687361165753836680, local_query_id:63669, server_id:417640, start_ts:442331605408678020>,task_id:3>"] [thread_id=305]
[2023/06/21 15:26:05.957 +00:00] [DEBUG] [StorageDisaggregated.cpp:179] ["batch cop tasks(nums: 1) build finish for tiflash_storage node"] [source="MPP<query:<query_ts:1687361165712230105, local_query_id:63666, server_id:417640, start_ts:442331605408677901>,task_id:3>"] [thread_id=335]
[2023/06/21 15:26:07.500 +00:00] [DEBUG] [<unknown>] ["got dead store: tc-tiflash-0.tc-tiflash-peer.endless-ha-test-tps-1744180-1-218.svc:3930"] [source=pingcap.ProbeState] [thread_id=5644]
[2023/06/21 15:26:10.502 +00:00] [DEBUG] [<unknown>] ["got dead store: tc-tiflash-0.tc-tiflash-peer.endless-ha-test-tps-1744180-1-218.svc:3930"] [source=pingcap.ProbeState] [thread_id=5644]
[2023/06/21 15:26:13.503 +00:00] [DEBUG] [<unknown>] ["got dead store: tc-tiflash-0.tc-tiflash-peer.endless-ha-test-tps-1744180-1-218.svc:3930"] [source=pingcap.ProbeState] [thread_id=5644]
[2023/06/21 15:26:14.465 +00:00] [ERROR] [<unknown>] ["EstablishDisaggTask Failed4: Deadline Exceeded"] [source=pingcap.tikv] [thread_id=479]
It seems the TiFlash node tried to send an EstablishDisaggTask request, but the request timed out. Besides, no "Handling EstablishDisaggTask request:" message was found in the TiFlash logs. I suspect the request was sent not to TiFlash but to the killed TiKV node instead.
I also think this issue should be labeled "storage".
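
The log check described above (dead-store probes, the failing request, and no "Handling EstablishDisaggTask request:" entry) could be scripted roughly as follows. This is a minimal sketch, not a TiFlash tool: the names `parse_entries` and `summarize` are hypothetical, and the regex assumes the bracketed layout seen in the excerpt above, with no escaped quotes inside the message field.

```python
import re

# Assumed layout of one entry: [timestamp] [LEVEL] [location] ["message"] ...
# Trailing [key=value] fields (source, thread_id) are ignored by this sketch.
ENTRY_RE = re.compile(
    r'\[(?P<ts>\d{4}/\d{2}/\d{2} [\d:.]+ [+-]\d{2}:\d{2})\] '
    r'\[(?P<level>[A-Z]+)\] '
    r'\[(?P<location>[^\]]*)\] '
    r'\["(?P<message>[^"]*)"\]'
)

DEAD_STORE_PREFIX = "got dead store: "


def parse_entries(text):
    """Return one dict (ts, level, location, message) per log entry in text."""
    return [m.groupdict() for m in ENTRY_RE.finditer(text)]


def summarize(text, request_name="EstablishDisaggTask"):
    """Collect the three signals used in the analysis above."""
    entries = parse_entries(text)
    # Store addresses that the probe reported as dead.
    dead_stores = sorted({
        e["message"][len(DEAD_STORE_PREFIX):]
        for e in entries
        if e["message"].startswith(DEAD_STORE_PREFIX)
    })
    # ERROR entries that mention the request.
    failures = [
        e for e in entries
        if e["level"] == "ERROR" and request_name in e["message"]
    ]
    # Whether any entry shows the request being handled on the receiving side.
    handled = any(
        e["message"].startswith(f"Handling {request_name} request:")
        for e in entries
    )
    return {"dead_stores": dead_stores, "failures": failures, "handled": handled}


# Two entries copied from the excerpt above, joined on one line as in the paste.
SAMPLE = (
    '[2023/06/21 15:26:13.503 +00:00] [DEBUG] [<unknown>] '
    '["got dead store: tc-tiflash-0.tc-tiflash-peer.endless-ha-test-tps-1744180-1-218.svc:3930"] '
    '[source=pingcap.ProbeState] [thread_id=5644] '
    '[2023/06/21 15:26:14.465 +00:00] [ERROR] [<unknown>] '
    '["EstablishDisaggTask Failed4: Deadline Exceeded"] '
    '[source=pingcap.tikv] [thread_id=479]'
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running it over the full logs of every TiFlash node, rather than this short excerpt, would be needed before concluding that the request never reached a TiFlash node.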

@JaySon-Huang
Contributor

We have added more detailed logging and some timeouts for this problem. We will dig deeper if a similar issue happens again.
