
FetchDisaggPages may wait forever when a network partition happens #8806

Closed
JinheLin opened this issue Feb 29, 2024 · 0 comments · Fixed by #8807
Labels
affects-7.5 (This bug affects the 7.5.x (LTS) versions.) · component/storage · severity/moderate · type/bug (The issue is confirmed as a bug.)

Comments

JinheLin (Contributor) commented Feb 29, 2024

Injected network partitions into the write nodes for 10 minutes, from 2024-02-28 20:44:53 to 2024-02-28 20:54:53.

The FetchDisaggPages RPC request waits until the network recovers.
(In the log below, although the page hit rate is 100%, we still need to fetch the data in the memtable from the WNs.)

[2024/02/28 20:44:54.006 +08:00] [DEBUG] [SegmentReadTask.cpp:355] ["Ready to fetch pages, seg_task=s6_t2737_345_2_15500217546706758 page_hit_rate=100.00% pages_not_in_cache=[]"] [source="MPP<gather_id:1, query_ts:1709124283066671989, local_query_id:2280, server_id:1542, start_ts:448036676047732823,task_id:3> store_id=6 keyspace=4294967295 table_id=2737 segment_id=345 epoch=2 delta_epoch=15500217546706758"] [thread_id=6]
[2024/02/28 20:56:21.157 +08:00] [ERROR] [SegmentReadTask.cpp:386] ["s6_t2737_345_2_15500217546706758: Code: 11004, e.displayText() = DB::Exception: Check snap != nullptr failed: Can not find disaggregated task, task_id=DisTaskId<MPP<gather_id:1, query_ts:1709124283066671989, local_query_id:2280, server_id:1542, start_ts:448036676047732823,task_id:3>,executor=TableFullScan_41> (from s6_t2737_345_2_15500217546706758), e.what() = DB::Exception... [source="MPP<gather_id:1, query_ts:1709124283066671989, local_query_id:2280, server_id:1542, start_ts:448036676047732823,task_id:3> store_id=6 keyspace=4294967295 table_id=2737 segment_id=345 epoch=2 delta_epoch=15500217546706758"] [thread_id=6]

From the gRPC documentation:

By default, gRPC does not set a deadline which means it is possible for a client to end up waiting for a response effectively forever. To avoid this you should always explicitly set a realistic deadline in your clients.
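Following that recommendation, the fix direction is to attach an explicit deadline to the client-side call. Below is a minimal, generic sketch of that pattern using the gRPC C++ API; the `Stub`, `Request`, `Response`, and `callWithDeadline` names are hypothetical placeholders for illustration, not the actual TiFlash code paths touched by #8807.

```cpp
#include <chrono>
#include <grpcpp/grpcpp.h>

// Generic sketch: `Stub`, `Request`, and `Response` stand in for the
// proto-generated client stub and message types of whatever RPC is being
// issued (for example, a disaggregated-page fetch); placeholders only.
template <typename Stub, typename Request, typename Response>
grpc::Status callWithDeadline(
    Stub & stub,
    grpc::Status (Stub::*rpc)(grpc::ClientContext *, const Request &, Response *),
    const Request & req,
    Response * resp,
    std::chrono::seconds timeout)
{
    grpc::ClientContext context;
    // Without an explicit deadline, gRPC may block indefinitely if the peer
    // becomes unreachable (e.g. during a network partition of a write node).
    context.set_deadline(std::chrono::system_clock::now() + timeout);

    grpc::Status status = (stub.*rpc)(&context, req, resp);
    if (status.error_code() == grpc::StatusCode::DEADLINE_EXCEEDED)
    {
        // The caller can now retry or fail the read instead of hanging forever.
    }
    return status;
}
```

With a deadline set, a partitioned write node makes the RPC fail with DEADLINE_EXCEEDED after the timeout, so the read side can retry or abort instead of waiting for the network to recover.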
