Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disagg: Set client RPC timeout for FetchDisaggPages #8807

Merged
merged 4 commits into from
Mar 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions dbms/src/Core/Defines.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ static constexpr UInt64 DEFAULT_DISAGG_TASK_BUILD_TIMEOUT_SEC = 60;
// It is now a short period to avoid long stale snapshots causing system
// instable.
static constexpr UInt64 DEFAULT_DISAGG_TASK_TIMEOUT_SEC = 5 * 60;
// Timeout for FetchDisaggPages in the TiFlash compute node.
static constexpr UInt64 DEFAULT_DISAGG_FETCH_PAGES_TIMEOUT_SEC = 30;

#define DEFAULT_DAG_RECORDS_PER_CHUNK 1024L
#define DEFAULT_BATCH_SEND_MIN_LIMIT (-1)
Expand Down
1 change: 1 addition & 0 deletions dbms/src/Interpreters/Settings.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ struct Settings
M(SettingUInt64, mpp_task_waiting_timeout, DEFAULT_MPP_TASK_WAITING_TIMEOUT, "mpp task max time that waiting first data block from source input stream.") \
M(SettingUInt64, disagg_build_task_timeout, DEFAULT_DISAGG_TASK_BUILD_TIMEOUT_SEC, "disagg task establish timeout, unit is second.") \
M(SettingUInt64, disagg_task_snapshot_timeout, DEFAULT_DISAGG_TASK_TIMEOUT_SEC, "disagg task snapshot max endurable time, unit is second.") \
M(SettingUInt64, disagg_fetch_pages_timeout, DEFAULT_DISAGG_FETCH_PAGES_TIMEOUT_SEC, "fetch disagg pages timeout for one segment, unit is second.") \
M(SettingInt64, safe_point_update_interval_seconds, 1, "The interval in seconds to update safe point from PD.") \
M(SettingUInt64, min_compress_block_size, DEFAULT_MIN_COMPRESS_BLOCK_SIZE, "The actual size of the block to compress, if the uncompressed data less than max_compress_block_size is no less than this value " \
"and no less than the volume of data for one mark.") \
Expand Down
2 changes: 2 additions & 0 deletions dbms/src/Storages/DeltaMerge/SegmentReadTask.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -551,6 +551,8 @@ void SegmentReadTask::doFetchPages(const disaggregated::FetchDisaggPagesRequest
cluster->rpc_client,
extra_remote_info->store_address);
grpc::ClientContext client_context;
// set timeout for the streaming call to avoid inf wait before `Finish()`
rpc.setClientContext(client_context, dm_context->global_context.getSettingsRef().disagg_fetch_pages_timeout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set a timeout for a streaming call, the timeout is to limit the whole process time (until stream_resp->Finish() is called) rather than a timeout for each message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. disagg_fetch_pages_timeout is timeout for the fetching pages of a segment.

JinheLin marked this conversation as resolved.
Show resolved Hide resolved
auto stream_resp = rpc.call(&client_context, request);
RUNTIME_CHECK(stream_resp != nullptr);
SCOPE_EXIT({
Expand Down