update insert pending segments logic to synchronous #6336
Conversation
1. MySQL's default transaction isolation is REPEATABLE_READ; treating it as READ_COMMITTED reduces insert ID conflicts. 2. Adding an index on 'dataSource used end' works well for most scenarios (fetching recent segments) and speeds up synchronously adding pending segments in the DB. 3. 'select and insert' does not need to be within a transaction.
Hi @FaxianZhao, thanks for the PR! It looks reasonable. Would you please fix the conflicts?
    skipSegmentLineageCheck
);
SegmentIdentifier identifier;
synchronized (lockResult.getTaskLock()) {
TaskLock is shared by tasks only if they have the same groupId (see TaskLockPosse). Maybe this works for the Kafka indexing service because all KafkaIndexTasks for the same dataSource have the same groupId, but it's not always true for other task types.

I think it makes sense to call toolbox.getIndexerMetadataStorageCoordinator().allocatePendingSegment() inside of TaskLockbox.doInCriticalSection(). It's designed to perform some action in a critical section, so that other tasks can't revoke the lock of the current task. Also, it guarantees that only one task can perform the action at the same time. Please check SegmentTransactionalInsertAction as an example. What do you think?
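For illustration, here is a minimal sketch of the critical-section pattern described above. The class and method names are simplified stand-ins, not Druid's actual TaskLockbox API: a lockbox-wide lock is held while the action runs, so no other task can revoke the current task's lock or allocate a segment concurrently.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Simplified stand-in for the critical-section idea discussed in this review.
class CriticalSectionSketch
{
  private final ReentrantLock giant = new ReentrantLock();

  <T> T doInCriticalSection(Supplier<T> action)
  {
    giant.lock();
    try {
      // e.g. the action could call allocatePendingSegment() against the metadata store.
      return action.get();
    }
    finally {
      giant.unlock();
    }
  }
}
```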
Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up inserting pending segments.
Hi @jihoonson, I fixed the conflicts. Should I rebase these 3 commits into one?
@FaxianZhao thanks for the fix! You don't have to rebase commits in most cases; once you've committed, it's fine to leave them as they are. The failing unit tests seem legit; please check them.
@jihoonson Sorry for that typo 'null'; I corrected it to '() -> null'.
LGTM. Thanks @FaxianZhao!
* 1. MySQL's default transaction isolation is REPEATABLE_READ; treating it as READ_COMMITTED reduces insert ID conflicts. 2. Adding an index on 'dataSource used end' works well for most scenarios (fetching recent segments) and speeds up synchronously adding pending segments in the DB. 3. 'select and insert' does not need to be within a transaction.
* Use TaskLockbox.doInCriticalSection instead of synchronized syntax to speed up inserting pending segments.
* fix typo for NullPointerException
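As a rough illustration of point 1, lowering the isolation level on the metadata-store connection can be done with plain JDBC. The URL and credentials below are placeholders, and Druid itself manages connections through its own SQL layer; this is only a sketch of the setting being discussed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class IsolationLevelSketch
{
  public static void main(String[] args) throws SQLException
  {
    // Placeholder URL/credentials; MySQL's default isolation level is REPEATABLE_READ.
    try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/druid", "user", "pass")) {
      // With READ_COMMITTED, the "select existing pending segments, then insert" sequence
      // can see rows committed by other tasks, which reduces insert id conflicts.
      conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
      // ... run the pending-segment select/insert statements here ...
    }
  }
}
```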
@FaxianZhao @jihoonson I think this PR should be reverted. After this PR was merged, it takes a very long time to open the overlord console. In my scenario, there are 1200 peon tasks and we use MySQL as the metastore. In this PR, pending segment allocation is performed inside TaskLockbox.doInCriticalSection.
@jihoonson @FaxianZhao I think I know why the overlord console loads slowly after this PR was merged: every index task asks the overlord to perform a segment allocate action, and each of those tasks occupies an HTTP handler thread that blocks in TaskLockbox.doInCriticalSection.
@QiuMM thank you for reporting! I think this PR is still useful because we can avoid the slow pending segment allocation problem caused by the race in allocatePendingSegment().
BTW, I guess your pendingSegments table might be large.
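For context, here is a sketch of the kind of check-then-insert race being referred to. The names are hypothetical, not Druid's actual code: two tasks both read the current maximum pending-segment id for an interval and then both try to insert with the same id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: without a critical section (or a retry on a unique-key violation),
// two tasks can interleave between the read and the write and claim the same id.
class AllocationRaceSketch
{
  private final Map<String, Integer> maxIdByInterval = new ConcurrentHashMap<>();

  int allocateWithoutSynchronization(String intervalKey)
  {
    // Read: both tasks may observe the same current max here...
    int next = maxIdByInterval.getOrDefault(intervalKey, -1) + 1;
    // ...write: and then both try to claim the same `next`, which shows up as an
    // insert id conflict in the metadata store.
    maxIdByInterval.put(intervalKey, next);
    return next;
  }
}
```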
@jihoonson I have tried something like the code below:

```java
/**
 * Perform the given action with a guarantee that we synchronize actions on the same (dataSource, interval) pair.
 * This method just checks that the (dataSource, interval) pair is valid; it does not care about locks for the given
 * task and interval, so you should take care of the locks yourself.
 *
 * @param task     task performing a {@link SyncDataSourceAndIntervalAction}
 * @param interval interval
 * @param action   action to be performed
 */
public <T> T doInSyncDataSourceAndInterval(
    Task task,
    Interval interval,
    SyncDataSourceAndIntervalAction<T> action
) throws Exception
{
  Pair<String, Interval> pair = dataSourceIntervals.getOrDefault(task.getDataSource(), new HashMap<>()).get(interval);
  if (pair != null) {
    synchronized (pair) {
      return action.perform();
    }
  } else {
    log.warn("Invalid dataSource and interval pair: (%s, %s), can not sync on it!", task.getDataSource(), interval);
    return null;
  }
}
```

However, the overlord console still loads very slowly after I restart the overlord, since all newly created tasks post a segment allocate action. But about one hour later, I didn't see any delay when opening the console, except at the moment when all tasks start to allocate new segments. I think it's still terrible, because every time I restart the overlord I want to open the console to see the overlord's status, and it's hard to bear if it loads slowly. Maybe we should not restart all tasks at the same moment; how about adding some delay? Besides, after I used such a granular lock, some errors occurred:
Such errors occur after I restart the overlord, and all newly created tasks stay pending for some minutes, which is weird and confusing (I have never modified
I have already set
@QiuMM thank you for trying it out! I understand how it feels when the UI is very slow. We need to fix this. It sounds like too many concurrent segmentAllocateActions are causing this problem, but I don't understand why it's still slow with the granular locking. The
Before this PR,

Regarding the exception of
@jihoonson Not only replica tasks; all tasks created by a supervisor share the same groupId.
I'm not very sure about this, but I have also observed such exceptions even when I didn't use
I see. Since lock contention can happen between tasks if they're writing into the same interval of the same dataSource, 600 tasks running at the same time can cause this issue. If this is the case, before this PR, I think that many tasks could cause frequent pending segment allocation failures due to the allowed race in allocatePendingSegment(). Maybe we need to use separate HTTP thread pools for the API and the UI. Also, maybe #6348 and #6356 would help, though you said your pendingSegments table is small.
Yeah, I'm going to look into this problem. I'll let you know if I find something. And I think I know why the exception of
This is a good solution, but it doesn't seem easy to do with the Jetty server.
Thanks. I've checked some Jetty configurations and it looks like Jetty's QoS filter (https://www.eclipse.org/jetty/documentation/9.4.x/qos-filter.html) might help. We're already using this filter for the task's chat handler and the lookups. I think we can use it for the overlord too. The max number of requests for internal API calls would be limited by some new configurations like
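A minimal embedded-Jetty sketch of the QoS-filter idea follows. The port, path, and limit are illustrative, not Druid's actual wiring: requests beyond the limit are suspended instead of each holding an HTTP handler thread.

```java
import java.util.EnumSet;
import javax.servlet.DispatcherType;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlets.QoSFilter;

class QosFilterSketch
{
  public static void main(String[] args) throws Exception
  {
    Server server = new Server(8090);
    ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);

    // Limit concurrent in-flight requests on the (illustrative) task-action path so that
    // a flood of segment-allocate calls cannot occupy every handler thread.
    FilterHolder qos = new FilterHolder(QoSFilter.class);
    qos.setInitParameter("maxRequests", "20");
    context.addFilter(qos, "/druid/indexer/v1/action/*", EnumSet.of(DispatcherType.REQUEST));

    server.setHandler(context);
    server.start();
    server.join();
  }
}
```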
Sounds good, I'll give it a try.
@QiuMM were you able to find anything interesting? I'm wondering if we need to adjust this PR or make special callouts in the release notes.
@jihoonson @jon-wei oh, I almost forgot this thing.
Yeah, I found many retries. And I tried the QoS filter; it's not a good solution if there are thousands of tasks. Finally, I solved this problem by using a more granular lock, i.e. using the (dataSource, interval) pair.
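For illustration, here is a sketch of a more granular, per-(dataSource, interval) lock. The map and method names are hypothetical, not the exact code used: allocations for different intervals no longer contend on a single lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical sketch: one lock object per (dataSource, interval) key, so only tasks
// writing into the same interval of the same dataSource serialize behind each other.
class GranularLockSketch
{
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

  <T> T doWithIntervalLock(String dataSource, String interval, Supplier<T> action)
  {
    Object lock = locks.computeIfAbsent(dataSource + "/" + interval, k -> new Object());
    synchronized (lock) {
      return action.get();
    }
  }
}
```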
I deleted the branch for #6283 by mistake, so I can't modify that pull request. There were some comments on it.
Adding an index on 'dataSource used start' is a good idea, but I didn't include it in this patch.