fix copyRowsFunc hangs bug #611

Merged
merged 2 commits into from
Sep 17, 2018

Conversation

wfxiang08
Related issue: #610

Description

With this change, once copyRowsFunc finishes copying rows, the hasFurtherRange status is kept across iterations of copyRowsFunc. When hasFurtherRange is false, CalculateNextIterationRangeEndValues is never called again.

This matters because, by the time hasFurtherRange is false, the original table might also be locked, and in that case a call to CalculateNextIterationRangeEndValues would hang forever.
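
To illustrate the idea in the description, here is a minimal, self-contained sketch. It is not gh-ost's code: `rangeSource`, `next`, and `runIterations` are hypothetical names, and `next` only stands in for the role CalculateNextIterationRangeEndValues plays. The sketch assumes the flag lives outside the copy function so that its value survives between calls.

```go
package main

import "fmt"

// rangeSource is a hypothetical stand-in for the range calculation.
// It is not part of gh-ost's API.
type rangeSource struct {
	remaining int // chunks left to copy
	calls     int // how many times next() was invoked
}

// next reports whether another chunk exists. In this sketch it plays the
// role that CalculateNextIterationRangeEndValues plays in the description.
func (r *rangeSource) next() bool {
	r.calls++
	if r.remaining == 0 {
		return false
	}
	r.remaining--
	return true
}

// runIterations invokes a copy function `iterations` times. hasFurtherRange
// is declared outside the closure, so it is kept from one call to the next.
func runIterations(src *rangeSource, iterations int) (copied int) {
	hasFurtherRange := true
	copyRowsFunc := func() {
		if !hasFurtherRange {
			return // copy is complete: never query the range again
		}
		hasFurtherRange = src.next()
		if hasFurtherRange {
			copied++
		}
	}
	for i := 0; i < iterations; i++ {
		copyRowsFunc()
	}
	return copied
}

func main() {
	src := &rangeSource{remaining: 2}
	copied := runIterations(src, 10)
	fmt.Println(copied, src.calls)
}
```

In this model the range calculation runs only until it first reports that no range is left; the later invocations return early instead of touching the (possibly locked) table.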

Contributor

@shlomi-noach shlomi-noach left a comment

Thank you for this observation! I believe you are correct.

However, I also think the solution here is mistaken. See my comment inline.

@@ -1135,6 +1140,9 @@ func (this *Migrator) iterateChunks() error {
		}
		// Enqueue copy operation; to be executed by executeWriteFuncs()
		this.copyRowsQueue <- copyRowsFunc
		if !hasFurtherRange {
			break
		}
Contributor

Since this code is outside copyRowsFunc, it only skips the enqueuing of new copyRowsFunc functions; there could still be multiple copyRowsFunc instances inside this.copyRowsQueue that haven't been evaluated yet.

This is why I believe this PR mitigates the potential for deadlock but does not actually solve it.

I think we should check hasFurtherRange from within copyRowsFunc to solve it completely.
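
The difference between the two placements can be sketched with a toy queue. This is an illustration under stated assumptions, not the actual patch: `drainQueue`, `nextRange`, and the `checkInside` switch are hypothetical, and `nextRange` models a table that has no rows left. Several copy functions are enqueued before any of them runs, which is the situation the review comment describes.

```go
package main

import "fmt"

// drainQueue pre-fills a buffered channel with `queued` copy functions, runs
// them in order, and returns how many times the range calculation was called.
// All names here are hypothetical; this only models the queueing behaviour.
func drainQueue(queued int, checkInside bool) int {
	calls := 0
	// nextRange stands in for the range calculation. The table is assumed
	// to be fully copied already, so it always reports "no further range".
	nextRange := func() bool {
		calls++
		return false
	}

	hasFurtherRange := true
	copyRowsFunc := func() {
		// With checkInside, the guard lives inside the copy function, so
		// functions that are already queued also respect it.
		if checkInside && !hasFurtherRange {
			return
		}
		hasFurtherRange = nextRange()
	}

	queue := make(chan func(), queued)
	for i := 0; i < queued; i++ {
		queue <- copyRowsFunc // enqueued before any of them has run
	}
	close(queue)
	for f := range queue {
		f()
	}
	return calls
}

func main() {
	fmt.Println(drainQueue(3, false)) // guard only outside: every queued call hits the table
	fmt.Println(drainQueue(3, true))  // guard inside: only the first call hits the table
}
```

A guard in the enqueuing loop can stop new functions from being added, but it cannot affect the ones already waiting in the queue; only a check inside the function itself covers those.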

Author

@shlomi-noach Yes, you are right. I will make some modifications.

@wfxiang08
Author

@shlomi-noach I have made some modifications. Could you please review them?

Contributor

@shlomi-noach shlomi-noach left a comment

This looks good. Let me send it to testing.

@shlomi-noach shlomi-noach temporarily deployed to production/mysql_role=ghost_testing July 26, 2018 05:46 Inactive
@shlomi-noach shlomi-noach self-assigned this Sep 17, 2018
@shlomi-noach
Contributor

Apologies for the time it took, I was sidetracked. On the flip side, this PR has been actively tested for some 50 days, so it's been well hammered. Merging!
