Handle deadlocks #107

Open
avit opened this issue Apr 29, 2015 · 4 comments

Comments

@avit

avit commented Apr 29, 2015

While copying data to the table, I experienced a deadlock that aborted the migration halfway through.

I'm guessing deadlocks happen if the INSERT statement and a trigger both try to touch the same row. Some possible solutions:

  • Add an option for deadlock_retries: 1 to handle it when the query gets dropped?
  • Use a different transaction isolation level for copying?
@avit
Author

avit commented Apr 30, 2015

I'm trying the following in my migration:

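  require "delegate"

  # Swap in the retrying connection for the duration of the block,
  # restoring the original connection even if the block raises.
  def using_retriable_connection
    Lhm.setup(RetriableConnection.new(connection))
    yield
  ensure
    Lhm.setup(connection)
  end

  class RetriableConnection < SimpleDelegator
    DEADLOCK_MESSAGE = "Deadlock found when trying to get lock"

    def update(*)
      super
    rescue ::ActiveRecord::StatementInvalid => e
      retries ||= 0
      # Give up after 3 retries, or immediately for any non-deadlock error.
      # String#include? is used because `!~` needs a Regexp, not a String.
      raise e if (retries += 1) > 3 || !e.message.include?(DEADLOCK_MESSAGE)
      sleep 0.1 * (retries**2) # quadratic backoff: about 0.1s, 0.4s, 0.9s
      retry
    end
  end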

An approach like this could work for wrapping each stride in the chunker with a retry. Thoughts?
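To make the per-stride idea concrete, here is a minimal sketch that does not depend on Lhm or ActiveRecord. The names `with_deadlock_retry` and `copy_in_strides` are hypothetical and not part of Lhm; the block passed to `copy_in_strides` stands in for whatever statement the chunker runs for one stride. It assumes the error message contains the deadlock text quoted above.

```ruby
# Hypothetical helper (not part of Lhm): re-runs the block when it raises an
# error whose message contains the MySQL deadlock text.
DEADLOCK_TEXT = "Deadlock found when trying to get lock"

def with_deadlock_retry(max_retries: 3, base_delay: 0.1)
  attempts = 0
  begin
    yield
  rescue StandardError => e
    attempts += 1
    # Re-raise once retries are exhausted, or right away for other errors.
    raise if attempts > max_retries || !e.message.include?(DEADLOCK_TEXT)
    sleep(base_delay * attempts**2) # quadratic backoff between attempts
    retry
  end
end

# Hypothetical stride loop: yields inclusive [low, high] id ranges and wraps
# each one in its own retry, so a deadlock only repeats the failed stride.
def copy_in_strides(first_id, last_id, stride)
  first_id.step(last_id, stride) do |low|
    high = [low + stride - 1, last_id].min
    with_deadlock_retry { yield(low, high) }
  end
end
```

Retrying per stride rather than per migration means a deadlock costs at most one stride of repeated work, provided each stride's statement is safe to run again.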

@aswinanand

Hi,

I have a patch for the deadlock issue here: https://github.com/aswinanand/lhm/commit/3c215e9e430a5a7a7e318f2eba9bd3c43a3d2c50

How can I add unit tests for this?

@arthurnn
Contributor

arthurnn commented Jun 9, 2016

from pt-osc:

The tool retries each operation if these errors occur:

Lock wait timeout (innodb_lock_wait_timeout and lock_wait_timeout)
Deadlock found
Query is killed (KILL QUERY <thread_id>)
Connection is killed (KILL CONNECTION <thread_id>)
Lost connection to MySQL

(https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html)

So, we might wanna retry on all those messages.
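A rough sketch of what matching on all of those could look like. `RETRIABLE_PATTERNS` and `retriable_error?` are hypothetical names, not part of Lhm. The patterns are the standard MySQL server and client error texts for these conditions; a killed connection is assumed to surface on the client as one of the last two.

```ruby
# Hypothetical list (not part of Lhm) of MySQL error texts covering the
# pt-osc retry conditions quoted above.
RETRIABLE_PATTERNS = [
  /Lock wait timeout exceeded/,             # lock wait timeout
  /Deadlock found when trying to get lock/, # deadlock
  /Query execution was interrupted/,        # KILL QUERY
  /Lost connection to MySQL server/,        # lost or killed connection
  /MySQL server has gone away/              # connection already closed
].freeze

# Returns true when the error message matches any retriable pattern.
def retriable_error?(message)
  RETRIABLE_PATTERNS.any? { |pattern| pattern.match?(message) }
end
```

Note that the connection-related cases would probably also need a reconnect before retrying, which this sketch does not cover.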

@vnazarenko

Same for me. lhm is really great, but deadlocks when the app tries to insert or update data in the table kill everything. Do any workarounds exist?

mateomurphy added a commit to mateomurphy/lhm that referenced this issue Dec 10, 2018
willbarrett pushed a commit to entelo/lhm that referenced this issue Apr 29, 2019