Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ideas #95

Closed
5 tasks
ankane opened this issue Nov 11, 2019 · 21 comments
Closed
5 tasks

Ideas #95

ankane opened this issue Nov 11, 2019 · 21 comments

Comments

@ankane
Copy link
Owner

ankane commented Nov 11, 2019

I'm still on the fence about many of them. If anyone would like to discuss any of the ideas below, please create a new issue.

Likely ideas

  • Show friendly message for lock timeout - lock_timeout_message branch
  • Add support for multiple databases with target_version and other options - {primary: 10, other: 12}
  • Change default lock timeout to 5 seconds in generator

Other ideas

  • Option to retry lock timeouts - lock_timeout_retries branch
  • Condense add_foreign_key and change_column_null suggestions into a single migration (downside: failed migration can result in db state that needs manually reconciled) - add_foreign_key may be safe to do since it can detect locks

Decided against

  • Helper methods - better as a separate gem - see Helper Methods #111
  • Don't require safety_assured for remove_column if ignored_columns detected (ignored_columns will likely be removed after it runs in production, so this would cause errors for other devs)
  • Enforce timestamptz (opt-in) - apps can change globally in initializer if needed
  • Deprecate safety_assured for remove_index!, execute!, etc (no good reason to change for now)
  • rails g commands to generate safe migrations (like gindex gem)
  • Enforce reversible migrations (opt-in)
  • Enforce reversible operations (opt-in) - remove_column with type, change_column_default with to/from
  • Include null constraints in schema.rb - dump_null_constraints branch (or may be better to use execute command) - this shouldn't be needed in Postgres 12+, as well as Rails 6.1+
@mvz
Copy link

mvz commented Nov 12, 2019

I'd be very interested in some helper method for renaming tables.

@jjb
Copy link

jjb commented Apr 6, 2020

Gitlab has these migration helpers for their own internal use. They automagically set up triggers and such in order to do things like seamless column renaming.

https://gitlab.com/gitlab-org/gitlab/issues/34292

@bheemreddy181-zz
Copy link

Can we add a retry mechanism if migrations fail due to a timeout?

@ankane
Copy link
Owner Author

ankane commented Oct 29, 2020

Hey @bheemreddy181, it's an interesting idea.

For statements run outside a transaction, retrying the statement should work:

class AddSomeIndexToUsers < ActiveRecord::Migration[6.0]
  disable_ddl_transaction!

  def change
    add_index :users, :some_column, algorithm: :concurrently
  end
end

For statements run inside a transaction, the entire transaction needs to be re-run. For migrations that only use Active Record's DDL transaction and have no side effects, retrying the entire migration should work:

class SimpleMigration < ActiveRecord::Migration[6.0]
  def change
    add_column :users, :some_column, :string
  end
end

However, retrying more complex migrations can cause issues or errors (which may be an acceptable trade-off).

class ComplexMigration < ActiveRecord::Migration[6.0]
  def change
    add_column :users, :some_column, :string
    commit_db_transaction # we can avoid retry if this method is called
    begin_db_transaction
    add_column :users, :other_column, :string
  end
end

Or

class ComplexMigration2 < ActiveRecord::Migration[6.0]
  disable_ddl_transaction!

  def change
    add_column :users, :some_column, :string

    connection.transaction do
      add_column :users, :other_column, :string
    end
  end
end

I've added the outside a transaction case for Postgres on the lock_timeout_retries branch if you want to try it. Enable retries with:

StrongMigrations.lock_timeout_retries = 2

@bheemreddy181-zz
Copy link

@ankane I will give a try and will let you know thanks for this feature, appreciate your work. Your contribution matters a lot.

@bheemreddy181-zz
Copy link

@ankane I started testing the retry feature started seeing the below error, am I missing something here

Screen Shot 2020-10-30 at 6 32 09 PM

@bheemreddy181-zz
Copy link

bheemreddy181-zz commented Nov 2, 2020

@ankane Can you take a quick look whenever you get some time on the above issue, let me know if I am missing anything here?

@bheemreddy181-zz
Copy link

@ankane Can you take a quick look whenever you get some time on the above issue, let me know if I am missing anything here?

Can I get some help on this testing @ankane

@timkoopmans
Copy link

timkoopmans commented Nov 25, 2020

I also get this undef error on strong_migrations_checker ... it was working and now it's not, feels intermittetnt (load order?)

StandardError: An error has occurred, this and all later migrations canceled:

undefined method `strong_migrations_checker' for #<ActiveRecord::MigrationProxy:0x00007fa24ed45038>
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/bundler/gems/strong_migrations-716b64e73ceb/lib/strong_migrations/migrator.rb:6:in `ddl_transaction'
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/gems/activerecord-5.2.4.4/lib/active_record/migration.rb:1291:in `execute_migration_in_transaction'
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/gems/activerecord-5.2.4.4/lib/active_record/migration.rb:1263:in `block in migrate_without_lock'

@ankane
Copy link
Owner Author

ankane commented Dec 2, 2020

@bheemreddy181 @timkoopmans Pushed a fix for that error, but running into some weird issues when retrying entire migrations (DB timeouts aren't being applied and the Ruby process hangs), so I wouldn't use the branch in production right now.

@bheemreddy181-zz
Copy link

bheemreddy181-zz commented Dec 17, 2020

@ankane i have seen the same thing while testing this feature , ruby process hangs until i remove my exclusive lock which i am holding from other terminal

Before removing lock
Screen Shot 2020-12-17 at 5 01 09 PM
Screen Shot 2020-12-17 at 5 01 17 PM

After removing lock
Screen Shot 2020-12-17 at 5 02 24 PM
Screen Shot 2020-12-17 at 5 02 31 PM

@bheemreddy181-zz
Copy link

@ankane any thoughts on this retry feature?

@peret
Copy link

peret commented Sep 13, 2021

Hi @ankane,

we are running into lock timeouts from time to time and would really appreciate the retry feature. I could switch to the lock_timeout_retries branch, but if I understand correctly, this still shouldn't be used in production? Is there anything I can help you with to solve this issue? I can spend some time to create a PR, if I'm able to debug it. Are there any more pointers/information that you came across, that could help?
Also, should I open a separate issue for this?

@peret
Copy link

peret commented Sep 13, 2021

I found the issue :)

StrongMigrations::Checker.set_timeouts doesn't do anything if @timeouts_set is already true. That's why no DB timeouts are set on the second try and the migration waits forever to acquire the lock. I can prepare a PR if you want, but basically, setting @timeouts_set = false again in StrongMigrations::Checker.with_lock_timeout_retries before actually retrying the block fixed it for me.

This also explains why it's not an issue if no DDL transaction is used, I think, since in that case the DB timeouts won't be rolled back by the transaction?

@bheemreddy181-zz
Copy link

That makes complete sense - when no DDL transaction is used we don't need a timeout as concurrent index additions will let other transactions to flow through ( will let reads and writes on the table ) . Thanks for looking into it.

@pirj pirj mentioned this issue Oct 6, 2021
@tute
Copy link

tute commented Nov 9, 2021

We are also hitting lock timeout errors, and are considering alternatives for retries. Thanks for your work, Andrew!

@tute
Copy link

tute commented Nov 9, 2021

@ankane, why does your implementation only retry when !in_transaction?? I tried rebasing your branch but get "expected 2 tries and got 1", and it stays out due to that condition.

TimeoutsTest#test_lock_timeout_retries [/Users/tute/Sites/opensource/strong_migrations/test/timeouts_test.rb:125]:
Expected: 2
  Actual: 1

90 runs, 154 assertions, 1 failures, 0 errors, 0 skips```

Thank you.

@ankane
Copy link
Owner Author

ankane commented Feb 18, 2022

Just added experimental support for lock timeout retries 🎉

@peret That was the issue - thanks!

@tute I don't fully follow the question, but check out the link above and latest test cases for a better idea of how it works.

Also, note that lock_timeout_delay has been renamed to lock_timeout_retry_delay from the branch.

If there's any feedback, feel free to create a new issue. Thanks all!

@ankane ankane closed this as completed Feb 18, 2022
@ankane ankane mentioned this issue Feb 18, 2022
5 tasks
@bheemreddy181-zz
Copy link

Thanks a ton @ankane for considering this feature.

@benoittgt
Copy link

benoittgt commented Mar 3, 2022

For

Don't require safety_assured for remove_column if ignored_columns detected (ignored_columns will likely be removed after it runs in production, so this would cause errors for other devs)

I think with safety_assured we should block migration if the column removed isn't in ignored_columns. Even if it will still break (adding ignored column in the same deployment), if the deployment was not done before with the ignored_columns, maybe this warning/raise could help do to not forget to do that.

Something like this:

We detect that :activate_at column is not in User's ignored_columns. You should deploy a version with the ignored columns before, and also keep having the ignored columns when running this migration.

At least if it's not blocking the migration. We should have a big warning. I am interested to work on that PR if you are ok @ankane

@ankane
Copy link
Owner Author

ankane commented Mar 3, 2022

Hey @benoittgt I think the current approach provides a good balance of safety and simplicity, but feel free to share if you decide to fork and change it for your applications.

For future readers: please create a new issue to discuss any ideas.

Repository owner locked and limited conversation to collaborators Mar 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants