Implementation of https://blueprints.launchpad.net/percona-server/+spec/backup-safe-binlog-info #122

Merged
merged 1 commit into percona:5.6 from backup_safe_binlog_info on Jul 28, 2015

Conversation

akopytov
Contributor

One inefficiency of the backup locks feature is that even though LOCK
TABLES FOR BACKUP, as a light-weight FTWRL alternative, does not affect
DML statements updating InnoDB tables, LOCK BINLOG FOR BACKUP does
affect them by blocking commits.

XtraBackup uses LOCK BINLOG FOR BACKUP to:

  1. retrieve consistent binary log coordinates with SHOW MASTER
    STATUS. More precisely, the binary log coordinates must be consistent
    with the REDO log copy and with non-transactional tables. Therefore, no
    updates can be done to non-transactional tables (this is achieved by
    an active LOCK TABLES FOR BACKUP lock), and no commits can be
    performed between SHOW MASTER STATUS and finalizing the redo log
    copy, which is achieved by LOCK BINLOG FOR BACKUP (see the sketch
    after this list).
  2. retrieve consistent master connection information for a replication
    slave. More precisely, the binary log coordinates on the master as
    reported by SHOW SLAVE STATUS must be consistent with the REDO log
    copy, so LOCK BINLOG FOR BACKUP also blocks the I/O replication thread.
  3. include the last binary log file in the SST snapshot for a
    GTID-enabled PXC node. This is a rather artificial limitation on the
    WSREP side, but XtraBackup still obeys it by blocking commits with
    LOCK BINLOG FOR BACKUP to ensure the integrity of the binary log
    file copy.
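
The sketch below illustrates the locking sequence that case #1 currently
implies. It is a simplified, hypothetical ordering of the statements
XtraBackup issues, interleaved in reality with file copying; it is not
the literal tool code.

```sql
LOCK TABLES FOR BACKUP;   -- block non-transactional updates and DDL
-- ... copy InnoDB data files while following the redo log ...
LOCK BINLOG FOR BACKUP;   -- block commits (and the slave I/O thread)
SHOW MASTER STATUS;       -- coordinates are now consistent with the redo log copy
-- ... finalize the redo log copy ...
UNLOCK BINLOG;
UNLOCK TABLES;
```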

Depending on the write rate on the server, finalizing the REDO log copy
may take a long time, so blocking commits for that duration may still
affect server availability considerably.

This task is to make the necessary server-side change so that XtraBackup
can avoid LOCK BINLOG FOR BACKUP in case #1, when cases #2 and #3 do not
apply, i.e. when --slave-info is not requested in the XtraBackup options
and the server is not a GTID-enabled PXC node.

Lifting limitations for cases #2 and #3 is also possible, but that is
outside the scope of this task.

The idea of the optimization is that even though InnoDB provides
transactional storage for the binary log information (i.e. the current
file name and offset), it cannot be fully trusted by XtraBackup, because
that information is only updated on an InnoDB commit. This means that if
the last operation before LOCK TABLES FOR BACKUP was an update to a
non-transactional storage engine, and no InnoDB commits occur before the
backup is finalized by XtraBackup, the InnoDB system header will contain
stale binary log coordinates.
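
For example (a hypothetical session; the table names and storage engines
are assumptions used only for illustration), the following sequence leaves
the coordinates stored in the InnoDB system header pointing at the last
InnoDB commit rather than at the current end of the binary log:

```sql
-- assumes binary logging is enabled and both tables already exist
UPDATE innodb_tbl SET c = c + 1;  -- InnoDB commit: binlog coordinates are written
                                  -- to the InnoDB system header as part of the commit
UPDATE myisam_tbl SET c = c + 1;  -- non-transactional update: the binary log
                                  -- advances, but the coordinates stored in InnoDB do not
LOCK TABLES FOR BACKUP;           -- without this change, the system header still
                                  -- holds the stale (pre-MyISAM-update) coordinates
```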

One way to fix that would be to force a binlog coordinates update in the
InnoDB system header on every update, regardless of the storage engine(s)
involved. This is what a Galera node does to ensure consistency of the
XID, which is stored in the same way as the binary log coordinates: it
forces an XID update in the InnoDB system header on each TOI operation,
in particular on each non-transactional update.

Another approach is less invasive: XtraBackup blocks all
non-transactional updates with LOCK TABLES FOR BACKUP anyway, so instead
of having all non-transactional updates flush binlog coordinates to
InnoDB unconditionally, LTFB can be modified to flush (and redo-log) the
current binlog coordinates to InnoDB. In that case the binlog coordinates
provided by InnoDB are consistent with the REDO log under any
circumstances.

This patch implements the latter approach.
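
With this change in place, a backup that only needs case #1 can look
roughly like the sketch below. This is a hypothetical simplification: how
exactly XtraBackup consumes the stored coordinates (e.g. reading them back
from the InnoDB system header of the backup during the prepare phase) is
part of the XtraBackup-side change, not of this patch.

```sql
LOCK TABLES FOR BACKUP;   -- now also flushes (and redo-logs) the current binlog
                          -- coordinates into the InnoDB system header
-- ... finalize the redo log copy; commits are never blocked ...
UNLOCK TABLES;
-- the binlog coordinates are later recovered from the InnoDB system header of the
-- backup instead of from SHOW MASTER STATUS under LOCK BINLOG FOR BACKUP
```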

@akopytov akopytov merged commit 1426d2b into percona:5.6 Jul 28, 2015
@akopytov akopytov deleted the backup_safe_binlog_info branch July 28, 2015 16:12

george-lorch pushed a commit to george-lorch/percona-server that referenced this pull request May 7, 2016
Summary:
RocksDB has some static object dependencies, and without following those
dependencies it hits SIGABRT. This diff has two fixes.
1. Updating the RocksDB revision. D51789 in RocksDB changes the problematic
mutex from a global variable to a local (function-level) static variable.
2. Changing MyRocks not to call ddl_manager.persist_stats() from
kill_server_thread (which calls rocksdb_done_func()). By calling
ddl_manager.persist_stats(),
rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls
pthread_mutex_lock on "static port::Mutex mutex") is invoked at the end
of the thread. However, there is a race between the main thread and
kill_server_thread: the main thread calls pthread_mutex_destroy for the
mutex. To guarantee that pthread_mutex_lock() is never called after
pthread_mutex_destroy(), kill_server_thread must not call
ddl_manager.persist_stats(). This diff fixes the issue.

@update-submodule: rocksdb

Test Plan: mtr cardinality --repeat=100

Reviewers: hermanlee4, spetrunia, jkedgar

Reviewed By: jkedgar

Subscribers: yhchiang, webscalesql-eng

Differential Revision: https://reviews.facebook.net/D51933