Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vats 5.7 #2

Closed
wants to merge 23 commits into from
Closed

Vats 5.7 #2

wants to merge 23 commits into from

Conversation

sensssz
Copy link

@sensssz sensssz commented Oct 17, 2016

Hi, Laurynas,

I forked your branch and fixed all known test failures based on it.

I removed the assertions you added because I don't think they should hold. Previously I used lock_sys->rec_hash directly because it was the only hash list in 5.6, but more were added in 5.7. I've changed the implementation so that it uses the correct hash list.

@laurynas-biveinis
Copy link
Owner

Thank you. Removing the asserts is the right thing to do, as they have been added only to catch missing implementation bits

@laurynas-biveinis laurynas-biveinis self-assigned this Oct 24, 2016
laurynas-biveinis pushed a commit that referenced this pull request Oct 25, 2016
Summary:
Inside index_next_same() call, we should
1. first check whether the record matches the index
   lookup prefix,
2. then check pushed index condition.

If we try to check #2 without checking #1 first, we may walk
off the index lookup prefix and scan till the end of the index.

Test Plan: Run mtr

Reviewers: hermanlee4, maykov, jtolmer, yoshinorim

Reviewed By: yoshinorim

Differential Revision: https://reviews.facebook.net/D38769
laurynas-biveinis pushed a commit that referenced this pull request Oct 25, 2016
Summary:
Inside index_next_same() call, we should
1. first check whether the record matches the index
   lookup prefix,
2. then check pushed index condition.

If we try to check #2 without checking #1 first, we may walk
off the index lookup prefix and scan till the end of the index.

Test Plan: Run mtr

Reviewers: hermanlee4, maykov, jtolmer, yoshinorim

Reviewed By: yoshinorim

Differential Revision: https://reviews.facebook.net/D38769
laurynas-biveinis pushed a commit that referenced this pull request Oct 25, 2016
Summary:
MyRocks had two bugs when calculating index scan cost.
1. block_size was not considered. This made covering index scan cost
(both full index scan and range scan) much higher
2. ha_rocksdb::records_in_range() may have estimated more rows
than the estimated number of rows in the table. This was wrong,
and MySQL optimizer decided to use full index scan even though
range scan was more efficient.

This diff fixes #1 by setting stats.block_size at ha_rocksdb::open(),
and fixes #2 by reducing the number of estimated rows if it was
larger than stats.records.

Test Plan:
mtr, updating some affected test cases, and new test case
rocksdb_range2

Reviewers: hermanlee4, jkedgar, spetrunia

Reviewed By: spetrunia

Subscribers: MarkCallaghan, webscalesql-eng

Differential Revision: https://reviews.facebook.net/D55869
laurynas-biveinis pushed a commit that referenced this pull request Oct 25, 2016
Summary:
MyRocks had two bugs when calculating index scan cost.
1. block_size was not considered. This made covering index scan cost
(both full index scan and range scan) much higher
2. ha_rocksdb::records_in_range() may have estimated more rows
than the estimated number of rows in the table. This was wrong,
and MySQL optimizer decided to use full index scan even though
range scan was more efficient.

This diff fixes #1 by setting stats.block_size at ha_rocksdb::open(),
and fixes #2 by reducing the number of estimated rows if it was
larger than stats.records.

Test Plan:
mtr, updating some affected test cases, and new test case
rocksdb_range2

Reviewers: hermanlee4, jkedgar, spetrunia

Reviewed By: spetrunia

Subscribers: MarkCallaghan, webscalesql-eng

Differential Revision: https://reviews.facebook.net/D55869
sensssz pushed a commit to sensssz/percona-server that referenced this pull request Nov 27, 2016
…hutdown)

On several testcases (i.e. rpl_gtid_mode), LeakSanitizer diagnoses
missed memory deallocation:

=================================================================
==16675==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f17748fa54a in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9854a)
    laurynas-biveinis#1 0xff7f7f in my_malloc /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys/my_malloc.c:38
    laurynas-biveinis#2 0x1634b83 in add_pfs_instr_to_array(char const*, char const*) /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/storage/perfschema/pfs_server.cc:251
    laurynas-biveinis#3 0x58cccf in mysqld_get_one_option /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:9198
    laurynas-biveinis#4 0x10256c6 in my_handle_options /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys_ssl/my_getopt.cc:817
    laurynas-biveinis#5 0x1025c63 in handle_options /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys_ssl/my_getopt.cc:308
    laurynas-biveinis#6 0x5963e5 in handle_early_options() /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:7263
    laurynas-biveinis#7 0x5a35a3 in mysqld_main(int, char**) /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:5613
    laurynas-biveinis#8 0x586aae in main /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/main.cc:25
    laurynas-biveinis#9 0x7f17726cc82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

This class of errors is already attempted to suppress in
valgrind.supp. But these suppressions have been added to work around a
bug of racy PFS shutdown, which is not required anymore as
pfs_instr_config_array is deallocated exactly once since [1]. Thus,
free the elements of this array and remove related suppressions
instead.
@sensssz sensssz closed this Dec 7, 2016
@sensssz sensssz deleted the vats-5.7 branch December 7, 2016 15:43
laurynas-biveinis pushed a commit that referenced this pull request Jul 21, 2017
In WL-included builds ASAN run witnessed missed ~Query_log_event invocation.
The destruct-or was not called due to the WL's changes in the error propagation
that specifically affect LC MTS.
The failure is exposed in particular by rpl_trigger as the following
stack:

  #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98)
  #1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12
  #2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72
  #3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46
  #4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17
  #5 0x293b96f in Log_event::read_log_event()
  #6 0x2a2a1c9 in next_event(Relay_log_info*)

Previously before the WL
Mts_submode_logical_clock::wait_for_workers_to_finish() had not
returned any error even when Coordinator thread is killed.

The WL patch needed to refine such behavior, but at doing so
it also had to attend log_event.cc::schedule_next_event() to register
an error to follow an existing pattern.
While my_error() does not take place the killed Coordinator continued
scheduling, ineffectively though - no Worker gets engaged (legal case
of deferred scheduling), and without noticing its killed status up to
a point when it resets the event pointer in
apply_event_and_update_pos():

  *ptr_ev= NULL; // announcing the event is passed to w-worker

The reset was intended for an assigned Worker to perform the event
destruction or by Coordinator itself when the event is deferred.
As neither is the current case the event gets unattended for its termination.

In contrast in the pre-WL sources the killed Coordinator does find a Worker.
However such Worker could be already down (errored out and exited), in
which case apply_event_and_update_pos() reasonably returns an error and executes

  delete ev

in exec_relay_log_event() error branch.

**Fixed** with deploying my_error() call in log_event.cc::schedule_next_event()
error branch which fits to the existing pattern.
THD::is_error() has been always checked by Coordinator before any attempt to
reset *ptr_ev= NULL. In the errored case Coordinator does not reset and
destroys the event itself in the exec_relay_log_event() error branch pretty similarly to
how the pre-WL sources do.

Tested against rpl_trigger and rpl suites to pass.

Approved on rb#15667.
laurynas-biveinis pushed a commit that referenced this pull request Jul 21, 2017
… WITH DEVELOPER STUDIO

When we do a release type build of the server (with both optimized and
debug enabled server/plugins) with Developer Studio, some MTR tests
when run with --debug-server will fail in one of two ways:

1. Tests which try to load a plugin into the mysql client fail with
   missing symbols. This is caused by the plugin having references to
   functions which do not exist in the non-debug client.

2. Some tests on sparc fail with Thread stack overrun.

Fix for issue #1: mtr will have appended /debug to the plugin dir part
when running with --debug-server and if there actually is such a
directory. The fix is to remove any trailing /debug from the
env. variable within the test. This will affect the client only, not
the server. Developer builds will not have put the plugins in a
subdirectory /debug so it makes no different to those.

Fix for issue #2: apparently this thread stack overrun is not feasible
to avoid, so just skip the test if running with debug server on sparc;
there is already an include file to do that.

Also added not_sparc_debug.inc to the "white list" so the tests are
skipped even when running mtr --no-skip.

(cherry picked from commit 9c79e477261ab252e38def436bca3336ef597603)
laurynas-biveinis pushed a commit that referenced this pull request Jul 21, 2017
Some character sets are designated as MY_CS_STRNXFRM, meaning that sorting
needs to go through my_strnxfrm() (implemented by the charset), and some are
not, meaning that a client can do the strnxfrm itself based on
cs->sort_order. However, most of the logic related to the latter has been
removed already (e.g. filesort always uses my_strnxfrm() since 2003), and now
it's mostly in the way. The three main uses left are:

 1. A microoptimization for constructing sort keys in filesort.
 2. A home-grown implementation of Boyer-Moore for accelerating certain
    LIKE patterns that should probably be handled through FTS.
 3. Some optimizations to MyISAM prefix keys.

Given that our default collation (utf8mb4_0900_ai_ci) now is a strnxfrm-based
collation, the benefits of keeping these around for a narrow range of
single-byte locales (like latin1_swedish_ci, cp850 and a bunch of more
obscure locales) seems dubious. We seemingly can't remove the flag entirely
due to #3 seemingly affecting the on-disk MyISAM structure, but we can remove
the code for #1 and #2.

Change-Id: If974e490d451b7278355e33ab1fca993f446b792
laurynas-biveinis pushed a commit that referenced this pull request Jul 21, 2017
 (A) move ndb_basename into utility library
 (B) new portability function ndb_getsockname()
laurynas-biveinis pushed a commit that referenced this pull request Oct 9, 2017
Patch #2:

find_child_doms() checks for duplicates each time it adds a result to
the result vector. As explained in the header comment for
Json_wrapper::seek(), the duplicate elimination is needed for paths
which contain multiple ellipses, so in most cases it is unnecessary
work.

This patch makes find_child_doms() only check for duplicates in the
case where the path contains multiple ellipses, and only when
inspecting an ellipsis path leg which is not the first one.

Additionally:

Remove checks for empty vector after a successful call to push_back().
That is, replace checks for is_seek_done(result, only_need_one) with a
simple check for only_need_one when we know the result vector cannot
be empty.

Call is_seek_done() from the loop in Json_dom::seek() instead of at
the top of find_child_doms(), so that we can break out of the loop
earlier if we find a match and only need one.

Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3):

BM_JsonDomSearchEllipsis              25693 ns/iter [+210.9%]
BM_JsonDomSearchEllipsis_OnlyOne      17881 ns/iter [+324.3%]
BM_JsonDomSearchKey                     128 ns/iter [  +0.8%]
BM_JsonBinarySearchEllipsis          231319 ns/iter [ +38.7%]
BM_JsonBinarySearchEllipsis_OnlyOne  222726 ns/iter [ +41.6%]
BM_JsonBinarySearchKey                   86 ns/iter [   0.0%]

Change-Id: I0ee624830680247ec5aed302c0408db00240d441
laurynas-biveinis pushed a commit that referenced this pull request Oct 9, 2017
 PLUGIN INSTALLS

Split the UDF initialization/deinitialization into two:
1. Initialization/deinitialization of the global structures
2. Loading of the UDF definitions from the table and removing them from the global

Then kept the #2 at the place of the current initialization/deinitalization
routines and added #2 initialization very early (before component/plugin
initialization) and #2 deinitialization very late (after the plugin/compononent
deinitialization.

Added a test plugin and a regression test.
laurynas-biveinis pushed a commit that referenced this pull request Jan 30, 2018
Fix misc UBSAN warnings in unit tests.
To repeat:
export UBSAN_OPTIONS="print_stacktrace=1"

./runtime_output_directory/merge_large_tests-t --gtest_filter='-*DeathTest*' > /dev/null

unittest/gunit/gis_algos-t.cc:78:70:
runtime error: downcast of address 0x000012dc0be8 which does not point to an object of type 'Gis_polygon_ring'

include/sql_string.h:683:35: runtime error: null pointer passed as argument 2, which is declared to never be null
    #1 0x373e7af in histograms::Value_map<String>::add_values(String const&, unsigned long long) sql/histograms/value_map.cc:149
    #2 0x294fcf2 in dd_column_statistics_unittest::add_values(histograms::Value_map<String>&) unittest/gunit/dd_column_statistics-t.cc:62

runtime_output_directory/merge_keyring_file_tests-t --gtest_filter='-*DeathTest*' > /dev/null

plugin/keyring/common/keyring_key.cc:82:57: runtime error: null pointer passed as argument 2, which is declared to never be null

Change-Id: I2651362e3373244b72e6893f0e22e67402b49a52
(cherry picked from commit 1fe3f72561994da1d912a257689e1b18106f8828)
laurynas-biveinis pushed a commit that referenced this pull request Jan 30, 2018
…TABLE_UPGRADE_GUARD

To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1
./mtr --mem --sanitize main.dd_upgrade_error

A few dd tests fail with:
==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38
READ of size 8 at 0x7000063bf5e8 thread T2
    #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186
    #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560
    #2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d)
    #3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557)
    #4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85)
    #5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216
    #6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336

Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b
Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'
laurynas-biveinis pushed a commit that referenced this pull request Jan 30, 2018
        as a component

   * The validate password plugin is converted to component.
   * For mysql-8.0 we will have both component and plugin(But
     plugin will be installed/uninstalled with below deprecate
     warning)
     "validate password plugin' is deprecated and will be
      removed in a future release. Please use validate_password
      component instead"
   * In the next major release we remove the plugin .so file.
   * With the component enabled, we see the system/status
     variables preceded with "validate_password." instead of
     "validate_password_"(for example, if we take 'length'
     system variable we see it as "validate_password.length"
     instead of "validate_password_length")
   * The packaging script should be installing the component
     by default for new installs

   The way it currently works with deprecations and removals
   is the following:
   1. We add a deprecation warning to using the old way and
      we switch the server to use the new way by default for
      new installations.
   2. We expect that people will upgrade their existing
      installations too to avoid the warning.
   3. In the next major release we remove the old way and
      hope that people did #2.
laurynas-biveinis pushed a commit that referenced this pull request Jan 30, 2018
…/BINDINGS/XCOM/XCOM/TASK.C

Problem
----------------------------

A Segmentation fault was encountered on a daily-trunk pushbuild
run for the test 'group_replication.gr_ssl_mode_verify_identity_error'
when starting group replication with the below symptoms -

group_replication.gr_ssl_mode_verify_identity_error w5 [ fail ]
        Test ended at 2017-09-27 17:37:18

CURRENT_TEST: group_replication.gr_ssl_mode_verify_identity_error
mysqltest: At line 58: query 'START GROUP_REPLICATION' failed with wrong
errno 2013: 'Lost connection to MySQL server during query', instead of
3092

Analysis
-----------------------------

The test that fails sporadically is Step #2 of gr_ssl_mode_verify_identity_error.
That step causes a global SSL comunication failure, in which not even the
local connections can be established, thus causing the join operation to fail.

When that happens, GCS will request the XCom thread termination. Since it does
not have socket access, it will contact it directly calling xcom_fsm(exit),
but this is a well-known non-thread safe operation, hence causing this crash.

Fix
-----------------------------

Create a new callback for GCS called should_exit, which is tested within
task_loop, in order to check if we should exit the XCom thread.
laurynas-biveinis added a commit that referenced this pull request Oct 18, 2018
A subset of binlog encryption tests was crashing with:

* thread percona#39, stop reason = signal SIGSTOP
    frame #0: 0x00007fff56063b66 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff5622e080 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x000000010657442b mysqld-debug`my_write_core(sig=11) at stacktrace.cc:278
    frame #3: 0x0000000104d84334 mysqld-debug`::handle_fatal_signal(sig=11) at signal_handler.cc:254
    frame #4: 0x00007fff56221f5a libsystem_platform.dylib`_sigtramp + 26
    frame #5: 0x00007fff5622934d libsystem_pthread.dylib`pthread_mutex_lock + 1
    frame #6: 0x0000000106578d05 mysqld-debug`native_mutex_lock(mutex=0x0000000000000000) at thr_mutex.h:93
    frame #7: 0x0000000106578a57 mysqld-debug`safe_mutex_lock(mp=0x0000000000000000, try_lock=false, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.cc:70
    frame #8: 0x000000010653cd3a mysqld-debug`my_mutex_lock(mp=0x00007ffb6b215038, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.h:180
    frame #9: 0x000000010653b2cc mysqld-debug`inline_mysql_mutex_lock(that=0x00007ffb6b215038, src_file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", src_line=113) at mysql_mutex.h:267
  * frame #10: 0x000000010653b0d8 mysqld-debug`my_b_append_tell(info=0x00007ffb6b214fd8) at mf_iocache2.cc:113
    frame #11: 0x0000000105ed6a96 mysqld-debug`MYSQL_BIN_LOG::write_buffer(this=0x00007ffb6b214cb8, buf="", len=47, mi=0x00007ffb6b1f6a00) at binlog.cc:7128
    frame #12: 0x0000000105f4d54b mysqld-debug`queue_event(mi=0x00007ffb6b1f6a00, buf="", event_len=47, do_flush_mi=true) at rpl_slave.cc:7756
    frame #13: 0x0000000105f3a243 mysqld-debug`::handle_slave_io(arg=0x00007ffb6b1f6a00) at rpl_slave.cc:5382
    frame #14: 0x00000001065b87a5 mysqld-debug`pfs_spawn_thread(arg=0x00007ffb6a543af0) at pfs.cc:2836
    frame #15: 0x00007fff5622b661 libsystem_pthread.dylib`_pthread_body + 340
    frame #16: 0x00007fff5622b50d libsystem_pthread.dylib`_pthread_start + 377
    frame #17: 0x00007fff5622abf9 libsystem_pthread.dylib`thread_start + 13

This was caused by my_b_append_tell trying to lock a nullptr
IO_CACHE::append_buffer_lock. The lock was nullptr, because it's only
initialized for SEQ_READ_APPEND IO_CACHEs, whereas we have
WRITE_CACHE. This mismatch was introduced by WL#8599 [1] changing the
IO_CACHE type from the former to the latter.

Fix by using the correct API for the new IO_CACHE type: my_b_tell
instead of my_b_append_tell.

[1]:

commit dbd2ca2
Author: Joao Gramacho <[email protected]>
Date:   Tue Nov 1 06:45:39 2016 +0000

    WL#8599: Reduce contention in IO and SQL threads
    (...)
laurynas-biveinis added a commit that referenced this pull request Oct 18, 2018
create_table_info_t::create_table_def leaked memory in the case
enable_encryption(table) call failed:

worker[5] Sanitizer report from /tmp/results/PS/mysql-test/var/5/log/mysqld.2.err after tests:
 binlog_encryption.binlog_encryption_without_keyring group_replication.gr_change_master_hidden group_replication.gr_server_uuid_matches_group_name group_replication.gr_perfschema_connect_status group_replication.gr_single_primary_and_leader_election_on_error group_replication.gr_without_perfschema rpl.rpl_key_rotation
--------------------------------------------------------------------------
==14131==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1136 byte(s) in 1 object(s) allocated from:
    #0 0x7fe9233f1602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0xc692483 in ut_allocator<unsigned char>::allocate(unsigned long, unsigned char const*, unsigned int, bool, bool) storage/innobase/include/ut0new.h:608
    #2 0xc692483 in mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long) storage/innobase/mem/memory.cc:281
    #3 0xb99ff96 in mem_heap_create_func storage/innobase/include/mem0mem.ic:464
    #4 0xbae8604 in create_table_info_t::create_table_def(dd::Table const*) storage/innobase/handler/ha_innodb.cc:10349
    #5 0xbaee018 in create_table_info_t::create_table(dd::Table const*) storage/innobase/handler/ha_innodb.cc:12420
    #6 0xbaf1aba in int innobase_basic_ddl::create_impl<dd::Table>(THD*, char const*, TABLE*, HA_CREATE_INFO*, dd::Table*, bool, bool, bool, unsigned long, unsigned long) storage/innobase/handler/ha_innodb.cc:12805
    #7 0xbaf7e6a in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*, dd::Table*) storage/innobase/handler/ha_innodb.cc:13756
    #8 0x2857f7a in ha_create_table(THD*, char const*, char const*, char const*, HA_CREATE_INFO*, List<Create_field> const*, bool, bool, dd::Table*) sql/handler.cc:5156
    #9 0x19d0d9f in rea_create_base_table sql/sql_table.cc:991
    #10 0x19d0d9f in create_table_impl sql/sql_table.cc:7118
    #11 0x19d37cf in mysql_create_table_no_lock(THD*, char const*, char const*, HA_CREATE_INFO*, Alter_info*, unsigned int, bool, bool*, handlerton**) sql/sql_table.cc:7200
    #12 0x19dffb2 in mysql_create_table(THD*, TABLE_LIST*, HA_CREATE_INFO*, Alter_info*) sql/sql_table.cc:7950
    #13 0x3b58b9b in Sql_cmd_create_table::execute(THD*) sql/sql_cmd_ddl_table.cc:319
    #14 0x15917c1 in mysql_execute_command(THD*, bool) sql/sql_parse.cc:4417
    #15 0x15b086e in mysql_parse(THD*, Parser_state*, bool) sql/sql_parse.cc:5139
    #16 0x8efc7fd in Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) sql/log_event.cc:5295
    #17 0x8f7ea48 in Log_event::apply_event(Relay_log_info*) sql/log_event.cc:3882
    #18 0x91cb682 in apply_event_and_update_pos sql/rpl_slave.cc:4352
    #19 0x9215e69 in exec_relay_log_event sql/rpl_slave.cc:4812
    #20 0x9254685 in handle_slave_sql sql/rpl_slave.cc:6912
    #21 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836
    #22 0x7fe9231436b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)

Fix by adding the missing mem_heap_free(heap) call.
laurynas-biveinis added a commit that referenced this pull request Oct 18, 2018
Avoid undefined behavior in audit_log_update_thd_local by avoiding
passing NULL as source pointer to memcpy, even with zero length.

The UBSan report fixed is

/usr/include/x86_64-linux-gnu/bits/string3.h:53:71: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7fe5aad56fb1 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
    #1 0x7fe5aad56fb1 in audit_log_update_thd_local plugin/audit_log/audit_log.cc:987
    #2 0x7fe5aad56fb1 in audit_log_notify plugin/audit_log/audit_log.cc:1105
    #3 0x1ecac37 in plugins_dispatch sql/sql_audit.cc:1284
    #4 0x1ecac37 in event_class_dispatch sql/sql_audit.cc:1322
    #5 0x1ecb311 in event_class_dispatch_error sql/sql_audit.cc:1340
    #6 0x1ed21b1 in mysql_audit_notify(THD*, mysql_event_connection_subclass_t, char const*, int) sql/sql_audit.cc:438
    #7 0x1350071 in check_connection sql/sql_connect.cc:868
    #8 0x1350071 in login_connection sql/sql_connect.cc:929
    #9 0x1357881 in thd_prepare_connection(THD*, bool) sql/sql_connect.cc:1084
    #10 0x1e66347 in handle_connection sql/conn_handler/connection_handler_per_thread.cc:313
    #11 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836
    #12 0x7fe5d352f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
    #13 0x7fe5d0bd741c in clone (/lib/x86_64-linux-gnu/libc.so.6+0x10741c)
laurynas-biveinis pushed a commit that referenced this pull request Jun 7, 2019
… keyring

Issue: when the server is started without a keyring, the redo log encryption
checks were never performed, redo log encryption wasn't initialized, but the
setting was left on, misleading the user. Later operations which triggered
this check could possibly fail.

Issue #2: redo log encryption didn't work properly when it was turned on using
the command line: in some cases, the redo log remained unencrypted (until
another operation triggered the redo log encryption setup methods to be called
again)

Both issues were caused by the refactoring of the periodic, once every
second encryption checks in upstream, in PS-5189.

This commit:

* changes the code so redo log encryption routines are called at least twice
during every startup with the encryption settings on. This fixes issue #2.
* changes the code so redo log encryption routines are called at least once
during every startup, even without a keyring, or a read-only server: this fixes
issues #1
* adds a test ensuring that in an invalid configuration (without a keyring) the
the server remains running, but the redo log encryption setting correctly displays the
OFF status
* adds a test ensuring that when the redo log is correctly configured (using the
dynamic variable or a command line parameter) the log is encrypted, and no unencrypted
data is present in it.
* backports additional redo log related bugfixes/refactorings from 8.0
(92198f4). The original commit contains fixes
for both the redo and the undo log, the backport only includes the redo log related
parts refactored to match our changes.

(cherry picked from commit c9f151e)
laurynas-biveinis pushed a commit that referenced this pull request Jun 21, 2019
Upstream commit ID : fb-mysql-5.6.35/8cb1dc836b68f1f13e8b2655b2b8cb2d57f400b3
PS-5217 : Merge fb-prod201803

Summary:
Original report: https://jira.mariadb.org/browse/MDEV-15816

To reproduce this bug just following below steps,

client 1:
USE test;
CREATE TABLE t1 (i INT) ENGINE=MyISAM;
HANDLER t1 OPEN h;
CREATE TABLE t2 (i INT) ENGINE=RocksDB;
LOCK TABLES t2 WRITE;

client 2:
FLUSH TABLES WITH READ LOCK;

client 1:
INSERT INTO t2 VALUES (1);

So client 1 acquired the lock and set m_lock_rows = RDB_LOCK_WRITE.
Then client 2 calls store_lock(TL_IGNORE) and m_lock_rows was wrongly
set to RDB_LOCK_NONE, as below

```
 #0  myrocks::ha_rocksdb::store_lock (this=0x7fffbc03c7c8, thd=0x7fffc0000ba0, to=0x7fffc0011220, lock_type=TL_IGNORE)
 #1  get_lock_data (thd=0x7fffc0000ba0, table_ptr=0x7fffe84b7d20, count=1, flags=2)
 #2  mysql_lock_abort_for_thread (thd=0x7fffc0000ba0, table=0x7fffbc03bbc0)
 #3  THD::notify_shared_lock (this=0x7fffc0000ba0, ctx_in_use=0x7fffbc000bd8, needs_thr_lock_abort=true)
 #4  MDL_lock::notify_conflicting_locks (this=0x555557a82380, ctx=0x7fffc0000cc8)
 #5  MDL_context::acquire_lock (this=0x7fffc0000cc8, mdl_request=0x7fffe84b8350, lock_wait_timeout=2)
 #6  Global_read_lock::lock_global_read_lock (this=0x7fffc0003fe0, thd=0x7fffc0000ba0)
```

Finally, client 1 "INSERT INTO..." hits the Assertion 'm_lock_rows == RDB_LOCK_WRITE'
failed in myrocks::ha_rocksdb::write_row()

Fix this bug by not setting m_locks_rows if lock_type == TL_IGNORE.

Closes facebook/mysql-5.6#838
Pull Request resolved: facebook/mysql-5.6#871

Differential Revision: D9417382

Pulled By: lth

fbshipit-source-id: c36c164e06c
laurynas-biveinis pushed a commit that referenced this pull request Jul 18, 2019
…E TO A SERVER

Problem
========================================================================
Running the GCS tests with ASAN seldomly reports a user-after-free of
the server reference that the acceptor_learner_task uses.

Here is an excerpt of ASAN's output:

==43936==ERROR: AddressSanitizer: heap-use-after-free on address 0x63100021c840 at pc 0x000000530ff8 bp 0x7fc0427e8530 sp 0x7fc0427e8520
WRITE of size 8 at 0x63100021c840 thread T3
    #0 0x530ff7 in server_detected /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:962
    #1 0x533814 in buffered_read_bytes /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1249
    #2 0x5481af in buffered_read_msg /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1399
    #3 0x51e171 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4690
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)
    #9 0x7fc047ff2bfc in __clone (/lib64/libc.so.6+0xfebfc)

0x63100021c840 is located 64 bytes inside of 65688-byte region [0x63100021c800,0x63100022c898)
freed by thread T3 here:
    #0 0x7fc04a5d7508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
    #1 0x52cf86 in freesrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:836
    #2 0x52ea78 in srv_unref /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:868
    #3 0x524c30 in reply_handler_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4914
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

previously allocated by thread T3 here:
    #0 0x7fc04a5d7a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
    #1 0x543604 in mksrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:721
    #2 0x543b4c in addsrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:755
    #3 0x54af61 in update_servers /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1747
    #4 0x501082 in site_install_action /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1572
    #5 0x55447c in import_config /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/site_def.c:486
    #6 0x506dfc in handle_x_snapshot /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5257
    #7 0x50c444 in xcom_fsm /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5325
    #8 0x516c36 in dispatch_op /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4510
    #9 0x521997 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4772
    #10 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #11 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #12 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #13 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #14 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

Analysis
========================================================================
The server structure is reference counted by the associated sender_task
and reply_handler_task.
When they finish, they unreference the server, which leads to its memory
being freed.

However, the acceptor_learner_task keeps a "naked" reference to the
server structure.
Under the right ordering of operations, i.e. the sender_task and
reply_handler_task terminating after the acceptor_learner_task acquires,
but before it uses, the reference to the server structure, leads to the
acceptor_learner_task accessing the server structure after it has been
freed.

Solution
========================================================================
Let the acceptor_learner_task also reference count the server structure
so it is not freed while still in use.

Reviewed-by: André Negrão <[email protected]>
Reviewed-by: Venkatesh Venugopal <[email protected]>
RB: 21209
laurynas-biveinis pushed a commit that referenced this pull request Jul 18, 2019
… keyring

Issue: when the server is started without a keyring, the redo log encryption
checks were never performed, redo log encryption wasn't initialized, but the
setting was left on, misleading the user. Later operations which triggered
this check could possibly fail.

Issue #2: redo log encryption didn't work properly when it was turned on using
the command line: in some cases, the redo log remained unencrypted (until
another operation triggered the redo log encryption setup methods to be called
again)

Both issues were caused by the refactoring of the periodic, once every
second encryption checks in upstream, in PS-5189.

This commit:

* changes the code so redo log encryption routines are called at least twice
during every startup with the encryption settings on. This fixes issue #2.
* changes the code so redo log encryption routines are called at least once
during every startup, even without a keyring, or a read-only server: this fixes
issues #1
* adds a test ensuring that in an invalid configuration (without a keyring) the
the server remains running, but the redo log encryption setting correctly displays the
OFF status
* adds a test ensuring that when the redo log is correctly configured (using the
dynamic variable or a command line parameter) the log is encrypted, and no unencrypted
data is present in it.
* backports additional redo log related bugfixes/refactorings from 8.0
(92198f4). The original commit contains fixes
for both the redo and the undo log, the backport only includes the redo log related
parts refactored to match our changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants