Skip to content

Commit

Permalink
WL9175 post-push: ASAN memory leak in Query_log_event.
Browse files Browse the repository at this point in the history
In WL-included builds ASAN run witnessed missed ~Query_log_event invocation.
The destruct-or was not called due to the WL's changes in the error propagation
that specifically affect LC MTS.
The failure is exposed in particular by rpl_trigger as the following
stack:

  #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98)
  #1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12
  #2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72
  #3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46
  #4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17
  #5 0x293b96f in Log_event::read_log_event()
  #6 0x2a2a1c9 in next_event(Relay_log_info*)

Previously before the WL
Mts_submode_logical_clock::wait_for_workers_to_finish() had not
returned any error even when Coordinator thread is killed.

The WL patch needed to refine such behavior, but at doing so
it also had to attend log_event.cc::schedule_next_event() to register
an error to follow an existing pattern.
While my_error() does not take place the killed Coordinator continued
scheduling, ineffectively though - no Worker gets engaged (legal case
of deferred scheduling), and without noticing its killed status up to
a point when it resets the event pointer in
apply_event_and_update_pos():

  *ptr_ev= NULL; // announcing the event is passed to w-worker

The reset was intended for an assigned Worker to perform the event
destruction or by Coordinator itself when the event is deferred.
As neither is the current case the event gets unattended for its termination.

In contrast in the pre-WL sources the killed Coordinator does find a Worker.
However such Worker could be already down (errored out and exited), in
which case apply_event_and_update_pos() reasonably returns an error and executes

  delete ev

in exec_relay_log_event() error branch.

**Fixed** with deploying my_error() call in log_event.cc::schedule_next_event()
error branch which fits to the existing pattern.
THD::is_error() has been always checked by Coordinator before any attempt to
reset *ptr_ev= NULL. In the errored case Coordinator does not reset and
destroys the event itself in the exec_relay_log_event() error branch pretty similarly to
how the pre-WL sources do.

Tested against rpl_trigger and rpl suites to pass.

Approved on rb#15667.
  • Loading branch information
Andrei Elkin committed Mar 10, 2017
1 parent ba659bf commit e39a757
Showing 1 changed file with 11 additions and 3 deletions.
14 changes: 11 additions & 3 deletions sql/log_event.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2827,16 +2827,24 @@ static bool schedule_next_event(Log_event* ev, Relay_log_info* rli)
error= rli->current_mts_submode->schedule_next_event(rli, ev);
switch (error)
{
case ER_MTS_CANT_PARALLEL:
char llbuff[22];
case ER_MTS_CANT_PARALLEL:
llstr(rli->get_event_relay_log_pos(), llbuff);
my_error(ER_MTS_CANT_PARALLEL, MYF(0),
ev->get_type_str(), rli->get_event_relay_log_name(), llbuff,
"The master event is logically timestamped incorrectly.");
return true;
case ER_MTS_INCONSISTENT_DATA:
/* Don't have to do anything. */
return true;
llstr(rli->get_event_relay_log_pos(), llbuff);
{
char errfmt[]=
"Coordinator experienced an error or was killed while scheduling "
"an event at relay-log name %s position %s.";
char errbuf[sizeof(errfmt) + FN_REFLEN + sizeof(llbuff)];
sprintf(errbuf, errfmt, rli->get_event_relay_log_name(), llbuff);
my_error(ER_MTS_INCONSISTENT_DATA, MYF(0), errbuf);
return true;
}
default:
return false;
}
Expand Down

0 comments on commit e39a757

Please sign in to comment.