Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible race condition in oak-online-alter-table #9

Open
GoogleCodeExporter opened this issue Mar 14, 2015 · 6 comments
Open

Possible race condition in oak-online-alter-table #9

GoogleCodeExporter opened this issue Mar 14, 2015 · 6 comments

Comments

@GoogleCodeExporter
Copy link

I created a simple table with an autoincrement key, and wrote a procedure to 
insert into it in a loop.  I started this procedure, and then ran 
oak-online-alter-table several times.  It succeeded several times, but then 
sometimes my stored procedure died with the following error, which makes it 
look like there is a race condition in the process of altering the table:

5141> call doinsert(50000);
ERROR 1146 (42S02): Table 'test.__oak_b' doesn't exist

Here is the full output of the tool:

$ python oak-online-alter-table.py --host=127.0.0.1 --database=test --table=b 
--alter="engine=innodb" --user=root --port=5141
-- Connecting to MySQL
-- Table test.b is of engine innodb
-- Checking for UNIQUE columns on test.b, by which to chunk
-- Possible UNIQUE KEY column names in test.b:
-- - a
-- Table test.__oak_b has been created
-- Table test.__oak_b has been altered
-- Checking for UNIQUE columns on test.__oak_b, by which to chunk
-- Possible UNIQUE KEY column names in test.__oak_b:
-- - a
-- Checking for UNIQUE columns on test.b, by which to chunk
-- - Found following possible unique keys:
-- - a (int)
-- Chosen unique key is 'a'
-- Shared columns: a
-- Created AD trigger
-- Created AU trigger
-- Created AI trigger
-- Attempting to lock tables

-- Tables locked WRITE
-- a (min, max) values: ([1L], [2416L])
-- Tables unlocked
-- Copying range (1), (1000), progress: 0%
-- Copying range (1000), (2000), progress: 41%
-- Copying range (2000), (2416), progress: 82%
-- Copying range 100% complete. Number of rows: 2416
-- Deleting range (1), (1000), progress: 0%
-- Deleting range (1000), (2000), progress: 41%
-- Deleting range (2000), (2416), progress: 82%
-- Deleting range 100% complete. Number of rows: 0
-- Table test.b has been renamed to test.__arc_b,
-- and table test.__oak_b has been renamed to test.b
-- Table test.__arc_b was found and dropped
-- ALTER TABLE completed

Do you know what is happening, and can it be solved?

Original issue reported on code.google.com by baron.schwartz on 22 Feb 2011 at 1:22

@GoogleCodeExporter
Copy link
Author

My initial thoughts are that this is caused by the triggers still working while 
the table names alter. Table alteration *should* auto-commit any working 
trigger.
Additional input I would like to have:
- MySQL version
- on what INSERT value (if possible to tell) the procedure failed, and whether 
that value is to be found in the new "b" table.

I'll try repeating the experiment.

Original comment by [email protected] on 22 Feb 2011 at 6:23

  • Added labels: Component-Logic

@GoogleCodeExporter
Copy link
Author

test case:

USE test;

DELIMITER ;

DROP TABLE IF EXISTS a;
CREATE TABLE a (
id INT UNSIGNED PRIMARY KEY
);

DELIMITER //

CREATE PROCEDURE doinsert(count INT)
BEGIN
  DECLARE counter INT UNSIGNED DEFAULT 0;
  WHILE counter < count DO
    INSERT INTO a (id) VALUES (counter);
    SET counter = counter+1;
  END WHILE;
END //

DELIMITER ;


Original comment by [email protected] on 22 Feb 2011 at 7:55

@GoogleCodeExporter
Copy link
Author

Have managed to get same results using MySQL 5.1.51, table + SP as described 
above.

Original comment by [email protected] on 22 Feb 2011 at 2:10

  • Changed state: Accepted

@GoogleCodeExporter
Copy link
Author

This becomes weirder.
I've added DELAYED logging into some MyISAM "log" table on each trigger 
invocation.
Apparently the stored procedure many times (> 50%) fails on:
ERROR 1146 (42S02): Table 'test.log' doesn't exist
even though I do nothing to rename the log table.

I do happen to TRUNCATE it *just before* issuing the procedure.
If I replace the "TRUNCATE log" with "DELETE FROM log", there is no failure on 
the log table.

With all invocation of the original problem, and for all occurances of 
described bug, there was no difference in the row data of original table & 
ghost table (I verified by avoiding dropping the original-renamed-to-arc table, 
and comparing max values with ghost-renamed-to-original table).
So my initial thought that this is a trigger failure appears ti be mistaken.

I now have an ugly suspicion, which I'm not sure how to prove, that a stored 
procedure does not like the fact a table has been truncated/renamed while (or 
just before?!? weird) the procedure is invoked.

I will try to build a test case which is entriely unrelated to 
oak-online-alter-table so as to support this claim.

Original comment by [email protected] on 25 Feb 2011 at 6:26

@GoogleCodeExporter
Copy link
Author

It might be a bug in MySQL's locking.

Original comment by baron.schwartz on 28 Feb 2011 at 11:48

@GoogleCodeExporter
Copy link
Author

Still unsure about how to prove/disprove this is a bug with MySQL (as I suspect 
it is).
Are you at all waiting on this or does work on mk-osc make this obsolete?

Original comment by [email protected] on 16 Mar 2011 at 6:31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant