Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Database\BaseUtils::optimizeTable return values and improper use of getResultArray #4059

Open
sneakyimp opened this issue Jan 4, 2021 · 7 comments
Labels
bug Verified issues on the current code behavior or pull requests that will fix them database Issues or pull requests that affect the database layer

Comments

@sneakyimp
Copy link
Contributor

Describe the bug
It is not entirely clear what the return value of BaseUtils::optimizeTable is supposed to be, but from its usage in tests/system/Database/Live/DBUtilsTest, it appears to be cast as a boolean value. This function has a few problems.

The first is that it checks the $query value returned by $this->db->query() against false to determine success. Unless I'm missing something, that $query value will never be false because the function that handles the query, BaseConnection::query() will still instantiate and return some Database\ResultInterface with resultID set to the false value returned by the db-specific query function. I wonder if we should ever be instantiating a Result object with resultID of false?

The second problem is that this optimizeTable function immediately calls getResultArray on the ResultInterface object returned, despite the fact that the query being run isn't really the sort of query that should return a result set. While it is true that a MySQLi OPTIMIZE TABLE query does return a fetchable result array on a successful operation, the SQLSRV ALTER INDEX... query that gets run by this optimizeTable function is clearly marked among the isWriteType queries, which means the requested SQL server statement returned will not have a scrollable cursor, and any attempt to iterate it is probably ill-advised. Attempting to call sqlsrv_num_rows on such a resultID will return false, signifying an error condition.

The third problem is that optimizeTable returns current($query), where $query is the (in most cases empty) array returned from getResultArray(). Note that current([]) is false, which is probably not the desired return value for the optimizeTable function. If you run optimizeTable using SQLSRV in production mode, the return value of getResultArray() is in fact [] so that return value of optimizeTable is false. I'm guessing this is not the desired outcome.

I think we should modify the handling of the optimizeTable queries to delegate more responsibility to the various db-specific Util classes rather than trying to have so much code in common. The various databases provide pretty different responses and it seems wiser to me to handle the db-specific code in the relevant class. I also think we should have optimizeTable specifically return a boolean value of true or false and nothing else and clarify the javadoc comments.

Finally, I'm not certain but it looked like the original developer was expecting Database\BaseConnection::query() to return false if the query returns false. It does not. Instead, it instantiates a Result object in line 662https://github.com/codeigniter4/CodeIgniter4/blob/develop/system/Database/BaseConnection.php#L662. IMHO, this function returning BaseResult|Query|false is unnecessarily complicated.

CodeIgniter 4 version
This issue is present in the CI4 develop branch in the very latest code as of this writing.

Affected module(s)
system\Database\BaseUtils
system\Database*\Utils
system\Database\BaseConnection
tests\system\Database\Live\DbUtilsTest

Expected behavior, and steps to reproduce if appropriate
I think the function Database\BaseUtil::optimizeTable should return true if the table optimization succeeded and false otherwise.

To reproduce the problem, just download the latest development branch and edit your .env file to specify:

CI_ENVIRONMENT = production

database.tests.DBDriver = SQLSRV

Be sure to set the appropriate credentials to connect to your SQL Server database.

Put this code in a controller and run it:

	function bar() {
		$db = \Config\Database::connect("tests");

		$util = (new Database())->loadUtils($db);
		
		$d = $util->optimizeTable('db_job');
		var_dump($d);
	}

Despite the successful optimization query being run, the output should be:

bool(false)

You'll also get false if you run it with SQLite3. I've not tested Postgres

Context

  • OS: Ubuntu 18.04
  • Web server: CLI actually, but PHP-FPM and apache as well
  • PHP version: PHP 7.2
@sneakyimp sneakyimp added the bug Verified issues on the current code behavior or pull requests that will fix them label Jan 4, 2021
@michalsn
Copy link
Member

michalsn commented Jan 4, 2021

Thank you for looking into this. I guess we should check if other drivers behave the same way and fix it if necessary.

@sneakyimp
Copy link
Contributor Author

sneakyimp commented Jan 4, 2021

Thank you for looking into this. I guess we should check if other drivers behave the same way and fix it if necessary.

I believe there may be a problem in BaseConnection::query() at line 662 which is not evident if you are CI_ENVIRONMENT=development or otherwise have DBDebug turned on. I realize this function is absolutely vital and any adjustments could have far-ranging effects, but it will never return false unless you have DBDebug enabled. In production mode, it will always return some Result object, which becomes true when cast as a boolean. Perhaps, when simpleQuery returns false, this function should return false instead of some instance of ResultInterface with resultID=false? Unfortunately, the javadoc comments for this critical function do not make clear what its behavior should be. I'm guessing that, if the query fails we want BaseConnection::query() to return false or some value that casts as false/null/empty/0 when checked as a boolean.

EDIT: this problem is not apparent if you have DBDebug turned on.

The javadoc comments for BaseUtil::optimizeTable() are likewise unclear and there's no return type specified. Based on the only usage of this function in tests/system/Database/Live/DbUtilsTest.php, where it is cast as a boolean, it looks like we are expecting a boolean. Curiously, the testUtilsOptimizeTable test in that file asserts that the return result is false (!) for every db but MySQLi.

I concocted this test in a controller and ran it with CI_ENVIRONMENT=production:

	function test() {
		$db = \Config\Database::connect("tests");
		echo get_class($db) . "\n";

		$util = (new Database())->loadUtils($db);
		
		$d = $util->optimizeTable('db_job');
		var_dump((bool) $d);
	}

NOTE that db_job is a database that does exist in the test database. I've tested everything but Postgres:

CodeIgniter\Database\MySQLi\Connection
bool(true)

CodeIgniter\Database\SQLite3\Connection
bool(false)

CodeIgniter\Database\SQLSRV\Connection
bool(false)

If you then modify the test to try and optimize some table that does not exist:

	function test() {
		$db = \Config\Database::connect("tests");
		echo get_class($db) . "\n";

		$util = (new Database())->loadUtils($db);
		
		$d = $util->optimizeTable('table_does_not_exist');
		var_dump((bool) $d);
	}

then you get these results:

CodeIgniter\Database\MySQLi\Connection
bool(true)

CodeIgniter\Database\SQLite3\Connection
bool(false)

CodeIgniter\Database\SQLSRV\Connection
bool(false)

Basically there is no distinction between trying to optimize a table that does exist versus one that does not. I think what BaseUtil::optimizeTable is truly testing for is whether the query returns an iterable record set or not. I don't think this function should be calling getResultArray on the query result.

As a solution, I propose that the BaseUtil::optimizeTable function be modified to have a boolean return type and that it return TRUE when the respective table optimization queries run successfully and FALSE otherwise. We should clarify this in the javadoc comments also because these will guide others in its usage. To accomplish this, we will need to create db-specific code in the various db-specific Utils classes because mysqli_query still returns an iterable mysqli_result object, rather than a boolean value as the PHP docs specify. Both sqlite3_query and sqlsrv_query return FALSE.

Note the var_dump results of BaseConnection::simpleQuery() for the these db drivers :

=== CodeIgniter\Database\MySQLi\Connection ===
OPTIMIZE TABLE `table_does_not_exist`
object(mysqli_result)#48 (5) {
  ["current_field"]=>
  int(0)
  ["field_count"]=>
  int(4)
  ["lengths"]=>
  NULL
  ["num_rows"]=>
  int(2)
  ["type"]=>
  int(0)
}

OPTIMIZE TABLE `db_job`
object(mysqli_result)#48 (5) {
  ["current_field"]=>
  int(0)
  ["field_count"]=>
  int(4)
  ["lengths"]=>
  NULL
  ["num_rows"]=>
  int(2)
  ["type"]=>
  int(0)
}


=== CodeIgniter\Database\SQLite3\Connection ===
REINDEX `table_does_not_exist`
bool(false)

REINDEX `db_job`
bool(true)


=== CodeIgniter\Database\SQLSRV\Connection ===
ALTER INDEX all ON "table_does_not_exist" REORGANIZE
bool(false)

ALTER INDEX all ON "db_job" REORGANIZE
resource(123) of type (SQL Server Statement)

As you can see in the case of MySQLi, we must parse the iterable result set to find out if the result succeded or not. In the case of SQLite3 and SQLSRV, it should be sufficient ot check the result for false.

In the case of MySQLi, the result returned will depend on success or failure. Calling getResultArray on the iterable mysqli_result yields the following

=== FAILURE ===
OPTIMIZE TABLE `table_does_not_exist`
array(2) {
  [0]=>
  array(4) {
    ["Table"]=>
    string(29) "ci4_test.table_does_not_exist"
    ["Op"]=>
    string(8) "optimize"
    ["Msg_type"]=>
    string(5) "Error"
    ["Msg_text"]=>
    string(51) "Table 'ci4_test.table_does_not_exist' doesn't exist"
  }
  [1]=>
  array(4) {
    ["Table"]=>
    string(29) "ci4_test.table_does_not_exist"
    ["Op"]=>
    string(8) "optimize"
    ["Msg_type"]=>
    string(6) "status"
    ["Msg_text"]=>
    string(16) "Operation failed"
  }
}

=== SUCCESS ===
OPTIMIZE TABLE `db_job`
array(2) {
  [0]=>
  array(4) {
    ["Table"]=>
    string(15) "ci4_test.db_job"
    ["Op"]=>
    string(8) "optimize"
    ["Msg_type"]=>
    string(4) "note"
    ["Msg_text"]=>
    string(65) "Table does not support optimize, doing recreate + analyze instead"
  }
  [1]=>
  array(4) {
    ["Table"]=>
    string(15) "ci4_test.db_job"
    ["Op"]=>
    string(8) "optimize"
    ["Msg_type"]=>
    string(6) "status"
    ["Msg_text"]=>
    string(2) "OK"
  }
}

@michalsn
Copy link
Member

michalsn commented Jan 4, 2021

I guess the MySQLi driver was implemented/ported as first, so that's why we have it implemented this way. I will try to look into PostgreSQL.

The only thing I'm concerned about is introducing a BC change when we decide to return a bool value every time. But from the other side... it would be actually a bugfix for other drivers than MySQLi.

I think it would be enough to add a changelog note about such a change.

@sneakyimp
Copy link
Contributor Author

I guess the MySQLi driver was implemented/ported as first, so that's why we have it implemented this way. I will try to look into PostgreSQL.

Please note that it returns true in MySQLi whether the table exists or not, regardless of whether CI_ENVIRONMENT IS production or development. For SQLSRV and SQLite3, optimizeTable returns false for successful optimization of existing db tables and throws an exception for nonexistent ones because DBDebug is on. If DBDebug is off, optimizeTable returns false for SQLSRV and SQLite3.

I am still concerned about the behavior of BaseConnection::query() returning a Result object when the query fails and simpleQuery returns false. See line 662. If DBDebug is on, this will mercifully throw an exception, but if you are in production mode (!) then it returns a Database\ResultInterface object with resultID=false. Should I create a separate issue for that? It seems problematic to me when we are creating ResultInterface objects when resultID is a boolean.

The only thing I'm concerned about is introducing a BC change when we decide to return a bool value every time. But from the other side... it would be actually a bugfix for other drivers than MySQLi.

I don't know what a "BC change" is? Could you clarify?

I think it would be enough to add a changelog note about such a change.

I would imagine that the ChangeLog has a special process for any changes? It seems like an important document. Should one try to change this document in any pull request? Or should one request changes be made by one of you VIPs?

@michalsn
Copy link
Member

michalsn commented Jan 5, 2021

Should I create a separate issue for that? It seems problematic to me when we are creating ResultInterface objects when resultID is a boolean.

Yes, please. We will look into that later on - one issue at a time.

I don't know what a "BC change" is? Could you clarify?

https://github.com/codeigniter4/CodeIgniter4/blob/develop/contributing/pull_request.md#breaking-changes
In this case, we would change the result type of the method, but I really want to call it a bugfix.

From what I remember changelog you're referring to is generated automatically before the release and contains all changes being made. The changelog we cherish more is located in the user guide: https://github.com/codeigniter4/CodeIgniter4/blob/develop/user_guide_src/source/changelogs/v4.0.5.rst We list there all the significant changes that have been made that every developer should know about. Usually, if there is such a need, we ask the PR creator to add changes to his PR before merging.

@sneakyimp
Copy link
Contributor Author

If anyone knows of other situations where a 'write' query returns a mysqli_result instead of a boolean. It would probably be helpful to know what those are. It has been suggested on the phpdoc mailing list that this is a PHP/MySQLi bug rather than a documentation problem. Note that OPTIMIZE TABLE (and indexing in general) is beyond the scope of standard sql so an ad-hoc solution may be necessary.

@kenjis
Copy link
Member

kenjis commented Nov 30, 2023

This feature is not documented to begin with. Why?
https://codeigniter4.github.io/CodeIgniter4/database/utilities.html

I sent PR #8277

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Verified issues on the current code behavior or pull requests that will fix them database Issues or pull requests that affect the database layer
Projects
None yet
Development

No branches or pull requests

4 participants