Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor BaseBuilder *Batch() Methods #6536

Merged
merged 40 commits into from
Sep 27, 2022

Conversation

sclubricants
Copy link
Member

@sclubricants sclubricants commented Sep 12, 2022

This PR aligns all the *Batch() methods allowing them to share the same code.

This solves #5674

This is part of #6407

Checklist:

  • Securely signed commits
  • Component(s) with PHPDoc blocks, only if necessary or adds value
  • Unit testing, with >80% coverage
  • User guide updated
  • Conforms to style guide

@sclubricants sclubricants added database Issues or pull requests that affect the database layer refactor Pull requests that refactor code 4.3 labels Sep 12, 2022
@sclubricants
Copy link
Member Author

@codeigniter4/database-team Thoughts?

@sclubricants
Copy link
Member Author

The new setData() method allows setting data in multiple rows or just one row at a time. This way it can be used as a collector in a loop.

        $db = db_connect();

        $data = [
            ['id' => 1, 'value' => 11,],
            ['id' => 2, 'value' => 12,],
            ['id' => 3, 'value' => 13,],
        ];

        $oneRow =  ['id' => 4, 'value' => 14,];

        $moreData = [
            ['value' => 15, 'id' => 5,],
            ['value' => 16, 'id' => 6,],
            ['value' => 17, 'id' => 7,],
        ];

        $EvenMoreData = [
            ['value' => 18, 'id' => 8,],
            ['value' => 19, 'id' => 9,],
            ['value' => 20, 'id' => 10,],
        ];
        
        dd($db->table('test')->testMode()
            ->setData($data)
            ->setData($oneRow)
            ->setData($moreData)
            ->insertBatch($EvenMoreData, null, 1));
0 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (1,11)"
1 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (2,12)"
2 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (3,13)"
3 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (4,14)"
4 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (5,15)"
5 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (6,16)"
6 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (7,17)"
7 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (8,18)"
8 => string (51) "INSERT INTO `db_test` (`id`, `value`) VALUES (9,19)"
9 => string (52) "INSERT INTO `db_test` (`id`, `value`) VALUES (10,20)"

Also, if in any of the *batch() methods the batch size is set to 0 then it will process everything in one batch.

Copy link
Member

@michalsn michalsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Nice improvements.

Two things:

Changing public method variable names should be considered as a BC break. Not sure what is our current policy on this, but I would say "no" - at least to the moment, it is not listed under the breaking changes in the changelog.

I’m also not a fan of changing the exception messages to setData() has no data.. Previously messages were way more informative. If the new user will write the code:

$builder->updateBatch([], 'id');

he will have no idea what is going on. At least, this is my impression.

Of course, we need a changelog and some updates to the user guide - with code examples.

Nice work. Let’s see what others things about it.

system/Database/BaseBuilder.php Outdated Show resolved Hide resolved
system/Database/BaseBuilder.php Show resolved Hide resolved
@sclubricants
Copy link
Member Author

I’m also not a fan of changing the exception messages to setData() has no data.. Previously messages were way more informative.

This crossed my mind at one point. How about:

    public function updateBatch($set = null, $constraints = null, int $batchSize = 100)
    {
        $this->onConstraint($constraints);

        if ($set !== null && $set !== []) {
            $this->setData($set, true);
        } elseif (empty($this->QBSet)) {
            if ($this->db->DBDebug) {
                throw new DatabaseException('updateBatch() has no data.');
            }

            return false; // @codeCoverageIgnore
        }

        return $this->batchExecute('_updateBatch', $batchSize);
    }

@michalsn
Copy link
Member

Yes, that would work better.

@sclubricants sclubricants force-pushed the RefactorBaseBuilderBatchNew branch from 93776bc to f837ccb Compare September 16, 2022 16:29
@sclubricants
Copy link
Member Author

Added documentation, change log, etc.

@@ -278,7 +278,7 @@ public function testUpdateBatchConstraintsRawSqlandAlias()
[
'id' => 2,
'name' => 'Ahmadinejad Changes',
'country' => 'Iraq', // same length but different
'country' => 'Uruguay',
Copy link
Member

@kenjis kenjis Sep 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove the existing test case instead of adding a test case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not where I removed any tests. I looked back through the commits and didn't see any. Sometimes I'll rename them if I add to them. In general though I try to build on existing tests where appropriate to try and not create redundant tests. I try and get the maximum code coverage for each test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched // same length but different in UpdateTest.php, but not found.
You had a test case Iraq for same length but different, but it was removed
because Uruguay is not the same length.
Is it okay?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this test around a few times.. I was having trouble useing RawSql with the % character. Ultimately I fixed the issue and changed the test to do what I was originally wanting to do.

Copy link
Member Author

@sclubricants sclubricants Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to Uruguay and used LIKE 'U%'. It was a better test because this forced me to solve the escaping of % in RawSql.

@@ -2385,7 +2385,7 @@ protected function _updateBatch(string $table, array $keys, array $values): stri
",\n",
array_map(
static fn ($key, $value) => $key . ($value instanceof RawSql ?
' = ' . $value :
' = ' . str_replace('%', '%%', $value) :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to change the RawSql string?
I think if we change the string, it is not raw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return sprintf($sql, $data);

Have to escape % which might show up in RawSql.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use sprintf(), you must escape all % (except what you intended to replace), not only in RawSql strings.

Isn't it dangerous to use sprintf() to generate SQL strings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't think if you did it properly. It is used quite a bit already. Most of the code where % is likely to show up is in the data to be placed. I just had to escape the RawSql.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sent a bug report: #6555

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to str_replace. Its not often you use a % in table or column names but I suppose it could happen. I had added query caching to insertBatch as well so I went ahead and changed those as well.

@kenjis
Copy link
Member

kenjis commented Sep 17, 2022

Changed RawSql test
Changed RawSql test to use LIKE 'U%' This also required escaping sprintf() in _updateBatch()

This is almost what you did. We can know easily what you did from the commit,
but don't know why you changed and why it required escaping.
Please write why in a commit messages as possible.

Copy link
Member

@michalsn michalsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have some requests involving the user guide side.

Not sure if you would be up for it, but adding some context to the example queries in the user guide - what we want to achieve in every example would be very informative.

You unlocked new and powerful features to the Query Builder, and it would be a shame not to advertise it correctly. The more detailed information we provide about these queries, the more users will benefit - they need to understand how the new methods work and what opportunities they bring.

user_guide_src/source/database/query_builder/092.php Outdated Show resolved Hide resolved
@sclubricants
Copy link
Member Author

Some things I'm pondering for later include:

As in the update query you can select data into a psuedo table. A method that compiles this and gives an alias could be created.

(select 1 id, 'a name' name UNION ALL
select 2 id, 'another name' name) AS psuedo_table

We could wrap this in a RawSql object. Then this could be used anywhere instead of an actual table name. For instance with joins. You could also find a way to use it with the table() method. Then you could query the table.

The same can be done with a query:

(Select id, name FROM mytable Where id > 5) AS psuedo_table

I use queries on joins all the time as well as query queries.

These are just some things in the back of my mind. I'm trying to keep them in mind so that the things I'm doing now lend themselves to this in the future. I'd like to add upsert in first though and get all the batch methods working well together.

@sclubricants sclubricants force-pushed the RefactorBaseBuilderBatchNew branch from 3eef619 to 6513ec0 Compare September 18, 2022 17:55
Copy link
Member

@michalsn michalsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small updates to the user guide.

user_guide_src/source/database/query_builder.rst Outdated Show resolved Hide resolved
Copy link
Member

@michalsn michalsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the last round from me.

There is a missing description for the setAlias() method in the user guide. It is a public method, so we should have it in the method reference list.

I don't really like the {:_table_:} part, and I would prefer to see something like {!QUERY-PART!} or similar, but that might just be me, so.... don't feel obligated to change it.

@sclubricants
Copy link
Member Author

The reason I chose table is because this is in effect what it is. The data goes in a JOIN statement which supports joining tables.

As I mentioned above:

(select 1 id, 'a name' name UNION ALL
select 2 id, 'another name' name) AS psuedo_table

So the data that goes there could just as well be substituted for an actual table name.

Would you like if I changed to: {!TABLE-PART!} or {!TABLE-DATA!} or better yet {!DATA~TABLE!} ?

@sclubricants sclubricants force-pushed the RefactorBaseBuilderBatchNew branch from 1ef4606 to a800da0 Compare September 23, 2022 16:35
@michalsn
Copy link
Member

You know... I am not convinced of any version (even my own). Maybe kenjis will have an opinion on this - we can always stay with what we have, too.

Copy link
Member

@michalsn michalsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me. Nice work.

@kenjis kenjis added the breaking change Pull requests that may break existing functionalities label Sep 24, 2022
@sclubricants sclubricants force-pushed the RefactorBaseBuilderBatchNew branch from cf96036 to 1e5033c Compare September 27, 2022 00:51
@sclubricants sclubricants requested a review from kenjis September 27, 2022 05:08
@kenjis kenjis removed the docs needed Pull requests needing documentation write-ups and/or revisions. label Sep 27, 2022
Copy link
Member

@kenjis kenjis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@MGatner
Copy link
Member

MGatner commented Sep 27, 2022

🥳🎉🎊

@sclubricants sclubricants deleted the RefactorBaseBuilderBatchNew branch October 14, 2022 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Pull requests that may break existing functionalities database Issues or pull requests that affect the database layer enhancement PRs that improve existing functionalities refactor Pull requests that refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants