Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Repeated iteration should return all records #9098

Merged
merged 2 commits into from
Jul 3, 2019
Merged

FIX: Repeated iteration should return all records #9098

merged 2 commits into from
Jul 3, 2019

Conversation

sminnee
Copy link
Member

@sminnee sminnee commented Jun 27, 2019

See #9097

This is a bugfix whose root-cause is a long-standing misunderstanding of the Iterator::rewind() API. Therefore this change is correcting that – rewind() no longer returns a value; nor does its supporting method, seek().

Is this an API change or a bugfix? Given that this is generally speaking an internal API accessed via Iterator, and our implementation is wrong, I recommend that we include it on the 4.4 branch. To put it another way, the stated API was the bug.

@sminnee sminnee changed the base branch from 4 to 4.4 June 27, 2019 23:32
@sminnee
Copy link
Member Author

sminnee commented Jun 27, 2019

To put it another way - what's more likely: that people are going to be iterating on queries more than once, or calling and that they are calling seek() and rewind() directly and expecting a result? I think the former. We can only fix one.

Arguably, I could keep PDOQuery::rewind() and seek() returning a value because of the memory-inefficient way it has been built. Happy to get feedback on that.

@sminnee
Copy link
Member Author

sminnee commented Jun 27, 2019

The other thing I could do is preserve the behaviour of seek() by

  • Defining seekWithoutReturn() on MySQLQuery and PostgreSQLQuery, marked experimental.
  • Calling that if it exists in favour of seek() (but not including it as abstract in Query)

That, and excluding our mis-implementation of Iterator::rewind() from the public API, would mean that we're not breaking a public API with this change. Bit messier code, but that seems okay. I'd put a simpler fix on master too.

@ScopeyNZ
Copy link
Contributor

ScopeyNZ commented Jun 28, 2019

Can't we just fix this by:

    public function seek($row)
    {
        if (is_object($this->handle)) {
            $this->handle->data_seek($row);
            $record = $this->nextRecord();
            $this->handle->data_seek($row);
            return $record;
        }
        return null;
    }

Yes it's weird but it's non-breaking. And hopefully the cost of seeking is negligible in this case?

@sminnee
Copy link
Member Author

sminnee commented Jun 28, 2019

And hopefully the cost of seeking is negligible in this case?

Doubtful.

@sminnee
Copy link
Member Author

sminnee commented Jun 28, 2019

But we should probably do some profiling.

@ScopeyNZ
Copy link
Contributor

I don't really think this API is well used. I'd much rather accept the patch as-is rather than introducing a seekWithoutReturn API. If the seek cost is measurable then I think it's also worth accepting this patch as is - assuming that rewind is sometimes called without it actually being necessarily required (more than it's called and then used).

@ScopeyNZ
Copy link
Contributor

Also, I'm okay with dropping the return on rewind (that's Iterator API definition that we're breaking). seek is our own API definition though.

@sminnee
Copy link
Member Author

sminnee commented Jun 28, 2019

OK well your change doubles the execution time of calling map() on a 4-records resultset 100 times.

However, for a 100 executions it goes from 0.55ms to 1.1ms. So it's probably negligible, especially given that this is a somewhat rare API. The proportionate difference was also less with 10,000 executions (74ms vs 99ms).

So I'll put Guy's suggestion in the PR.

Copy link
Contributor

@ScopeyNZ ScopeyNZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect. I think this is good.

  • Seeking is probably already rare
  • seek is our own API definition - I feel worse about breaking it so seeking twice is a nice compromise
  • Happy to remove the return in rewind because the Iterator interface declares a void return type anyway.
  • SS5 will not have this problem

@sminnee
Copy link
Member Author

sminnee commented Jun 28, 2019

This PR will be needed to get non-PDO PGSQL passing
silverstripe/silverstripe-postgresql#99

@sminnee
Copy link
Member Author

sminnee commented Jun 28, 2019

It looks like filterByCallback at least is returning the first record twice, so we're erroring the other way! :-|

TLDR: this still needs a bit of work

@sminnee
Copy link
Member Author

sminnee commented Jun 30, 2019

OK looks like the fix to MySQLStatement::rewind() has sorted this out; we'll just need a fix for PGSQL (and a linting fix)

@sminnee
Copy link
Member Author

sminnee commented Jun 30, 2019

I'm also gonna rebase this against 4 so that the php 7.4 test suite is reinstated.

…tion

API: Query::seek() and Query::rewind() no longer return a value.

Although breaking an API inside a patch release may seem odd, this in
fact is correcting a long-standing bug in our implementation of 
Iterator::rewind(), so I think it’s appropriate.

#9097
Its implementation is more naive than Query’s and leads to unnecessary
seek()ing. This causes issues with the previous commit.
@sminnee
Copy link
Member Author

sminnee commented Jul 3, 2019

OK relevant postgresql fix has been merged now and I've pushed a linting fix. I think this is ready to merge if the tests pass, but there shouldn't be any "benign" test failure now.

@sminnee
Copy link
Member Author

sminnee commented Jul 3, 2019

💚

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants