Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different filterMany behavior when no matches found #3453

Closed
artemik opened this issue Aug 10, 2024 · 4 comments · Fixed by #3474
Closed

Different filterMany behavior when no matches found #3453

artemik opened this issue Aug 10, 2024 · 4 comments · Fixed by #3474
Assignees
Labels
Milestone

Comments

@artemik
Copy link

artemik commented Aug 10, 2024

Actual behavior (2 of them)

When filterMany clause has no matches, the behavior on whether the parent entity is returned or not is not deterministic.

Behavior 1
For example, if we take example from the docs:

List<Customer> customers =
    new QCustomer()
    .contacts.filterMany("firstName istartsWith ?", "Rob")
    .findList();

then customers without contacts having "firstName" starting with "Rob" are not returned. That's because LEFT JOIN is performed first and then WHERE is applied.

Behavior 2.1
Behavior changes if Ebean is forced to pull contacts in a separate fetch query (by adding .contacts.fetchQuery()):

List<Customer> customers =
    new QCustomer()
    .contacts.fetchQuery()
    .contacts.filterMany("firstName istartsWith ?", "Rob")
    .findList();

then all customers are returned, even those that don't have any contacts at all. That's because the first query always pulls all customers, while filtering of contacts happens in a separate query. (Note! Based on Ebean docs, due to optimization considerations, Ebean can trigger a fetch query even when user doesn't ask for it. It means behavior is very indeterministic and dangerous).

Behavior 2.2
If we don't force a fetch query, but just change filter to check for "null" on some field, for example:

List<Customer> customers =
    new QCustomer()
    .contacts.filterMany(new QContact().firstName.isNull())
    .findList();

then again all customers are returned, even those that don't have any contacts at all. That's because contacts are joined with LEFT JOIN and when there are no contacts, their columns will be NULL (that's how LEFT JOIN works), therefore "no contacts" and "contacts with null firstName" are treated the same here.

Conclusion
Such behavior difference is not intuitive and dangerous and even outside of user control (because of possiblity of Ebean triggering fetch query on its own, like I mentioned above). Behavior 1 seems like a bug.

To me, the 2nd behavior makes more sense, so that filerMany really just filters out dependent entities, but doesn't affect what main entities are returned, i.e. main entities would always be returned.

Possible fix
I see 2 options:

  • move filterMany clauses to the ON clause of LEFT JOIN, instead of WHERE clause, to guarantee that at least a single main entity is always returned, even if there are no matches. But I don't know if all cases and all DBs would support it.
  • keep filterMany clauses in WHERE with LEFT JOIN (as is now), but add OR contacts.id IS NULL which will only trigger when there are no contacts at all (since IDs are always non-null for any existing record), and will therefore pass through the customer record even when there are no joined contacts.

Both options need some thinking of course.


What's the difference between .contacts. ... and .contacts.filterMany(...)?

At first, I thought .contacts. ... is meant to filter main entity based on dependents and therefore must always use INNER JOIN. But then I saw cases where it uses LEFT JOIN as well, which doesn't make sense to me. For example:

List<Customer> customers =
    new QCustomer()
    .contacts.firstName.eq("John")
    .findList();

should mean "give me customers who have contacts with firstName John". And indeed, the resulting query applies t1.firstName = 'John' which makes the whole query return only customers having a 'John' contact. For this specific query the JOIN type doesn't matter. Bot with null checks like .contacts.firstName.isNull() a LEFT JOIN would be a problem similar to the above - it triggers false positive when there are no contacts for a customer. So why does Eben use a LEFT JOIN for such queries sometimes (as I see in the logs)? INNER JOIN seems to match the contract better and avoids such problem in the first place.

If .contacts. ... would always only use INNER JOIN for Contacts table, it would nicely fit together with filterMany contract:

  • .contacts. ... - "give me customers whose contacts are ABC"
  • .contacts.filterMany(...) - "give me customers and filter their contacts (if any) like ABC".

On a related note - on https://ebean.io/docs/query/where page under "Notes on join type" there seems to sound a misconception similar to the problems above - it seems to say that if a join is LEFT JOIN, then its corresponding WHERE clause parts will be optional, but it's not.


Is it possible to fix it somehow? At the very least, I think there should be a big note in the docs: https://ebean.io/docs/query/filterMany

Ebean 15.5.0


P.S. Also, it's often needed to do both - return customers having specific contacts (.contacts. ...), and load only these same specific contacts for them (.contacts.filterMany(...)). Currently it requires duplicating filtering logic and making 2 JOINS. There should be a way to do both things with one JOIN. Maybe there should be some strictFilterMany() function.

@artemik artemik changed the title Clarify filterMany behavior when no matches found Different filterMany behavior when no matches found Aug 11, 2024
@artemik
Copy link
Author

artemik commented Aug 11, 2024

I updated the ticket with more details.

@rbygrave
Copy link
Member

Behavior 1 seems like a bug.

Yes it does. I'll review this (look if we can we move the predicate to be part of the join or not)

@artemik
Copy link
Author

artemik commented Aug 14, 2024

@rbygrave thanks for your attention on this, and thanks for a great library!

@rbygrave
Copy link
Member

WRT

.contacts.firstName.eq("John")
... And indeed, the resulting query applies t1.firstName = 'John'

If you look closer you see that it does not use the t1 alias for this predicate. For this predicate against a ToMany path ebean will use u* table alias to support the predicate and not the usual t* table alias. That is, we see:

select /* QCustomerTest.isEmpty */ distinct t0.id, t0.status, ... 
from customer t0 
join contact u1 on u1.customer_id = t0.id 
where u1.first_name like ? escape'|'

Note that the alias here is u1 and not a t1. The u1 tells us it's a predicate on a ToMany path.

The concept here is that ebean can use t* table alias & joins for both projection and predicates and we see this for ToOne paths. For ToMany paths (ignoring filterMany) ebean uses u* table alias and joins to support predicates on these ToMany paths [and can't use these joins and table alias for projection].

For example:

    var cu = QCustomer.alias();
    var co = QContact.alias();

    new QCustomer()
      .select(cu.status)
      .contacts.fetch(co.firstName, co.lastName)
      .contacts.firstName.startsWith("Rob")
      .findList();

We see 2 joins to the contact table, a t1 for the projection and a u1 for the predicate [and this is because its a ToMany path]

select distinct t0.id, t0.status, 
    t1.id, t1.first_name, t1.last_name                         -- t1 used for projection
from customer t0 
left join contact t1 on t1.customer_id = t0.id         -- t1 projection
join contact u1 on u1.customer_id = t0.id              -- u1 predicate
where u1.first_name like ? escape'|' 
order by t0.id;

So if you look at that sql again, you should see the use of u* table alias for predicates on any properties that are on a ToMany path relative to the root.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants