Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicated entities in findMany when a ManyToOne relation is fetched (Postgresql) #3466

Closed
apflieger opened this issue Aug 31, 2024 · 11 comments · Fixed by #3469
Closed

Duplicated entities in findMany when a ManyToOne relation is fetched (Postgresql) #3466

apflieger opened this issue Aug 31, 2024 · 11 comments · Fixed by #3469
Assignees
Labels
Milestone

Comments

@apflieger
Copy link

apflieger commented Aug 31, 2024

Expected behavior

When a ManyToOne relationship is fetched using a join on the root query, ebean handles the deduplication of the rows so Finder.query().findList() returns uniques entities by their ids. This is done here https://github.com/ebean-orm/ebean/blob/master/ebean-core/src/main/java/io/ebeaninternal/server/query/CQuery.java#L433 (I spent some time to debug this)
This code assumes that rows with the same root id avec grouped. But... I have a case where postgres doesn't respect that.

Actual behavior

findList() returns duplicated entities is such case.

I narrowed down the request done by ebean:

select a.id,
       t4.id,
       p.id
from attendance a
         left join ticket t4 on t4.id = a.ticket_id
         left join payment p on p.ticket_id = t4.id
where (teacher_amount > 0 and transfer_method = 'UNDEFINED')

And postgres returns this:
Capture d’écran 2024-08-31 à 15 27 44

Versions

We use Postgresql 16.4
Ebean 15.5.0

Some context

This problem appeared suddenly, I can't find what triggered it. We didn't upgrade postgres around these dates nor Ebean...
It is actually quite critical, it messed up our payment system, customers got billed multiple times for the same item 😬

@rbygrave
Copy link
Member

Ok, that sucks. I presume you have added in an orderBy clause at this stage for a workaround.

@rbygrave
Copy link
Member

Are you able to produce some sort of test case? Or the query with the issue? Ideally with the SQL generated? As in, a ManyToOne [or OneToOne] only produces one [or none] rows for the join ... so I'm thinking there must have also been a ToMany involved in the query?

@rbygrave
Copy link
Member

rbygrave commented Sep 1, 2024 via email

@apflieger
Copy link
Author

apflieger commented Sep 1, 2024

Right, it's a ToMany that is eagerly fetched. Here are the entities concerned:

class Ticket  {

    @OneToMany
    private List<Payment> payments;
}

This is the query. (The fetch group is a bit more complex because there are more ToOne paths

new Finder<>(Attendance.class).query()
    .select(FetchGroup.of(Attendance.class)
            .fetch("ticket", FetchGroup.of(Ticket.class)
                    .fetch("payments", FetchGroup.of(Payment.class)
                            .build())
                    .build())
            .build())
    .where()
    .and()
    .gt("teacher_amount", 0)
    .eq("transfer_method", EPaymentMethod.UNDEFINED)
    .endAnd()
    .findList();

The SQL generated is what I wrote in the first message. I just removed many columns and the other ToOne joins. There's no ordering.

It's like Postgres changed the ordering suddenly.

@rbygrave
Copy link
Member

rbygrave commented Sep 1, 2024

When there is a ToMany path, Ebean should automatically add an orderBy clause if one is not already defined, and if one is defined it will check that it includes the @Id property.

That is, we should see an order by clause here (added by ebean due to the fetch of a ToMany path, in this case ticket.payments).

The internal code that does this is the CQueryPredicates.deriveOrderByWithMany() method.

There are a couple of odd things in the query above.

  1. The and() and endAnd() are not required at all. That is, the top level expression list adds predicates with AND so those could be left out altogether.
  2. The teacher_amount and transfer_method look like column names instead of property names. I would have expected those to be teacherAmount and transferMethod. You can get away with this for this case but it isn't really ideal - the property names would include the correct table alias so more explicit and better.
  3. The nest FetchGroup is ok'ish but its unclear why a simple single FetchGroup is used here instead.

Instead I would expect to see:

new Finder<>(Attendance.class).query()
    .fetch("ticket")
    .fetch("ticket.payments")
    .where()
    .gt("teacherAmount", 0)
    .eq("transferMethod", EPaymentMethod.UNDEFINED)
    .findList();

@rbygrave
Copy link
Member

rbygrave commented Sep 1, 2024

Definitely a bug here though ...

@apflieger
Copy link
Author

Noted for the column names.
Our code that build FetchGroups is more complex. We have an abstraction to define fetch paths and columns, it was simpler to use FetchGroup to build this. So far is works very well.

I'll debug a bit more with this in mind. Thank you.

@artemik
Copy link

artemik commented Sep 1, 2024

Postgres ordering could've been changed due to a different SQL plan triggering. Anyway, yes, sounds like Ebean bug. SQL without "order by" of course doesn't guarantee any order.

@rbygrave I think FetchGroup usage that @apflieger provided here is just an expanded version for the sake of this ticket. Likely they instead reference it by select(fetchGroup) from somewhere. Just saying that, because I like that feature (combination of FetchGroups) and would like to continue seeing it working in Ebean (in case for some reason it would turn up to be the cause here).

@apflieger
Copy link
Author

@rbygrave I think FetchGroup usage that @apflieger provided here is just an expanded version for the sake of this ticket. Likely they instead reference it by select(fetchGroup) from somewhere. Just saying that, because I like that feature (combination of FetchGroups) and would like to continue seeing it working in Ebean (in case for some reason it would turn up to be the cause here).

Yes.

When there is a ToMany path, Ebean should automatically add an orderBy clause if one is not already defined, and if one is defined it will check that it includes the @id property.

If I understand the code correctly, this behaviour seems to consider ToMany relationships only on the root entity of the query. In my case, it's not on the root. We have Attendance -> ToOne ticket -> ToMany Payment. Thus CQueryPredicates.deriveOrderByWithMany() gets a null parameter (no ToMany props found in the BeanDescriptor) and no order by is added.

rbygrave added a commit that referenced this issue Sep 2, 2024
…fetched (Postgresql)

Internally in OrmQueryDetail.markQueryJoins() the SpiQueryManyJoin is determined and returned. This change uses this and effectively removes the code from OrmQueryRequest.determineMany() and BeanDescriptor.manyProperty() which was the source of this issue (performing a similar task but not correctly taking the full path into account and hence the source of this bug).
@rbygrave rbygrave self-assigned this Sep 2, 2024
@rbygrave rbygrave added the bug label Sep 2, 2024
@rbygrave rbygrave added this to the 14.5.1 milestone Sep 2, 2024
rbygrave added a commit that referenced this issue Sep 2, 2024
#3466 - Duplicated entities in findMany when a ManyToOne relation is fetched
@apflieger
Copy link
Author

That was quick. Thank you very much

@rbygrave
Copy link
Member

rbygrave commented Sep 2, 2024

If I understand the code correctly, ...

Yes, you were understanding that all correctly etc. Good stuff!!

@rbygrave rbygrave added the XL label Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants