Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constraints on eager-loaded hasMany doesn't retrieve data correctly #16217

Closed
alexc-hollywood opened this issue Nov 1, 2016 · 20 comments
Closed

Comments

@alexc-hollywood
Copy link

alexc-hollywood commented Nov 1, 2016

  • Laravel Version: 5.2.x
  • PHP Version: 7
  • Database Driver & Version: MySQL 5.7

Description:

Constraint on eager-load only retrieves 1 record, when ->take(10) is specified.

Steps To Reproduce:

  • Each User model has their instagram data updated daily, and inserted into an AccountData model:
    User::hasMany(AccountData::class);
  • User has at least 400 sibling update records inserted. Going through every user to perform updates kills memory, some have up to 2000. Only the last/previous record is needed, so the hasMany relation is constrained to take only the last 10 or so (to be safe).

Example code, before constraint:

User::with(['account_data'])->chunk(10, function ($users) {
  foreach ($users as $user) {
    // do something with $user->account_data->last(). 
   // could have 400+ account_data
  }
}

This will produce 400 records in the hasMany relation.

Then with optimized version:

User::with([
  'account_data' => function($query) {
    $query->orderBy('created_at', 'DESC')->take(10);
  }])->chunk(10, function ($users) {
  foreach ($users as $user) {
    // zero records, 1 record, or the first ever record
    // account_data->count() is 1
  }
}

This is expected to eager load the previous 10 records for each user, but only ever returns a) no records, or b) the first ever record. The result is unpredictable.

When manually pulling the user's relation count, the results are consistent:

// test here is:
$user->account_data->count(); // eager-loaded
$user->account_data()->count(); // go to db
Eager-loaded count is 1 data records.
Forced count from DB is 3 records.

Eager-loaded count is 0 data records.
Forced count from DB is 0 records.

Eager-loaded count is 0 data records.
Forced count from DB is 0 records.

Eager-loaded count is 4 data records.
Forced count from DB is 552 records.

Eager-loaded count is 1 data records.
Forced count from DB is 3 records.

Eager-loaded count is 1 data records.
Forced count from DB is 29 records.

Eager-loaded count is 0 data records.
Forced count from DB is 0 records.

Edit: SQL generated by DB::getQueryLog() is:

array (
  'query' => 'select * from "account_data" where "account_data"."user_id" in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) and "account_data"."deleted_at" is null order by "created_at" desc limit 10',
  'bindings' => 
  array (
    0 => 128,
    1 => 115,
    2 => 116,
    3 => 48,
    4 => 129,
    5 => 130,
    6 => 69,
    7 => 125,
    8 => 66,
    9 => 120,
  ),
  'time' => 39.350000000000001,
),

Say there are 1000 users being chunked - appears that the eager load constraint is being applied to the child records, rather than for each user individually, i.e. to gets the last 10 records from the hasMany relation for each parent item.

Is this a case of me not applying the constraint correctly?

@themsaid
Copy link
Member

themsaid commented Nov 2, 2016

Does it work as expected without chunk? When you use get() instead do you get users with 10 records each?

@themsaid
Copy link
Member

themsaid commented Nov 3, 2016

ping @azcoppen

@srmklive
Copy link
Contributor

srmklive commented Nov 4, 2016

@azcoppen Wouldn't this work in your User model instead of you using the chunk method?

public function account_data()
{
    return $this->hasMany(AccountsData::class)->orderBy('created_at','desc')->take(10);
}

@chrissm79
Copy link

I was just about to create an issue for a very similar issue/request. What we want is something like the following (in SQL)

SELECT `account_data`.* FROM (
    (SELECT * FROM `account_data` WHERE `account_data`.`user_id` = 1 ORDER BY `created_at` DESC LIMIT 10) 
    UNION ALL 
    (SELECT * FROM `account_data` WHERE `account_data`.`user_id` = 2 ORDER BY `created_at` DESC LIMIT 10) 
    UNION ALL 
    (SELECT * FROM `account_data` WHERE `account_data`.`user_id` = 3 ORDER BY `created_at` DESC LIMIT 10) 
) AS `account_data`

And have it map to our user eloquent models in the collection. Instead, what we get is:

select * from `account_data` where `account_data`.`user_id` in (?, ?, ?) limit 10

I've never worked with the Query Builder before, but I assume it's a tweak that needs to take place somewhere in there. Does anyone know where to start looking?

@themsaid
Copy link
Member

themsaid commented Nov 6, 2016

@chrissm79 please use laravel code, it's hard to tell what your laravel issue is looking at sql queries only :)

@chrissm79
Copy link

@themsaid My Laravel issue is the same as the original post, I was just showing a possible solution (in SQL, as I don't know how Laravel can do this using Eloquent). But here is a breakdown:

  • Laravel Version: 5.3.x
  • PHP Version: 7.x
  • Database Driver & Version: MySQL 5.7

Description:

Let's say I want to eager load the first 10 Posts for each User. Many people think the following would work:

User::with(['posts' => function ($q) {
    $q->take(10);
})->get();

However, what happens is that you get the first 10 posts total instead of the first 10 posts for each user. The problem is that Laravel eager loads this relationship using a WHERE IN rather than aUNION. I'm not sure how to produce the following using Laravel's Query Builder, but the SQL statement below would produce the results we want (10 posts for each user):

SELECT `posts`.* FROM (
    (SELECT * FROM `posts` WHERE `posts`.`user_id` = 1 LIMIT 10) 
    UNION ALL 
    (SELECT * FROM `posts` WHERE `posts`.`user_id` = 2 LIMIT 10) 
    UNION ALL 
    (SELECT * FROM `posts` WHERE `posts`.`user_id` = 3 LIMIT 10) 
) AS `posts`

If Laravel used this type of UNION we could actually eager load just the first 10 posts for each user.

If I can figure out how to pull this off using just the Query Builder I'll add a new comment with the solution.

Steps To Reproduce:

User::with(['posts' => function ($q) {
    $q->take(10);
})->get();

If there are 3 users in the DB, then we want to get a total of 30 posts (10 per each user). But, the code above only gets 10 posts total.

@chrissm79
Copy link

@themsaid Got it working for HasMany (just haven't tested other types of relationships yet). Created a Collection Macro called fetch. It allows me to lazy eager load and limit the amount of related models per parent model.

Note: this does not work in Sqlite

So now this will on grab the first two posts per user (instead of the first two posts total)

$users->fetch(['posts' => function ($q, $user) {
    $q->take(2);
}]);

Collection Macro:

Collection::macro('fetch', function ($relations) {
    if (count($this->items) > 0) {
        if (is_string($relations)) {
            $relations = [$relations];
        }

        $query = $this->first()->newQuery()->with($relations);
        $builder = new FetchQueryBuilder;
        $this->items = $builder->eagerLoadRelations($query, $this->items);
    }
    return $this;
});

Modified Query Builder:

use Closure;
use Illuminate\Database\Eloquent\Builder;

class FetchQueryBuilder
{
    /**
     * Eager load relationships on collection.
     *
     * @param  Builder $builder
     * @param  array   $models
     * @return array
     */
    public function eagerLoadRelations(Builder $builder, array $models)
    {
        foreach ($builder->getEagerLoads() as $name => $constraints) {
            if (strpos($name, '.') === false) {
                $models = $this->loadRelation($builder, $models, $name, $constraints);
            }
        }

        return $models;
    }

    /**
     * Eagerly load the relationship on a set of models.
     *
     * @param  Builder  $builder
     * @param  array    $models
     * @param  string   $name
     * @param  Closure  $constraints
     * @return array
     */
    protected function loadRelation(Builder $builder, array $models, $name, Closure $constraints)
    {
        $queries = collect($models)->map(function ($model) use ($builder, $name, $constraints) {
            $relation = $builder->getRelation($name);

            $relation->addEagerConstraints([$model]);

            call_user_func_array($constraints, [$relation, $model]);

            return $relation;
        });

        $bindings = $queries->map(function ($query) {
            return $query->getBindings();
        })->collapse()->toArray();

        $sql = $queries->map(function ($query) {
            return '(' . $query->toSql() . ')';
        })->implode(' UNION ALL ');

        $relatedModel = $queries->first()->getModel();
        $table = $relatedModel->getTable();

        $fetch = \DB::select("SELECT `{$table}`.* FROM ({$sql}) AS `{$table}`", $bindings);
        $results = $relatedModel->hydrate($fetch, $relatedModel->getConnectionName())->all();
        $relation = $builder->getRelation($name);

        return $relation->match($models, $relatedModel->newCollection($results), $name);
    }
}

@alexc-hollywood
Copy link
Author

@themsaid sorry didn't see the ping - thanks for getting back to me on it. @chrissm79 has put it more elegantly than i could, and that fix is a great solution! I see the same results using chunk(), get(), and absolutely nothing with load(). The issue has come up repeatedly with a client who is getting frustrated. The site is growing fast, and requires a lot of scheduler work - but the longer a user lives in the system, the more data they generate. Once you hit a certain point, memory just crashes, so eager-loading and chunking only the latest data is essential to keep things working smoothly.

The expected result i want when requesting the last 10 posts for 400 users is a recordset of 4000 rows, as pointed out. Going to test that hasMany workaround with a new relation.

I guess my original question is answered though, which is whether i was expecting the wrong thing.

@chrissm79
Copy link

@azcoppen @themsaid I've updated the modified query builder to handle the belongToMany relationship. Not sure if this takes care of the remaining relationships or not... just using it for my own use case (GraphQL). I'm also not a fan of using the ReflectionMethod, but since the methods I need access to are protected, it seemed like the lesser of two evils (rather than refactoring the BelongToMany relationship).

If I can find some time this weekend, I'll create some tests and check out the other relationships to see if everything still works and possibly submit a PR. Not sure what to do about SQLite in this case though since it won't accept the supplied query.

Usage

$users = User::all();

$users->fetch(['posts' => function ($q, $user) {
    $q->take(2);
}]);

Collection Macro

Collection::macro('fetch', function ($relations) {
    if (count($this->items) > 0) {
        if (is_string($relations)) {
            $relations = [$relations];
        }

        $query = $this->first()->newQuery()->with($relations);
        $this->items = app(QueryBuilder::class)->eagerLoadRelations($query, $this->items);
    }
    return $this;
});

Updated Query Builder

use Closure;
use ReflectionMethod;
use Illuminate\Database\Eloquent\Builder;
use Illuminate\Database\Eloquent\Relations\Relation;
use Illuminate\Database\Eloquent\Model;

class QueryBuilder
{
    /**
     * Eager load relationships on collection.
     *
     * @param  Builder $builder
     * @param  array   $models
     * @return array
     */
    public function eagerLoadRelations(Builder $builder, array $models)
    {
        foreach ($builder->getEagerLoads() as $name => $constraints) {
            if (strpos($name, '.') === false) {
                $models = $this->loadRelation($builder, $models, $name, $constraints);
            }
        }

        return $models;
    }

    /**
     * Eagerly load the relationship on a set of models.
     *
     * @param  Builder  $builder
     * @param  array    $models
     * @param  string   $name
     * @param  Closure  $constraints
     * @return array
     */
    protected function loadRelation(Builder $builder, array $models, $name, Closure $constraints)
    {
        $relation = $builder->getRelation($name);
        $queries = $this->getQueries($builder, $models, $name, $constraints);
        $related = $queries->first()->getModel();

        $bindings = $queries->map(function ($query) {
            return $query->getBindings();
        })->collapse()->toArray();

        $sql = $queries->map(function ($query) {
            return '(' . $query->toSql() . ')';
        })->implode(' UNION ALL ');

        $table = $related->getTable();
        $results = \DB::select("SELECT `{$table}`.* FROM ({$sql}) AS `{$table}`", $bindings);
        $hydrated = $this->hydrate($related, $relation, $results);

        return $relation->match($models, $related->newCollection($hydrated), $name);
    }

    /**
     * Get queries to fetch relationships.
     *
     * @param  Builder  $builder
     * @param  array    $models
     * @param  string   $name
     * @param  Closure  $constraints
     * @return array
     */
    protected function getQueries(Builder $builder, array $models, $name, Closure $constraints)
    {
        return collect($models)->map(function ($model) use ($builder, $name, $constraints) {
            $relation = $builder->getRelation($name);

            $relation->addEagerConstraints([$model]);

            call_user_func_array($constraints, [$relation, $model]);

            if (method_exists($relation, 'getSelectColumns')) {
                $r = new ReflectionMethod(get_class($relation), 'getSelectColumns');
                $r->setAccessible(true);
                // Not sure if this is absolutely correct...
                $select = $r->invoke($relation, ['*']);
                $relation->addSelect($select);
            }

            $relation->initRelation([$model], $name);

            return $relation;
        });
    }

    /**
     * Hydrate related models.
     *
     * @param  Model    $related
     * @param  Relation $relation
     * @param  array    $results
     * @return array
     */
    protected function hydrate(Model $related, Relation $relation, array $results)
    {
        $models = $related->hydrate($results, $related->getConnectionName())->all();

        if (count($models) > 0 && method_exists($relation, 'hydratePivotRelation')) {
            $r = new ReflectionMethod(get_class($relation), 'hydratePivotRelation');
            $r->setAccessible(true);
            $r->invoke($relation, $models);
        }

        return $models;
    }
}

@spawnia
Copy link
Contributor

spawnia commented Sep 16, 2018

@chrissm79 While investigating nuwave/lighthouse#319 i now came across this Issue. It seems that i have come full circle now, as you already solved this.

@everyone You can view the newest version of this over at https://github.com/nuwave/lighthouse/

@themsaid I would love this to become part of Laravel, as i think it would make the behaviour of eager loading more intuitive. How can we go about getting this into the Core?

spawnia added a commit to nuwave/lighthouse that referenced this issue Sep 16, 2018
**Related Issue(s)**

Adds test for #319 
Reverts misguided WIP #320 

laravel/framework#16217

**PR Type**

Test/Fix

**Changes**

- Add test to ensure the global scope is called when using the @hasmany directive
- Add TODO for removing custom fetch logic once Laravel resolves their issue with eager loading

**Breaking changes**

Nope
@mfn
Copy link
Contributor

mfn commented Sep 17, 2018

How can we go about getting this into the Core?

There was only one PR attempt which was done 2 years ago so I bet it's safe to see this needs a PR for a start… no?

@yaquawa
Copy link
Contributor

yaquawa commented Oct 8, 2018

OMG… Why this issue has not been fixed yet??
It has been 2 years…

@staudenmeir
Copy link
Contributor

I released the rejected PR as a package: https://github.com/staudenmeir/eloquent-eager-limit

@giovannipds
Copy link

giovannipds commented Jan 7, 2019

Well, hello there. I may be dealing with something related, you guys can probably know that.
I know this is not a very common use case but I tried for a while enforce always returning relationship (like a default or something) but then eager loading broke up. Check it out:

<?php
namespace App\Models;
use Illuminate\Database\Eloquent\Model;

class Agent extends Model
{
    public function shipping_prices() {
        return $this->hasMany('App\Models\ShippingPrice')
                    ->orWhere('agent_id', 1);
    }
}

And then some logic on tinker to think about:

Agent::find(1)->shipping_prices->count() # 48
Agent::find(18)->shipping_prices->count() # 48
Agent::where('id', 1)->with('shipping_prices')->first()->shipping_prices->count() # 48
Agent::where('id', 18)->with('shipping_prices')->first()->shipping_prices->count() # 0

Why this is not correct? Is this connected to this issue?
I was expecting the last rule above should return 48 too but seems to not be the case...
Laravel 5.6.* here.

@staudenmeir
Copy link
Contributor

@giovannipds Your issue is not related to this thread.

It's not possible to use a relationship with an orWhere() like this.

@giovannipds
Copy link

@staudenmeir thanks anyway!

@derekmd
Copy link
Contributor

derekmd commented Jan 19, 2019

Duplicate issues:

As noted by @staudenmeir, his pull request was declined by Taylor due to the maintenance burden for a solution not supported by all databases that Laravel projects run.

Since this problem has come up so much and it can now be fixed by a third-party package, maybe the best solution is to update https://laravel.com/docs/5.7/eloquent-relationships#eager-loading under heading "Constraining Eager Loads" to clearly communicate that limit() and take() aren't supported unless that package is installed and MySQL, etc. are the most recent version?

@staudenmeir
Copy link
Contributor

@derekmd Naturally, I think that's a good idea. Would you write it?

We could also throw an exception (that links to the documentation) if someone tries to apply a limit to their relationship when eager loading.

@driesvints
Copy link
Member

Closing this as the related PR has been rejected and there is a package to fix it. Thanks for the update to the docs @derekmd.

@staudenmeir
Copy link
Contributor

Eager loading with limit will be supported natively in Laravel 11 (#49695).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests