Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize out O(n) queries when updating ManyRelatedField #4917

Closed
orf opened this issue Feb 21, 2017 · 6 comments
Closed

Optimize out O(n) queries when updating ManyRelatedField #4917

orf opened this issue Feb 21, 2017 · 6 comments

Comments

@orf
Copy link
Contributor

orf commented Feb 21, 2017

Hi,
I was surprised to find that when updating a m2m relation DRF makes O(n) queries when updating ManyRelatedField instances due to this loop: https://github.com/tomchristie/django-rest-framework/blob/master/rest_framework/relations.py#L489-L498

I think it was designed this way to throw a reasonable error if a related PK does not exist, and to be as extensible as possible.

However I think the most common case of using a ManyRelatedField isn't served well by this and could be improved. The most common case would be just a ManyRelatedField with the related objects pk as the value, and the current implementation will make N queries for the models.

And to override this you would need to create a custom field and override the to_internal_value function to fetch it.

For this common case wouldn't something like this implementation be suitable, and couldn't something like this be included in DRF?

       def to_internal_value(self, data):
           if isinstance(data, type('')) or not hasattr(data, '__iter__'):
                self.fail('not_a_list', input_type=type(data).__name__)
           if not self.allow_empty and len(data) == 0:
                self.fail('empty')

           values = list(self.child_relation.get_queryset.filter(pk__in=data))

           if len(values) != len(data):
               missing_primary_keys = set(v.pk for v in values) - set(data)
               self.fail('missing_ids', ids_not_found=list(missing_primary_keys))

           return values

Just use pk__in to fetch all the models in a single query, then compare the length of the result with the length of the given data. If they don't match then we are missing a record and we can use a set operator to find out which ones and raise a validation error with the missing primary keys?

@tomchristie
Copy link
Member

Couldn't they be the same length but different values?

@orf
Copy link
Contributor Author

orf commented Apr 24, 2017

Yeah, good point, I guess you should just use the set difference from the get-go. It's more about using '__in' instead of repeated single queries though.

E.g:

           values = list(self.child_relation.get_queryset.filter(pk__in=data))
           missing_primary_keys = set(v.pk for v in values) - set(data)
           if missing_primary_keys:
               self.fail('missing_ids', ids_not_found=list(missing_primary_keys))

Or something to that effect. Perhaps I'm missing the point and there is a good reason for the repeated queries though?

@tomchristie
Copy link
Member

It's probably worth throwing up a pull request with this suggestion. If it doesn't change any API and it doesn't break any tests then it sounds like we'd be good. 👍

@orf
Copy link
Contributor Author

orf commented Apr 24, 2017

I can give it a go and see if anything fails, but I'm not sure how to handle the self.child_relation.to_internal_value call in the original code. That is missing and that would likely break something, somewhere.

@orf
Copy link
Contributor Author

orf commented Apr 24, 2017

Well, I got all the tests to pass by using an isinstance check for PrimaryKeyRelatedField. Not sure if this is the best way, using the pk_optimization stuff didn't seem to work.

carltongibson added a commit to carltongibson/django-rest-framework that referenced this issue May 17, 2017
carltongibson added a commit to carltongibson/django-rest-framework that referenced this issue May 17, 2017
carltongibson added a commit to carltongibson/django-rest-framework that referenced this issue May 17, 2017
@carltongibson
Copy link
Collaborator

Closing as a known limitation in line with #5150. (Full featured PRs welcomed.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants