Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out of memory error when using queryset iterator #256

Open
lucasmoeskops opened this issue May 7, 2024 · 2 comments · May be fixed by #263
Open

Out of memory error when using queryset iterator #256

lucasmoeskops opened this issue May 7, 2024 · 2 comments · May be fixed by #263

Comments

@lucasmoeskops
Copy link

lucasmoeskops commented May 7, 2024

What happened?

A search engine indexer module uses a queryset iterator to iterate over all the records to index. This sends a sql query to the database to fetch all data, but also to iterate over it and not to actually fetch all of this. Cachalot intercepts the request and seems to ignore the fact that it is going to iterate, instead fetching all the data from the table. This caused on out of memory error on the server.

What should've happened instead?

Cachalot should understand that the query was sent by an iterator and just let it query the database the way it did OR have support for this iteration process and get the right data from the cache somehow.

Steps to reproduce

OS -> Ubuntu
Django version -> 4.2.9
Database -> Postgres
django-cachalot -> 2.6.2

@lucasmoeskops lucasmoeskops changed the title Bug when using queryset iterator Out of memory error when using queryset iterator May 7, 2024
@danlamanna
Copy link
Contributor

danlamanna commented Aug 28, 2024

I've also run into this because cachalot wants to store the entire result set in local memory (and then in redis). I think it might make sense to be more opinionated around iterators. Given .iterator is pretty common in larger django projects, it's perhaps not ideal to have to wrap every block in with cachalot_disabled()....

@Andrew-Chen-Wang Would a PR making a modification similar to this be considered?

diff --git a/cachalot/monkey_patch.py b/cachalot/monkey_patch.py
index 37b2932..ebb7e59 100644
--- a/cachalot/monkey_patch.py
+++ b/cachalot/monkey_patch.py
@@ -1,4 +1,6 @@
+import logging
 import re
+import types
 from collections.abc import Iterable
 from functools import wraps
 from time import time
@@ -19,6 +21,7 @@ from .utils import (
     UncachableQuery, is_cachable, filter_cachable,
 )
 
+logger = logging.getLogger(__name__)
 
 WRITE_COMPILERS = (SQLInsertCompiler, SQLUpdateCompiler, SQLDeleteCompiler)
 
@@ -62,6 +65,13 @@ def _get_result_or_execute_query(execute_query_func, cache,
                 pass
 
     result = execute_query_func()
+
+    if result.__class__ == types.GeneratorType:
+        logger.info(
+            "Skipping caching of a generator due to potentially large result set."
+        )
+        return result
+
     if result.__class__ not in ITERABLES and isinstance(result, Iterable):
         result = list(result) 

@Andrew-Chen-Wang
Copy link
Collaborator

creating a new setting and a short FAQ of a result of this would be great

danlamanna added a commit to ImageMarkup/isic that referenced this issue Aug 29, 2024
danlamanna added a commit to ImageMarkup/isic that referenced this issue Aug 29, 2024
@danlamanna danlamanna linked a pull request Sep 3, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants