Fix OOM due to large prompt cache #39

Merged
merged 1 commit into from
Feb 27, 2024

Conversation

joerunde
Collaborator

Motivation

OOM errors could be caused either by filling the prefix cache with prefixes or by sending a large number of requests with very large prefixes.

Modifications

This handles the OOM problem with large prefixes by both:

  • Taking the max prefix cache size into account when running the memory usage estimator, to ensure a full prefix cache does not cause an OOM
  • Taking the prefix length into consideration when deciding if a request will fit into a batch, to avoid large prefixes causing unexpected large memory allocations
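The two checks above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: it assumes memory can be approximated as a flat token budget, and the names `Request`, `usable_budget`, `request_cost` and `fits_in_batch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical request shape; all sizes are in tokens."""
    input_tokens: int    # tokens in the user's prompt
    prefix_tokens: int   # tokens contributed by the prompt prefix
    max_new_tokens: int  # upper bound on generated tokens


def usable_budget(total_token_budget: int, max_prefix_cache_tokens: int) -> int:
    """Reserve room for a completely full prefix cache up front.

    Assumption: the estimator works in tokens and the cache's worst-case
    footprint can be subtracted directly from the total budget.
    """
    return max(total_token_budget - max_prefix_cache_tokens, 0)


def request_cost(req: Request) -> int:
    """Count the prefix length alongside the prompt and the generated tokens."""
    return req.prefix_tokens + req.input_tokens + req.max_new_tokens


def fits_in_batch(batch: List[Request], candidate: Request, budget: int) -> bool:
    """Admit the candidate only if the whole batch, prefixes included, fits."""
    used = sum(request_cost(r) for r in batch)
    return used + request_cost(candidate) <= budget
```

With a total budget of 10000 tokens and 2000 reserved for the cache, a batch already holding a request with a 3000-token prefix would reject a second request with a similar prefix, even though the same request without a prefix would be admitted.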

This includes an API-breaking change to the config: the prefix cache is no longer enabled unless a user explicitly sets PREFIX_STORE_PATH to a non-empty value.
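As a minimal sketch of the opt-in behaviour, assuming PREFIX_STORE_PATH is read as an environment variable (the helper name `prefix_cache_enabled` is hypothetical and not taken from the TGIS code):

```python
import os
from typing import Mapping, Optional


def prefix_cache_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when PREFIX_STORE_PATH is set to a non-empty value.

    An unset or empty variable leaves the prefix cache disabled.
    """
    env = os.environ if env is None else env
    return bool(env.get("PREFIX_STORE_PATH", ""))
```

Deployments that relied on the prefix cache being available without this setting would need to set the variable explicitly after this change.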

Result

It is now difficult to cause an OOM by loading TGIS with requests for many different prefixes or with requests that use large prefixes.

Related Issues

Signed-off-by: Joe Runde <[email protected]>
Collaborator

@maxdebayser maxdebayser left a comment

LGTM

@joerunde joerunde merged commit 12d9106 into main Feb 27, 2024
7 checks passed
@joerunde joerunde deleted the oom-fix branch February 27, 2024 22:13
@ckadner ckadner mentioned this pull request Feb 28, 2024
10 tasks
3 participants