fix: rate limit retry with exponential backoff #86
Conversation
Summary by GPT-4
I have created a new utility function called `_retry_with_exponential_backoff` in a new file named `utils.py`. This function takes the current retry count and an optional `retry_after` parameter. It calculates the wait time using exponential backoff and sleeps for that duration before retrying.
Here's the code for the new `utils.py` file:
"""Utility functions"""
import logging
import time
from typing import Optional
import gpt_review.constants as C
def _retry_with_exponential_backoff(current_retry: int, retry_after: Optional[str]) -> None:
"""
Use exponential backoff to retry a request after specific time while staying under the retry count
Args:
current_retry (int): The current retry count.
retry_after (Optional[str]): The time to wait before retrying.
"""
logging.warning("Call to GPT failed due to rate limit, retry attempt %s of %s", current_retry, C.MAX_RETRIES)
multiplication_factor = 2 * (1 + current_retry / C.MAX_RETRIES)
wait_time = int(retry_after) * multiplication_factor if retry_after else current_retry * multiplication_factor
logging.warning("Waiting for %s seconds before retrying.", wait_time)
time.sleep(wait_time)
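For a rough sense of the delays this produces, here is a small illustration. The `MAX_RETRIES` value of 3 and the `Retry-After` value of `"5"` are assumptions for the example only; the real constant comes from `gpt_review.constants`:

```python
# Illustration only: MAX_RETRIES = 3 and Retry-After = "5" are assumed values.
MAX_RETRIES = 3

for current_retry in range(MAX_RETRIES):
    multiplication_factor = 2 * (1 + current_retry / MAX_RETRIES)
    wait_time = int("5") * multiplication_factor
    print(f"retry {current_retry}: wait ~{wait_time:.2f}s")

# retry 0: wait ~10.00s
# retry 1: wait ~13.33s
# retry 2: wait ~16.67s
```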
And here's how I modified the `_call_gpt` function in `_openai.py`:
```python
from gpt_review.utils import _retry_with_exponential_backoff


def _call_gpt(
    # ...
):
    try:
        # ...
        return completion.choices[0].message.content  # type: ignore
    except RateLimitError as error:
        if retry < C.MAX_RETRIES:
            _retry_with_exponential_backoff(retry, error.headers["Retry-After"])
            return _call_gpt(prompt, temperature, max_tokens, top_p, frequency_penalty, presence_penalty, retry + 1)
        raise RateLimitError("Retry limit exceeded") from error
```
This change should make it easier to manage retries with exponential backoff.
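For reference, here is a minimal caller-side sketch of what exhausting the retries looks like. The `openai.error` import path and the argument values are assumptions for the example, not part of this PR:

```python
from openai.error import RateLimitError  # assumed import path (openai < 1.0 style)

try:
    review = _call_gpt(
        prompt="Review this diff",
        temperature=0.0,
        max_tokens=500,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
    )
except RateLimitError:
    # Only raised after C.MAX_RETRIES attempts have been exhausted.
    review = "GPT review unavailable: rate limit exceeded."
```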
Suggestions
The changes in this PR look good overall. However, I have a few suggestions to improve the code:
- In the `_retry_with_exponential_backoff` function, it would be better to use `float` instead of `int` for the `wait_time`. This is because the multiplication factor can result in non-integer values, and using `int` might truncate the value (see the short illustration after this list).
- Add a docstring for the `_call_gpt` function to provide more context about its purpose and usage.
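To make the truncation point concrete, here are assumed numbers with `MAX_RETRIES` taken as 3 and a `Retry-After` of 5 seconds:

```python
multiplication_factor = 2 * (1 + 1 / 3)  # current_retry = 1, MAX_RETRIES assumed to be 3

wait_as_float = 5 * multiplication_factor      # 13.333... seconds
wait_as_int = int(5 * multiplication_factor)   # 13 seconds -- the fractional part is truncated
```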
Here's an updated version of the code with these suggestions:
```python
def _retry_with_exponential_backoff(current_retry: int, retry_after: Optional[str]) -> None:
    """
    Use exponential backoff to retry a request after specific time while staying under the retry count

    Args:
        current_retry (int): The current retry count.
        retry_after (Optional[str]): The time to wait before retrying.
    """
    logging.warning("Call to GPT failed due to rate limit, retry attempt %s of %s", current_retry, C.MAX_RETRIES)
    multiplication_factor = 2 * (1 + current_retry / C.MAX_RETRIES)
    wait_time = float(retry_after) * multiplication_factor if retry_after else current_retry * multiplication_factor
    logging.warning("Waiting for %s seconds before retrying.", wait_time)
    time.sleep(wait_time)


def _call_gpt(
    prompt: str,
    temperature: float,
    max_tokens: int,
    top_p: float,
    frequency_penalty: float,
    presence_penalty: float,
    retry: int = 0,
) -> str:
    """
    Call GPT with given parameters and handle rate limit errors with retries.

    Args:
        prompt (str): The input prompt for GPT.
        temperature (float): Sampling temperature.
        max_tokens (int): Maximum number of tokens in response.
        top_p (float): Nucleus sampling parameter.
        frequency_penalty (float): Penalty for frequent tokens.
        presence_penalty (float): Penalty for repetitive tokens.
        retry (int, optional): Current number of retries. Defaults to 0.

    Returns:
        str: The generated text from GPT.

    Raises:
        RateLimitError: If the maximum number of retries is exceeded.
    """
    # Rest of the function implementation
```
Codecov Report
```diff
@@           Coverage Diff            @@
##             main       #86   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           13        14    +1
  Lines          462       469    +7
  Branches        67        67
=========================================
+ Hits           462       469    +7
```
Flags with carried forward coverage won't be shown.
Description
Testing
Additional context