Azure Embedding Quota Limit #936
Comments
I have the same issue.
Fair point, we'll have to think about how to make this smoother. A) @danieldekay, is it the same docs that you're running reports on? Have a look at this PR: "Documents, crawled urls, and website will be chunked and loaded to the inputted vector store if vector_store is not None." Meaning, if you run GPTR with the same Langchain vectorstore, it may cut down on repeated embedding work. B) The "cooling off" feature is also a good idea. Did you mention somewhere that there's a Langchain method we can leverage to get the required "cool off" period? Once we have that, we can go about adding the websocket message. Adding an exception-handler block that publishes a websocket message to the frontend would also be a good first step.
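A minimal sketch of the exception-handler idea mentioned above. The `RateLimitError` class, `embed_fn`, and the websocket's `send_json` method are stand-ins here, not the actual GPTR or openai-sdk objects; the point is only the shape: catch the provider's 429, publish a log message to the frontend, then re-raise so the caller can decide what to do.

```python
# Hedged sketch: surface a provider rate-limit error to the frontend
# instead of letting the report crash silently. All names below
# (RateLimitError, embed_fn, send_json) are illustrative stand-ins.
import asyncio


class RateLimitError(Exception):
    """Stand-in for the real openai.RateLimitError."""


async def embed_with_notice(embed_fn, chunks, websocket=None):
    """Run the embedding call; on a rate-limit error, notify the
    frontend over the websocket (if any) and re-raise."""
    try:
        return await embed_fn(chunks)
    except RateLimitError as exc:
        if websocket is not None:
            await websocket.send_json({
                "type": "logs",
                "output": f"Embedding provider rate limit hit: {exc}. "
                          "Consider lowering concurrency or raising the quota.",
            })
        raise
```

The handler deliberately re-raises rather than swallowing the error, so a retry or cool-off layer above it can still react.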
@ElishaKay - it's a standard web research report based on Bing. Langchain has support for a rate limiter; maybe that is also an option.
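For reference, the Langchain rate limiter mentioned above is a token-bucket limiter (`langchain_core.rate_limiters.InMemoryRateLimiter`). A dependency-free sketch of the same idea, so the mechanics are visible; this is an illustration, not the GPTR implementation:

```python
# Minimal token-bucket limiter, similar in spirit to Langchain's
# InMemoryRateLimiter. Each acquire() blocks until a token is available,
# smoothing bursts of embedding requests below the provider's quota.
import time


class TokenBucket:
    def __init__(self, requests_per_second: float, max_bucket_size: float = 1.0):
        self.rate = requests_per_second
        self.capacity = max_bucket_size
        self.tokens = max_bucket_size        # start with a full bucket
        self.last = time.monotonic()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep just long enough for the missing fraction of a token.
            time.sleep((1 - self.tokens) / self.rate)
```

Calling `bucket.acquire()` before each embedding request caps the request rate at roughly `requests_per_second`, which is the "cool off" behavior discussed in this thread.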
Awesome. Adding to the resilience channel on Discord. For anyone reading who hasn't joined the Discord, join here to access the above link.
I solved the issue by changing to azure_openai:text-embedding-3-large. If you think this is the correct solution, I can add a pull request.
Has anyone solved this?
Sure @roninio, green light for the PR. Maybe we should also set a default Azure embedding model in the config? There's a good chance this is also causing problems for the OpenAI API, i.e. we should upgrade the embedding model there too. Sounds like we should edit that file to:

    match os.environ["EMBEDDING_PROVIDER"]:
        case "openai":
            self.embedding_model = "text-embedding-3-large"
        case "azure_openai":
            self.embedding_model = "text-embedding-3-large"
Hi, I created pull request #979.
Describe the bug
I am running a detailed report with Azure OpenAI and am hitting quota limits. Although I have a rate limit of 500k tokens per minute configured, it still throws an error rather than handling the throttling gracefully.
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the Embeddings_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}
The error message must be wrong, of course, since my rate limits are not per day, and waiting 24 hours is not an option.
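Even if the advertised 86400-second wait is bogus, the client can still recover from transient 429s instead of aborting the report. A generic retry-with-exponential-backoff sketch (not the GPTR code; `is_rate_limit` here just string-matches "429" as a stand-in for checking `openai.RateLimitError`):

```python
# Hedged sketch: retry a callable with exponential backoff plus jitter
# when it fails with a rate-limit error. Non-rate-limit errors and the
# final failed attempt are re-raised unchanged.
import random
import time


def with_backoff(call, max_retries=5, base_delay=0.5, jitter=0.5,
                 is_rate_limit=lambda exc: "429" in str(exc)):
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limit(exc) or attempt == max_retries - 1:
                raise
            # Delay doubles each attempt; jitter avoids thundering herds.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, jitter))
```

With per-minute quotas, a few doubling delays are usually enough to get under the limit again; a genuinely exhausted daily quota would still surface as an error after `max_retries` attempts, which is the correct behavior.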
Expected behavior