I struggle to use the model meta-llama/Meta-Llama-3.1-70B-Instruct: I get a suspect error, a gateway response timeout coming from nginx rather than from the albert-api itself.
Here is the error I got:
Error: 504 Server Error: Gateway Time-out for url: https://albert.api.etalab.gouv.fr/v1/chat/completions, retrying in 5 seconds...
Albert API error: <html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.27.0</center>
</body>
</html>
Could this error be caused by:
a GPU OOM?
an HTTP timeout?
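For reference, here is a minimal reproduction sketch. The endpoint URL, model name, and the 5-second retry come from the error above; the API key and payload are placeholders. Setting an explicit client-side timeout at least distinguishes a client-side timeout from the generic nginx 504 page:

```python
import time
import requests

# Endpoint and model are taken from the report above; the key and payload are placeholders.
API_URL = "https://albert.api.etalab.gouv.fr/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Bonjour"}],
}
headers = {"Authorization": f"Bearer {API_KEY}"}

for attempt in range(3):
    try:
        # Explicit client-side timeout instead of waiting for nginx to answer 504.
        resp = requests.post(API_URL, json=payload, headers=headers, timeout=120)
        resp.raise_for_status()
        print(resp.json())
        break
    except requests.exceptions.HTTPError as err:
        print(f"Albert API error: {err.response.text}")
        time.sleep(5)  # matches the "retrying in 5 seconds" behaviour above
    except requests.exceptions.Timeout:
        print("Request timed out on the client side")
        time.sleep(5)
```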
I have the intuition that there is no timeout error catching in the API, which leads to this error. If some GPU/model is stuck and does not respond, the API just waits, and the GPU memory might stay loaded unnecessarily? (At least after a few minutes, it is unlikely that we will still get a response from the model...)
Maybe adding a TTL of a few minutes (5 minutes?) in the API, returning a timeout error and freeing the GPU memory used, could help scale the infra while improving the user experience? A rough sketch of the idea is below.
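I don't know the albert-api internals, so this is only a sketch of what I mean, assuming a FastAPI-style endpoint (the route and forward_to_model are hypothetical names): bound the wait on the model backend and return an explicit timeout error instead of letting nginx produce its generic 504 page.

```python
import asyncio
from fastapi import FastAPI, HTTPException

app = FastAPI()

MODEL_TTL_SECONDS = 300  # the ~5-minute TTL suggested above


async def forward_to_model(payload: dict) -> dict:
    """Placeholder for the actual call to the model backend (hypothetical)."""
    ...


@app.post("/v1/chat/completions")
async def chat_completions(payload: dict):
    try:
        # Bound how long we wait for the backend; if it is stuck, fail fast
        # with an explicit timeout instead of hanging until nginx answers 504.
        return await asyncio.wait_for(
            forward_to_model(payload), timeout=MODEL_TTL_SECONDS
        )
    except asyncio.TimeoutError:
        # This would also be the place to trigger whatever cleanup frees the
        # stuck model / GPU resources before answering the client.
        raise HTTPException(status_code=504, detail="Model backend timed out")
```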