One other thought: I'd support adjusting the "Rank" column value to reflect the 95% confidence interval the way LMSYS does, while keeping the displayed order based on the Elo score. With that change, the top 8 models would all be "#1" at the moment.
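A minimal sketch of the idea, assuming each model already has an Elo score plus the lower and upper bounds of its 95% confidence interval. The rule used here (rank = 1 + the number of models whose lower bound is above this model's upper bound) is my understanding of the LMSYS approach, and `ModelScore`, `rank_with_ties`, and the sample numbers are all hypothetical, not taken from this repo:

```python
from typing import NamedTuple


class ModelScore(NamedTuple):
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    elo: float
    ci_lower: float  # lower bound of the 95% confidence interval
    ci_upper: float  # upper bound of the 95% confidence interval


def rank_with_ties(scores: list[ModelScore]) -> list[tuple[str, int]]:
    """Return (name, rank) pairs ordered by Elo, highest first.

    A model's rank is 1 plus the number of models that are clearly
    better, i.e. whose CI lower bound exceeds this model's CI upper
    bound. Models with overlapping intervals can share a rank.
    """
    ordered = sorted(scores, key=lambda s: s.elo, reverse=True)
    ranked = []
    for s in ordered:
        clearly_better = sum(1 for o in scores if o.ci_lower > s.ci_upper)
        ranked.append((s.name, 1 + clearly_better))
    return ranked


# Made-up numbers, only to show the tie behaviour.
example = [
    ModelScore("model-a", 1250, 1230, 1270),
    ModelScore("model-b", 1240, 1222, 1258),
    ModelScore("model-c", 1200, 1185, 1215),
]

for name, rank in rank_with_ties(example):
    print(f"#{rank} {name}")
```

Here `model-a` and `model-b` overlap, so both show as "#1", while `model-c` is below both intervals and shows as "#3". The row order still follows the Elo score, so only the "Rank" label changes.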
As suggested in https://x.com/n0riskn0r3ward/status/1818656893033693647, we could show ties within the Elo numbers.