Do more to prevent ties #36

KennethEnevoldsen · 2024-08-12T08:51:55Z

When rating in the benchmark, we get a lot of ties among the top models. A solution is probably to remove autogenerated examples on which the models agree. This would give us a richer annotation scheme.

isaac-chung · 2024-08-13T08:03:22Z

By "autogenerated examples" do you mean e.g. the "examples" on the "Arena (battle)" screen?

KennethEnevoldsen · 2024-08-13T12:00:08Z

We can keep the pre-defined examples, but I was mainly thinking about the ones obtained from the "random sample" button.
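As a rough illustration of the idea, here is a minimal sketch of filtering randomly sampled examples down to the ones where the models disagree. The thread does not say how "agree" should be measured, so this assumes two models agree when they return the same top-k results in the same order. All names here (`Retriever`, `models_agree`, `filter_informative`, `model_a`, `model_b`) are hypothetical and not taken from the project's code.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a model maps a query to a ranked list of result ids.
Retriever = Callable[[str], Sequence[str]]


def models_agree(query: str, retrievers: Sequence[Retriever], top_k: int = 1) -> bool:
    """Return True if every model returns the same top_k ids in the same order."""
    rankings = [tuple(r(query)[:top_k]) for r in retrievers]
    return all(ranking == rankings[0] for ranking in rankings)


def filter_informative(
    queries: Sequence[str], retrievers: Sequence[Retriever], top_k: int = 1
) -> List[str]:
    """Keep only the sampled queries on which the models disagree."""
    return [q for q in queries if not models_agree(q, retrievers, top_k)]


# Toy stand-ins for two models (assumption: not real arena models).
def model_a(query: str) -> Sequence[str]:
    return {"q1": ["d1", "d2"], "q2": ["d3", "d4"]}[query]


def model_b(query: str) -> Sequence[str]:
    return {"q1": ["d1", "d5"], "q2": ["d4", "d3"]}[query]


kept = filter_informative(["q1", "q2"], [model_a, model_b], top_k=1)
print(kept)  # ['q2']
```

With `top_k=1`, `q1` is dropped because both toy models rank `d1` first, while `q2` is kept because their top results differ. A stricter or looser agreement criterion (for example, overlap of the top-k sets) would change which examples survive.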

KennethEnevoldsen mentioned this issue Aug 12, 2024

Check "both are bad" / "tie" votes & consider displaying top-K #31

Open
