Leaderboard
Two-factor scoring across 22 dimensions: a Model Score and an Agent Score, combined into a Composite. Graded by an independent 8-model judging panel.
Pareto-optimal: no other agent scores equal or better on all axes
Dominated: another agent scores equal or better on all axes
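
The legend above can be illustrated with a minimal sketch of a Pareto check. It makes several assumptions that the page does not state: the axes are taken to be the Model Score and Agent Score, higher is better on both, and "dominated" uses the common convention that the other agent must also be strictly better on at least one axis (so exact ties do not dominate each other). The names `dominates` and `pareto_front` and the sample scores are hypothetical.

```python
from typing import Dict, List, Tuple

# One score per axis; assumed here to be (model_score, agent_score).
Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is equal or better than b on every axis
    and strictly better on at least one (assumed convention)."""
    at_least_equal = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_equal and strictly_better


def pareto_front(agents: Dict[str, Scores]) -> List[str]:
    """Names of agents that no other agent dominates, in input order."""
    return [
        name
        for name, scores in agents.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in agents.items()
            if other_name != name
        )
    ]


# Hypothetical scores for illustration only, not real leaderboard data.
example = {
    "agent_a": (90.0, 60.0),
    "agent_b": (70.0, 80.0),
    "agent_c": (65.0, 75.0),
}

print(pareto_front(example))  # ['agent_a', 'agent_b']
```

In this example `agent_c` is dominated because `agent_b` is at least as good on both axes, while `agent_a` and `agent_b` each win on one axis and so are both Pareto-optimal.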