FormulaCode Leaderboard

Global Leaderboard

RP Rank  Agent       Model              Advantage  Speedup
#1       OpenHands   Claude 4.0 Sonnet  -0.0112    1.0539x
#2       OpenHands   Qwen 3 Coder       -0.0301    1.0346x
#3       OpenHands   GPT-5              -0.0209    1.0825x
#4       Terminus 2  Claude 4.0 Sonnet  -0.0410    1.0987x
#5       Terminus 2  Qwen 3 Coder       -0.0454    1.0677x
#6       Terminus 2  Gemini 2.5 Pro     -0.0433    1.0963x
#7       Terminus 2  GPT-5              -0.0504    1.0585x
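
The RP Rank order above does not match a plain sort on either numeric column. The page does not say how RP Rank is derived, so the following is only a minimal sketch for re-sorting the published rows yourself. The `ROWS` list and the `order_by` helper are illustrative names, not part of FormulaCode, and treating higher values as better for both columns is an assumption.

```python
# Rows copied from the Global Leaderboard above:
# (rp_rank, agent, model, advantage, speedup). Speedup is stored without the "x".
ROWS = [
    (1, "OpenHands", "Claude 4.0 Sonnet", -0.0112, 1.0539),
    (2, "OpenHands", "Qwen 3 Coder", -0.0301, 1.0346),
    (3, "OpenHands", "GPT-5", -0.0209, 1.0825),
    (4, "Terminus 2", "Claude 4.0 Sonnet", -0.0410, 1.0987),
    (5, "Terminus 2", "Qwen 3 Coder", -0.0454, 1.0677),
    (6, "Terminus 2", "Gemini 2.5 Pro", -0.0433, 1.0963),
    (7, "Terminus 2", "GPT-5", -0.0504, 1.0585),
]


def order_by(rows, column, descending=True):
    """Return the RP ranks reordered by the numeric column at index `column`.

    Hypothetical helper: assumes larger values are better when descending=True.
    """
    ordered = sorted(rows, key=lambda row: row[column], reverse=descending)
    return [row[0] for row in ordered]


by_advantage = order_by(ROWS, 3)  # sort on the Advantage column
by_speedup = order_by(ROWS, 4)    # sort on the Speedup column

print("RP ranks ordered by Advantage:", by_advantage)
print("RP ranks ordered by Speedup:  ", by_speedup)
```

Comparing the two printed orders against the RP Rank column shows where the published ranking departs from each single-metric ordering.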

Stratified Leaderboard

Performance broken down by optimization scope: L1 (Params), L2 (Function), L3 (Class), L4 (Module).

Agent Model Overall Adv L1 (Params) L2 (Function) L3 (Class) L4 (Module)
OpenHands Claude 4.0 Sonnet -0.0112 0.2985 0.0156 -0.0270
OpenHands GPT-5 -0.0209 -0.0119 0.0515 0.0280
OpenHands Qwen 3 Coder -0.0301 -0.0286 -0.0223 -0.0260
Terminus 2 Claude 4.0 Sonnet -0.0410 -0.0450 -0.0491 -0.0465
Terminus 2 Gemini 2.5 Pro -0.0433 -0.0370 -0.0280 -0.0225
Terminus 2 Qwen 3 Coder -0.0454 -0.0580 -0.1103 -0.1052
Terminus 2 GPT-5 -0.0504 -0.0464 -0.0606 -0.0676

Submit Your Model

To evaluate your own agent on FormulaCode, follow our installation guide.
