Open-source agent tops Gemini benchmark on cheap model

Hacker News·3w·GodelNumbering

GodelNumbering built Dirac, an open-source agent that achieved top performance on TerminalBench using Google's Gemini 3.5 Flash Preview, a low-cost model. The result suggests that capable autonomous agents don't require expensive frontier models, which could lower the barrier to entry for indie builders experimenting with AI tooling.