Sketch Bench
Last Sync: 08/03/2026, 05:41:37
Accuracy Distribution
Success rate based on wordbank.
Rank  Model                            Accuracy
01    Gemini 3 Flash (dynamic)         74%
02    GPT-5 Mini (medium)              69%
03    Gemini 3.1 Flash Lite (medium)   62%
04    Gemini 3.1 Flash Lite (minimal)  46%
05    Claude Haiku 4.5                 33%