SketchBench

Full Table

Detailed per-run data

| Rank | Model | Solved | Failed | Guesses | Cost | Time | Completed |
| --- | --- | --- | --- | --- | --- | --- | --- |
| #1 | GPT-5.4 Mini (high) | 77/100 | 0 | 695 | $2.74 | 5527.3s | 18/03/2026, 15:23:50 |
| #2 | Gemini 3 Flash (dynamic) | 75/100 | 0 | 716 | $0.28 | 665.5s | 12/03/2026, 03:08:38 |
| #3 | GPT-5 Mini (medium) | 71/100 | 3 | 855 | $0.92 | 4739.9s | 12/03/2026, 03:09:20 |
| #4 | Gemini 3.1 Flash Lite (medium) | 62/100 | 0 | 978 | $0.32 | 788.9s | 12/03/2026, 03:09:27 |
| #5 | GPT-5.4 Mini (medium) | 60/100 | 0 | 1019 | $0.84 | 1649.0s | 17/03/2026, 17:51:34 |
| #6 | Gemini 3.1 Flash Lite (minimal) | 46/100 | 0 | 1294 | $0.10 | 394.0s | 12/03/2026, 03:09:31 |
| #7 | GPT-5.4 Nano (medium) | 44/100 | 0 | 1326 | $0.25 | 1797.2s | 17/03/2026, 17:53:17 |
| #8 | Claude Haiku 4.5 | 33/100 | 0 | 1522 | $0.43 | 597.4s | 12/03/2026, 03:09:38 |