Full Table
Detailed per-run data
| Rank | Model | Solved | Failed | Guesses | Cost | Time | Completed |
|---|---|---|---|---|---|---|---|
| #1 | Gemini 3 Flash (dynamic) | 74/100 | 0 | 736 | $0.28 | 663.8s | 08/03/2026, 04:39:27 |
| #2 | GPT-5 Mini (medium) | 69/100 | 3 | 890 | $0.92 | 4733.6s | 08/03/2026, 05:05:41 |
| #3 | Gemini 3.1 Flash Lite (medium) | 62/100 | 0 | 978 | $0.32 | 787.4s | 08/03/2026, 04:35:46 |
| #4 | Gemini 3.1 Flash Lite (minimal) | 46/100 | 0 | 1294 | $0.10 | 393.7s | 08/03/2026, 04:28:33 |
| #5 | Claude Haiku 4.5 | 33/100 | 0 | 1522 | $0.43 | 591.2s | 08/03/2026, 05:13:46 |