Commit 9f4c41e
🤖 ci: update terminal-bench to Opus 4.5 and GPT 5.2 (#1156)
Update nightly benchmark models:
- `anthropic:claude-sonnet-4-5` → `anthropic:claude-opus-4-5`
- `openai:gpt-5.1-codex` → `openai:gpt-5.2`
### Recent Trends (last 5 days)
| Date | Claude Sonnet 4.5 | GPT-5.1-codex |
|------|-------------------|---------------|
| Dec 14 | **42.5%** | **31.25%** |
| Dec 13 | 37.5% | 30.0% |
| Dec 12 | 36.25% | 28.75% |
| Dec 11 | 36.25% | 28.75% |
| Dec 10 | 35.0% | 26.25% |
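
To read the trend at a glance, the net change per model can be computed from the table above. This is only a small sketch: the variable and function names (`rates`, `net_change`, `changes`) are illustrative and not part of the repository, and the values are copied by hand from the table.

```python
# Pass rates (%) copied from the table above, ordered oldest to newest
# (Dec 10 through Dec 14). Names here are illustrative, not from the repo.
rates = {
    "Claude Sonnet 4.5": [35.0, 36.25, 36.25, 37.5, 42.5],
    "GPT-5.1-codex": [26.25, 28.75, 28.75, 30.0, 31.25],
}


def net_change(series):
    """Return newest minus oldest value, in percentage points."""
    return series[-1] - series[0]


changes = {model: net_change(series) for model, series in rates.items()}

for model, delta in changes.items():
    print(f"{model}: {delta:+.2f} percentage points over the last 5 days")
```

Both outgoing models were trending upward over the window, so these figures serve as the baseline the new models would be compared against.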
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_
1 parent c3e09d5 commit 9f4c41e
2 files changed in `.github/workflows` (+3 -3 lines changed)
Diff content not preserved. Changed lines: line 26 in the first file; lines 7 and 64 in the second file.