Commit 9f4c41e

🤖 ci: update terminal-bench to Opus 4.5 and GPT 5.2 (#1156)
Update nightly benchmark models:

- `anthropic:claude-sonnet-4-5` → `anthropic:claude-opus-4-5`
- `openai:gpt-5.1-codex` → `openai:gpt-5.2`

### Recent Trends (last 5 days)

| Date | Claude Sonnet 4.5 | GPT-5.1-codex |
|------|-------------------|---------------|
| Dec 14 | **42.5%** | **31.25%** |
| Dec 13 | 37.5% | 30.0% |
| Dec 12 | 36.25% | 28.75% |
| Dec 11 | 36.25% | 28.75% |
| Dec 10 | 35.0% | 26.25% |

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_
1 parent c3e09d5 commit 9f4c41e

2 files changed: +3 -3 lines changed

.github/workflows/nightly-terminal-bench.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ jobs:
         id: set-models
         run: |
           if [ "${{ inputs.models }}" = "all" ] || [ -z "${{ inputs.models }}" ]; then
-            echo 'models=["anthropic:claude-sonnet-4-5","openai:gpt-5.1-codex"]' >> $GITHUB_OUTPUT
+            echo 'models=["anthropic:claude-opus-4-5","openai:gpt-5.2"]' >> $GITHUB_OUTPUT
           else
             # Convert comma-separated to JSON array
             models="${{ inputs.models }}"
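
The hunk ends just as the `else` branch begins, so the conversion itself is not visible in this diff. As a rough illustration only, a comma-separated model list could be turned into a JSON array along these lines. The function name `to_json_array` is hypothetical and not taken from the workflow, and the sketch assumes model names contain no quotes or backslashes.

```shell
# Hypothetical helper, NOT the workflow's actual code (that part is outside the hunk).
# Assumes items contain no double quotes or backslashes, so no JSON escaping is done.
to_json_array() {
  out=""
  old_ifs=$IFS
  IFS=','
  set -f                      # disable globbing while splitting on commas
  for item in $1; do
    # Trim leading and trailing whitespace around each item.
    item=$(printf '%s' "$item" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    [ -n "$item" ] || continue
    out="${out:+$out,}\"$item\""
  done
  set +f
  IFS=$old_ifs
  printf '[%s]\n' "$out"
}
```

In a workflow step, the result could then be written in the same way as the default branch above, for example `echo "models=$(to_json_array "$models")" >> "$GITHUB_OUTPUT"`.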

.github/workflows/terminal-bench.yml

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ on:
   workflow_call:
     inputs:
       model_name:
-        description: "Model to use (e.g., anthropic:claude-sonnet-4-5)"
+        description: "Model to use (e.g., anthropic:claude-opus-4-5)"
         required: false
         type: string
       thinking_level:
@@ -61,7 +61,7 @@ on:
         required: false
         type: string
       model_name:
-        description: "Model to use (e.g., anthropic:claude-sonnet-4-5, openai:gpt-5.1-codex)"
+        description: "Model to use (e.g., anthropic:claude-opus-4-5, openai:gpt-5.2)"
         required: false
         type: string
       thinking_level:
