
Commit 7a8e667

🤖 fix: enable xhigh reasoning for gpt-5.2 (#1117)

Enable `xhigh` thinking for `openai:gpt-5.2` by updating the per-model thinking policy.

- Root cause: `gpt-5.2` was falling back to the default policy (`off/low/medium/high`), so any `xhigh` selection got clamped before building OpenAI provider options.
- Fix: allow `xhigh` in `getThinkingPolicyForModel()` for `gpt-5.2` (including version-suffixed and mux-gateway forms) and add tests.

Validation:

- `bun test src/browser/utils/thinking/policy.test.ts`
- `make typecheck`
- `make static-check`

---

<details>
<summary>📋 Implementation Plan</summary>

# Enable xhigh reasoning for `openai:gpt-5.2`

## Context / Problem

The newly released `openai:gpt-5.2` model supports OpenAI's `reasoningEffort: "xhigh"`, but mux currently **cannot actually request xhigh** for this model.

### Root cause (code-level)

Mux clamps the requested "thinking level" to a per-model capability subset via:

- `src/browser/utils/thinking/policy.ts` → `getThinkingPolicyForModel()` / `enforceThinkingPolicy()`
- This policy is used both:
  - in the **UI** (Thinking slider/options), and
  - in the **backend request builder** (via `buildProviderOptions()`, which calls `enforceThinkingPolicy()`).

Right now `gpt-5.2` is not special-cased, so it falls into the **default policy**:

- Default: `["off", "low", "medium", "high"]`
- Result: any attempt to set `xhigh` gets clamped (typically to `"medium"`), so OpenAI never receives `reasoningEffort: "xhigh"`.

OpenAI request construction is already correct once `xhigh` is allowed:

- `src/common/utils/ai/providerOptions.ts` maps `ThinkingLevel` → OpenAI `reasoningEffort`.
- `src/common/types/thinking.ts` includes `xhigh: "xhigh"` in `OPENAI_REASONING_EFFORT`.

## Recommended approach (minimal change) — **Update thinking policy for `gpt-5.2`**

**Net LoC estimate (product code only): ~+10–25 LoC** (policy + comments; tests separate)

### What to change

1. **Allow `xhigh` for `gpt-5.2`** in `getThinkingPolicyForModel()`:
   - File: `src/browser/utils/thinking/policy.ts`
   - Add a special case similar to `gpt-5.2-pro` and `gpt-5.1-codex-max`.

   Suggested policy:
   - `openai:gpt-5.2` → `["off", "low", "medium", "high", "xhigh"]`

   Notes:
   - Keep the `gpt-5.2-pro` branch *above* the new `gpt-5.2` branch.
   - Use the same "version suffix tolerant" regex style already used elsewhere.
   - Example: `^gpt-5\.2(?!-[a-z])`, so it matches `gpt-5.2` and `gpt-5.2-2025-12-11` but not `gpt-5.2-pro`.

2. **Update comments to match reality**
   - File: `src/browser/utils/thinking/policy.ts`
     - Update the "default policy" comment that currently implies `xhigh` is only for codex-max.
   - File: `src/common/types/thinking.ts`
     - Update the comment on `OPENAI_REASONING_EFFORT.xhigh` (currently says only `gpt-5.1-codex-max`).
   - Optional (nice-to-have): `src/common/utils/tokens/models-extra.ts`
     - The `gpt-5.2` comment block doesn't mention xhigh. Add a short note for consistency.

### Why this works

Once the policy allows `xhigh`, the normal request path already does the right thing (see the sketch after this list):

- UI can select/store `xhigh`.
- Backend uses `buildProviderOptions(modelString, thinkingLevel, ...)`.
- `buildProviderOptions()` will:
  - preserve `xhigh` (no clamping),
  - set `openai.reasoningEffort = "xhigh"`, and
  - include `reasoning.encrypted_content` so tool use works correctly for reasoning models.
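To make the clamp-then-map flow concrete, here is a minimal, self-contained TypeScript sketch. The function names and fallback rules below are illustrative stand-ins for `getThinkingPolicyForModel()`, `enforceThinkingPolicy()`, and the `buildProviderOptions()` mapping step, not the actual implementations in `policy.ts` / `providerOptions.ts`:

```ts
// Illustrative stand-ins for policy.ts / providerOptions.ts; real signatures may differ.
type ThinkingLevel = "off" | "low" | "medium" | "high" | "xhigh";
type ThinkingPolicy = readonly ThinkingLevel[];

// Per-model policy lookup (stand-in for getThinkingPolicyForModel()).
function policyFor(modelString: string): ThinkingPolicy {
  // Strip the provider namespace, e.g. "openai:" or "mux-gateway:openai/".
  const bare = modelString.replace(/^.*[:\/]/, "");
  if (/^gpt-5\.2(?!-[a-z])/.test(bare)) {
    return ["off", "low", "medium", "high", "xhigh"]; // new gpt-5.2 branch
  }
  return ["off", "low", "medium", "high"]; // default policy: no xhigh
}

// Clamp the requested level to the allowed set (stand-in for enforceThinkingPolicy()).
function enforce(modelString: string, requested: ThinkingLevel): ThinkingLevel {
  return policyFor(modelString).includes(requested) ? requested : "medium";
}

// Map the enforced level to OpenAI's reasoningEffort (stand-in for the buildProviderOptions() step).
function reasoningEffortFor(modelString: string, requested: ThinkingLevel): string | undefined {
  const level = enforce(modelString, requested);
  return level === "off" ? undefined : level;
}

console.log(reasoningEffortFor("openai:gpt-5.2", "xhigh"));             // "xhigh" (after this change)
console.log(reasoningEffortFor("mux-gateway:openai/gpt-5.2", "xhigh")); // "xhigh"
console.log(reasoningEffortFor("openai:gpt-5.1", "xhigh"));             // "medium" (default policy still clamps)
```

The real `buildProviderOptions()` also attaches `reasoning.encrypted_content` for reasoning models, which this sketch omits.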
## Tests / Validation

1. Update/add unit tests for the policy:
   - File: `src/browser/utils/thinking/policy.test.ts`
   - Add cases:
     - `getThinkingPolicyForModel("openai:gpt-5.2")` returns 5 levels including `xhigh`.
     - `getThinkingPolicyForModel("mux-gateway:openai/gpt-5.2")` returns same.
     - `getThinkingPolicyForModel("openai:gpt-5.2-2025-12-11")` returns same.
     - `enforceThinkingPolicy("openai:gpt-5.2", "xhigh") === "xhigh"`.
2. Run targeted tests:
   - `bun test src/browser/utils/thinking/policy.test.ts`
3. Run repo-wide correctness gates (expected in CI):
   - `make typecheck`
   - `make lint` (or `make lint-fix` if needed)

## Rollout notes / UX impact

- The Thinking slider will show an extra step for `gpt-5.2` once selected.
- Command palette already offers `xhigh`; after this change, choosing `xhigh` on `gpt-5.2` will no longer silently clamp back to `medium`.
- No changes required in provider config (`knownModels.ts` etc.).

## Alternative approach (more scalable, higher scope)

**Drive thinking policy from model metadata** (e.g., a single authoritative model capabilities table that includes supported thinking levels).

**Net LoC estimate (product code only): ~+80–200 LoC**

This would reduce future "forgot to special-case model X" issues, but requires designing a shared model-capabilities schema and updating multiple call sites (policy derivation, UI, provider options clamping).

## Execution checklist (when switching to Exec mode)

- [ ] Edit `src/browser/utils/thinking/policy.ts` to add `gpt-5.2` xhigh support.
- [ ] Update relevant comments (policy + thinking mapping).
- [ ] Update `src/browser/utils/thinking/policy.test.ts` with new assertions.
- [ ] Run `bun test ...policy.test.ts`.
- [ ] Run `make typecheck`.

</details>

---

_Generated with `mux`_

Signed-off-by: Thomas Kosiewski <tk@coder.com>
1 parent 3594896 commit 7a8e667

File tree

4 files changed, +54 -3 lines changed


src/browser/utils/thinking/policy.test.ts

Lines changed: 44 additions & 0 deletions
@@ -64,6 +64,36 @@ describe("getThinkingPolicyForModel", () => {
     ]);
   });
 
+  test("returns 5 levels including xhigh for gpt-5.2", () => {
+    expect(getThinkingPolicyForModel("openai:gpt-5.2")).toEqual([
+      "off",
+      "low",
+      "medium",
+      "high",
+      "xhigh",
+    ]);
+  });
+
+  test("returns 5 levels including xhigh for gpt-5.2 behind mux-gateway", () => {
+    expect(getThinkingPolicyForModel("mux-gateway:openai/gpt-5.2")).toEqual([
+      "off",
+      "low",
+      "medium",
+      "high",
+      "xhigh",
+    ]);
+  });
+
+  test("returns 5 levels including xhigh for gpt-5.2 with version suffix", () => {
+    expect(getThinkingPolicyForModel("openai:gpt-5.2-2025-12-11")).toEqual([
+      "off",
+      "low",
+      "medium",
+      "high",
+      "xhigh",
+    ]);
+  });
+
   test("returns 5 levels including xhigh for gpt-5.1-codex-max behind mux-gateway", () => {
     expect(getThinkingPolicyForModel("mux-gateway:openai/gpt-5.1-codex-max")).toEqual([
       "off",
@@ -205,6 +235,20 @@ describe("enforceThinkingPolicy", () => {
     });
   });
 
+  describe("GPT-5.2 (5 levels including xhigh)", () => {
+    test("allows xhigh for base model", () => {
+      expect(enforceThinkingPolicy("openai:gpt-5.2", "xhigh")).toBe("xhigh");
+    });
+
+    test("allows xhigh behind mux-gateway", () => {
+      expect(enforceThinkingPolicy("mux-gateway:openai/gpt-5.2", "xhigh")).toBe("xhigh");
+    });
+
+    test("allows xhigh for versioned model", () => {
+      expect(enforceThinkingPolicy("openai:gpt-5.2-2025-12-11", "xhigh")).toBe("xhigh");
+    });
+  });
+
   describe("xhigh fallback for non-codex-max models", () => {
     test("falls back to medium when xhigh requested on standard model", () => {
       // Standard models don't support xhigh, so fall back to medium (preferred fallback)

src/browser/utils/thinking/policy.ts

Lines changed: 8 additions & 2 deletions
@@ -25,10 +25,11 @@ export type ThinkingPolicy = readonly ThinkingLevel[];
  *
  * Rules:
  * - openai:gpt-5.1-codex-max → ["off", "low", "medium", "high", "xhigh"] (5 levels including xhigh)
+ * - openai:gpt-5.2 → ["off", "low", "medium", "high", "xhigh"] (5 levels including xhigh)
  * - openai:gpt-5.2-pro → ["medium", "high", "xhigh"] (3 levels)
  * - openai:gpt-5-pro → ["high"] (only supported level, legacy)
  * - gemini-3 → ["low", "high"] (thinking level only)
- * - default → ["off", "low", "medium", "high"] (standard 4 levels)
+ * - default → ["off", "low", "medium", "high"] (standard 4 levels; xhigh is opt-in per model)
  *
  * Tolerates version suffixes (e.g., gpt-5-pro-2025-10-06).
  * Does NOT match gpt-5-pro-mini (uses negative lookahead).
@@ -55,6 +56,11 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
     return ["medium", "high", "xhigh"];
   }
 
+  // gpt-5.2 supports 5 reasoning levels including xhigh (Extra High)
+  if (/^gpt-5\.2(?!-[a-z])/.test(withoutProviderNamespace)) {
+    return ["off", "low", "medium", "high", "xhigh"];
+  }
+
   // gpt-5-pro (legacy) only supports high
   if (/^gpt-5-pro(?!-[a-z])/.test(withoutProviderNamespace)) {
     return ["high"];
@@ -65,7 +71,7 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
     return ["low", "high"];
   }
 
-  // Default policy: standard 4 levels (xhigh only for codex-max)
+  // Default policy: standard 4 levels (off/low/medium/high). Models with xhigh must opt in above.
   return ["off", "low", "medium", "high"];
 }
 

src/common/types/thinking.ts

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ export const OPENAI_REASONING_EFFORT: Record<ThinkingLevel, string | undefined>
   low: "low",
   medium: "medium",
   high: "high",
-  xhigh: "xhigh", // Extra High - only supported by gpt-5.1-codex-max
+  xhigh: "xhigh", // Extra High - supported by models that expose xhigh (e.g., gpt-5.1-codex-max, gpt-5.2)
 };
 
 /**

src/common/utils/tokens/models-extra.ts

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ export const modelsExtra: Record<string, ModelData> = {
   // GPT-5.2 - Released December 11, 2025
   // $1.75/M input, $14/M output
   // Cached input: $0.175/M
+  // Supports off, low, medium, high, xhigh reasoning levels
   "gpt-5.2": {
     max_input_tokens: 400000,
     max_output_tokens: 128000,

0 commit comments
