engine: complete Expr3 pipeline migration, remove lower_late #133

bpowers · 2025-12-28T14:29:50Z

This commit completes the cleanup tasks for array pass 1:

Unified lowering pipeline: lower() now goes through pass 0 -> Expr3
-> pass 1 -> lower_from_expr3, returning Vec to support temp
array assignments. All call sites updated.
Removed Expr3 -> Expr2 conversion: lower_from_expr3 now handles all
Expr3 variants directly (Var, Op1, Op2, If, App, Subscript) without
converting back to Expr2.
Deleted ~800 lines of legacy code: lower_late, convert_expr3_to_expr2,
convert_index_expr3_to_expr2, convert_builtin_expr3_to_expr2, old
SubscriptConfig/normalize_subscripts, and get_dimension_name_subscript.
Fixed ArrayBounds for Range subscripts: Added compute_range_size and
expr_to_index helpers to calculate the actual range size (e.g., 3:5
gives size 3, not the full dimension size 5). This was causing
incorrect temp array sizing and index out of bounds errors.
Cleaned up dead_code annotations: Removed annotations from Expr3
types and methods that are now used. Kept annotations with comments
for items needed in pass 2 (StaticSubscript, TempArrayElement,
references_a2a_dimension).

The net result is ~570 fewer lines of code while maintaining full test
coverage (383 tests pass). The codebase is now simpler and more
maintainable for implementing pass 2 and other array features.

This commit completes the cleanup tasks for array pass 1: 1. Unified lowering pipeline: lower() now goes through pass 0 -> Expr3 -> pass 1 -> lower_from_expr3, returning Vec<Expr> to support temp array assignments. All call sites updated. 2. Removed Expr3 -> Expr2 conversion: lower_from_expr3 now handles all Expr3 variants directly (Var, Op1, Op2, If, App, Subscript) without converting back to Expr2. 3. Deleted ~800 lines of legacy code: lower_late, convert_expr3_to_expr2, convert_index_expr3_to_expr2, convert_builtin_expr3_to_expr2, old SubscriptConfig/normalize_subscripts, and get_dimension_name_subscript. 4. Fixed ArrayBounds for Range subscripts: Added compute_range_size and expr_to_index helpers to calculate the actual range size (e.g., 3:5 gives size 3, not the full dimension size 5). This was causing incorrect temp array sizing and index out of bounds errors. 5. Cleaned up dead_code annotations: Removed annotations from Expr3 types and methods that are now used. Kept annotations with comments for items needed in pass 2 (StaticSubscript, TempArrayElement, references_a2a_dimension). The net result is ~570 fewer lines of code while maintaining full test coverage (383 tests pass). The codebase is now simpler and more maintainable for implementing pass 2 and other array features.

bpowers · 2025-12-28T14:30:14Z

@codex review

claude · 2025-12-28T14:32:46Z

Code Review: PR #133 - Complete Expr3 Pipeline Migration

Summary

This PR successfully completes the migration to the Expr3 pipeline by removing ~570 lines of legacy code while maintaining full test coverage (383 tests pass). The changes unify the lowering pipeline and fix a critical bug in array bounds calculation for range subscripts.

✅ Strengths

1. Excellent Code Cleanup

Removed ~800 lines of dead code including lower_late, convert_expr3_to_expr2, and old normalize_subscripts
Net reduction of ~570 lines while maintaining functionality
Cleaned up #[allow(dead_code)] annotations systematically with clear comments for pass-2 items

2. Unified Pipeline Architecture
The new lower() function flow is clean and well-documented:

Pass 0 → Expr3 → Pass 1 → lower_from_expr3 → Vec<Expr>

Returning Vec<Expr> to support temp array assignments is a good design choice.

3. Critical Bug Fix
The fix for range subscript bounds calculation (e.g., 3:5 now correctly gives size 3 instead of 5) prevents index out-of-bounds errors:

fn compute_range_size(start: &Expr2, end: &Expr2, dim: &Dimension) -> Option<usize> {
    let start_idx = Self::expr_to_index(start, dim)?;
    let end_idx = Self::expr_to_index(end, dim)?;
    // Range is inclusive on both ends, so size is end - start + 1
    if end_idx >= start_idx {
        Some(end_idx - start_idx + 1)
    } else {
        None // Invalid range (end before start)
    }
}

4. Direct Variant Handling
The new lower_from_expr3 handles all Expr3 variants directly without converting back to Expr2, which is much cleaner and eliminates unnecessary conversions.

🔍 Issues and Concerns

1. Integer Overflow Risk ⚠️
In expr2.rs:336-337:

let idx = *val as isize - 1;
if idx >= 0 { Some(idx as usize) } else { None }

Problem: Casting f64 to isize can overflow for large values. For example, 1e100 as isize produces undefined behavior.

Recommendation: Add bounds checking:

if *val >= 1.0 && *val <= isize::MAX as f64 {
    Some((*val as isize - 1) as usize)
} else {
    None
}

2. Potential Panic Points ⚠️
Several .unwrap() calls that could panic:

compiler.rs:1291: subscript.as_str().parse::<f64>().unwrap()
compiler.rs:1300: .position(...).unwrap()
compiler.rs:1375-1376: .find(...).unwrap().len()

Recommendation: According to CLAUDE.md: "Strongly prefer .unwrap() over .unwrap_or_default()" during development to catch bugs. This is acceptable IF these are guaranteed by invariants. However, add debug assertions or comments explaining why these cannot fail.

3. Code Duplication
The dimension name handling logic in lower_from_expr3 (lines 1274-1318) duplicates similar logic that likely exists elsewhere. The dimension-to-index conversion at lines 1288-1304 is particularly verbose.

Recommendation: Consider extracting this to a helper method like resolve_dimension_to_index().

4. Missing Edge Case Handling
In compute_range_size, when end_idx < start_idx, the function returns None but the caller at line 476 uses .unwrap_or(dims[i].len()) which silently falls back to the full dimension.

Recommendation: This could hide bugs. Consider returning an error for invalid ranges instead of silently accepting them.

5. Comment Clarity
Line 1289: "Subscript is already a 1-based index as a string" - The relationship between the string representation and numeric value deserves more explanation.

🎯 Performance Considerations

Positive:

Eliminating Expr2 ↔ Expr3 conversions reduces allocation overhead
Unified pipeline should improve cache locality

Concern:

The new lower() returns Vec<Expr> which allocates. For expressions without temps, this means allocation overhead for a single-element vector. Consider using SmallVec<[Expr; 1]> if profiling shows this is hot.

🔒 Security Considerations

No security concerns identified. Array bounds are properly validated throughout.

🧪 Test Coverage

The PR mentions 383 tests pass, which is excellent. However:

Recommendation: Add specific regression tests for the range bounds bug fix (e.g., test that arr[3:5] produces a 3-element array, not 5).

📝 Code Quality

Positive:

Follows repository conventions (prefer .unwrap(), use unreachable!(), etc.)
Good use of inline comments explaining complex logic
Proper error handling with descriptive error messages

Nitpicks:

Line 1238: #[allow(dead_code)] on Pass1Result struct - add comment explaining why it's kept
Consider adding debug assertions for invariants that "cannot fail"

✨ Final Verdict

Strong Approve - This is high-quality refactoring work that:

Significantly simplifies the codebase
Fixes a real bug
Maintains full test coverage
Sets up a cleaner foundation for pass 2

The concerns raised are mostly about defensive programming and code clarity rather than correctness issues. The unwrap() calls are acceptable given the project's development philosophy, though adding comments explaining invariants would help future maintainers.

Action Items (Optional Improvements)

Add bounds checking to f64 → isize cast to prevent UB
Add regression test for range bounds fix
Extract dimension resolution logic to reduce duplication
Consider adding comments explaining why certain unwrap() calls are safe
Consider returning errors for invalid ranges instead of silent fallback

Great work on this cleanup! The codebase is measurably better after this PR.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-12-28T14:34:35Z

src/simlin-engine/src/compiler.rs

+        Expr::AssignTemp(id, inner, view) => {
+            let size = view.dims.iter().product::<usize>();
+            temp_sizes_map.insert(*id, size);
+            extract_temp_sizes(inner, temp_sizes_map);


Preserve max temp size per reused temp id

Because Pass1Context is created per lower() call, temp IDs restart at 0 for every expression, so the same temp ID is reused across many runlist entries. This insert overwrites any prior size for that ID instead of keeping the maximum, meaning a later, smaller temp view can shrink temp_sizes. When an earlier (larger) expression runs, AssignTemp will write past the allocated temp buffer and panic or corrupt results. Consider tracking max per temp ID rather than overwriting.

Useful? React with 👍 / 👎.

codecov · 2025-12-28T14:35:07Z

Codecov Report

❌ Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.90%. Comparing base (4a4330f) to head (972899a).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/simlin-engine/src/ast/expr2.rs	80.95%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
+ Coverage   68.89%   72.90%   +4.00%     
==========================================
  Files          41       41              
  Lines       12352    12072     -280     
==========================================
+ Hits         8510     8801     +291     
+ Misses       3842     3271     -571

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2025-12-28T14:36:44Z

❌ Visual regression tests failed. View artifacts to see the differences.

1. Fix temp size tracking: Preserve max size per reused temp ID. Since Pass1Context resets temp IDs to 0 for each lower() call, the same ID can be reused across expressions. Use entry().and_modify().or_insert() to keep the maximum size, preventing buffer overflow when a larger expression runs after a smaller one was processed. 2. Fix integer overflow: Guard against f64->isize overflow in expr_to_index by checking val is in range [1, isize::MAX]. 3. Add subscript_to_index helper: Extract verbose dimension-to-index conversion logic into a reusable helper method with clear documentation explaining that indexed dimensions store 1-based indices as strings. 4. Improve compute_range_size documentation: Clarify that None is returned for both non-constant bounds and invalid ranges, with the latter being validated later in build_view_from_ops.

claude · 2025-12-28T15:35:47Z

Code Review for PR #133

This is a substantial refactoring that successfully completes the Expr3 pipeline migration. The overall approach is sound and the code quality is good. Below are my findings organized by category.

Strengths

Significant code reduction: Removing ~570 net lines (1381 deletions, 834 additions) while maintaining all 383 tests is excellent. The removal of legacy functions like lower_late, convert_expr3_to_expr2, and the old normalize_subscripts demonstrates a clean consolidation.
Unified pipeline: The new lower() function cleanly separates concerns:
- Pass 0: structural normalization
- Expr3 conversion: wildcard resolution
- Pass 1: temp decomposition
- Final lowering via lower_from_expr3
Critical ArrayBounds fix: The addition of compute_range_size() and expr_to_index() in expr2.rs properly calculates range subscript sizes (e.g., 3:5 gives size 3) instead of using the full dimension size. This prevents index out of bounds errors.
Well-documented: The new helper functions have clear docstrings explaining their behavior and edge cases.
Proper error handling: The expr_to_index function includes overflow guards which is good defensive programming.

Potential Issues

1. Return type change requires careful migration (Medium Priority)

The lower() function now returns Result<Vec<Expr>> instead of Result<Expr>. I verified call sites handle this, but the .unwrap() on exprs.pop() assumes the Vec is non-empty. Consider adding a debug assertion:

debug_assert!(!exprs.is_empty(), "lower() must return at least one expression");
let main_expr = exprs.pop().unwrap();

2. Overflow handling inconsistency (Low Priority)

In expr2.rs:346, the use of saturating_sub(1) means that val = 0.5 would cast to 0 then saturate to 0 instead of returning None. Values in (0, 1) would incorrectly map to index 0. Consider checking val.fract() == 0.0 or document that non-integer values are truncated per system dynamics conventions.

3. Potential panic in subscript_to_index (Medium Priority)

In compiler.rs:725-734, the function uses .unwrap_or(1.0) as fallback which silently defaults to index 1 if parsing fails. Given this is during compilation, consider propagating the error or using a descriptive panic message.

4. Dead code annotations with comments (Low Priority)

Several items have #[allow(dead_code)] // Used in pass 2 annotations. Consider adding a tracking issue reference or implementing pass 2 features soon to avoid accumulating future-use code.

Code Quality Observations

Good separation of concerns: The new lower_from_expr3() handles all Expr3 variants directly without round-tripping through Expr2.
Test coverage: Consider adding specific tests for range subscript edge cases (e.g., 1:1, end < start), non-integer index values, and the new Vec return behavior with temp assignments.

Minor Issues

Typo in comment (compiler.rs:1065): / Pass 0: should be // Pass 0:
Potential optimization: In compute_range_size, you call expr_to_index twice. If on a hot path, consider a single helper returning both indices.

Security and Performance

No security concerns identified
Performance should improve slightly due to fewer allocations
The reduction in code paths should make the compiler slightly faster

Recommendation

Approve with minor suggestions. This is solid work that genuinely improves the codebase. The issues raised are mostly edge cases and defensive improvements rather than blocking problems. The test suite passing (383 tests) gives good confidence.

The key suggestion is to add defensive assertions around the Vec unwrap operations and consider documenting the integer conversion behavior for indices.

Great work on this refactoring! The ~570 line reduction with zero test regressions is impressive.

github-actions · 2025-12-28T15:39:33Z

❌ Visual regression tests failed. View artifacts to see the differences.

chatgpt-codex-connector bot reviewed Dec 28, 2025

View reviewed changes

bpowers merged commit 43bc7aa into main Dec 28, 2025
11 of 13 checks passed

bpowers deleted the claude/cleanup-array-pass1-mLwN4 branch December 28, 2025 19:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

engine: complete Expr3 pipeline migration, remove lower_late #133

engine: complete Expr3 pipeline migration, remove lower_late #133

bpowers commented Dec 28, 2025

Uh oh!

bpowers commented Dec 28, 2025

Uh oh!

claude bot commented Dec 28, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Dec 28, 2025

Uh oh!

codecov bot commented Dec 28, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Dec 28, 2025

Uh oh!

claude bot commented Dec 28, 2025

Uh oh!

github-actions bot commented Dec 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

engine: complete Expr3 pipeline migration, remove lower_late #133

engine: complete Expr3 pipeline migration, remove lower_late #133

Conversation

bpowers commented Dec 28, 2025

Uh oh!

bpowers commented Dec 28, 2025

Uh oh!

claude bot commented Dec 28, 2025

Code Review: PR #133 - Complete Expr3 Pipeline Migration

Summary

✅ Strengths

🔍 Issues and Concerns

🎯 Performance Considerations

🔒 Security Considerations

🧪 Test Coverage

📝 Code Quality

✨ Final Verdict

Action Items (Optional Improvements)

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Dec 28, 2025

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Dec 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Dec 28, 2025

Uh oh!

claude bot commented Dec 28, 2025

Code Review for PR #133

Strengths

Potential Issues

Code Quality Observations

Minor Issues

Security and Performance

Recommendation

Uh oh!

github-actions bot commented Dec 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov bot commented Dec 28, 2025 •

edited

Loading