-
Notifications
You must be signed in to change notification settings - Fork 21
[Enhancement] Stop looping when runtime is stable (CF-934) #967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…sistent-loop-break
…sistent-loop-break
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
…sistent-loop-break
…sistent-loop-break
The optimization achieves a **437% speedup** by eliminating expensive function calls and using more efficient algorithms for median calculation and min/max operations. **Key optimizations applied:** 1. **Custom median calculation**: Replaced `statistics.median(recent)` with a custom implementation using `sorted(recent)` and direct indexing. This eliminates the overhead of the statistics module's generic median function. 2. **Reused sorted array**: The sorted array from median calculation is reused for min/max operations (`recent_sorted[0]`, `recent_sorted[-1]`) instead of calling `min(recent)` and `max(recent)` separately, eliminating redundant iterations. 3. **Replaced `statistics.mean()` calls**: Substituted `mean(recent[:half])` and `mean(recent[half:])` with direct `sum()/length` calculations, removing function call overhead. 4. **Early termination optimization**: Changed the `all()` generator expression to an explicit loop with early `break`, avoiding unnecessary iterations when the first non-conforming value is found. **Performance impact analysis:** The line profiler shows dramatic improvements in the most expensive operations: - `first = mean(recent[:half])`: 2.07ms → 47.7μs (98% reduction) - `second = mean(recent[half:])`: 1.54ms → 35.4μs (98% reduction) - `m = median(recent)`: 220μs → eliminated, replaced with ~55μs custom implementation **Hot path significance:** Based on the function reference, `should_stop()` is called in the main test loop (`pytest_runtestloop`) after every iteration to determine if benchmarking should continue. This makes it a critical hot path function where the 437% speedup directly translates to faster benchmark completion times. **Test case performance:** The optimization performs exceptionally well across all test scenarios, with speedups ranging from 400-500% for typical cases involving stable data, fluctuations, and large datasets. Even edge cases with insufficient data show 10-25% improvements.
⚡️ Codeflash found optimizations for this PR📄 438% (4.38x) speedup for
|
…sistent-loop-break
…sistent-loop-break
|
@claude review this |
|
Claude finished @aseembits93's task in 3m 26s —— View job Code Review Complete ✅I've reviewed this PR and identified several issues that need attention, along with some positive observations. Critical Issues 🔴1. Type Hint Error in
|
| raise FileNotFoundError(msg) | ||
|
|
||
| if should_make_copy: | ||
| file_path = path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
related to this PR?
| @@ -0,0 +1,2 @@ | |||
| def calculate_best_summed_runtime(grouped_runtime_info: dict[any, list[int]]) -> int: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entire file for a single function 🥲
| # 2) Window spread is small | ||
| r_max = recent[0] | ||
| r_min = recent[-1] | ||
| spread_ok = (r_max - r_min) / r_min <= spread_rel_tol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r_min could be 0?
|
|
||
| while total_time >= SHORTEST_AMOUNT_OF_TIME: # need to run at least one for normal tests | ||
| count += 1 | ||
| total_time = self._get_total_time(session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see if total_time changes inside this loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't, I checked
|
|
||
| if self.enable_stability_check: | ||
| loop_end = _ORIGINAL_PERF_COUNTER_NS() | ||
| dt = loop_end - loop_start # nano-seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ORIGINAL_PERF_COUNTER_NS() - loop_start
|
|
||
| elapsed += dt | ||
|
|
||
| best_runtime_until_now = calculate_best_summed_runtime(self.usable_runtime_data_by_test_case) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does repeated work?
| window_size = int(STABILITY_WINDOW_SIZE * estimated_total_loops + 0.5) | ||
| if ( | ||
| count >= session.config.option.codeflash_min_loops | ||
| and window_size > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe remove the check here
|
|
||
| count: int = 0 | ||
| runtimes = [] | ||
| elapsed = 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if elapsed is in ns then it's an int
did some experiments for this with (optimize-me & docling), with the current values, I got
Details