⚡️ Speed up method DiscreteDP.compute_greedy by 17%
#73
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 17% (0.17x) speedup for
DiscreteDP.compute_greedyinquantecon/markov/ddp.py⏱️ Runtime :
513 microseconds→437 microseconds(best of135runs)📝 Explanation and details
The optimized code achieves a 17% speedup by replacing pure Python loops in the state-wise maximization operations with Numba JIT-compiled functions (
@njit(cache=True)).Key Optimizations:
Numba JIT Compilation for State-wise Max Operations: The original code called
_s_wise_maxand_s_wise_max_argmaxdirectly (which are already Numba-compiled inutilities.py), but added wrapper functions_njit_s_wise_max_1dand_njit_s_wise_max_argmax_1dthat are explicitly JIT-compiled. More importantly, for the 2D case (product formulation), it replaced NumPy'svals.max(axis=1)andvals.argmax(axis=1)with a custom_njit_s_wise_max_2dfunction that uses explicit loops compiled by Numba.Why This Is Faster:
np.max(axis=1)andnp.argmax(axis=1)incur Python interpreter overhead and temporary array allocations. The Numba-compiled loop directly iterates over the data with no intermediate allocations.@njit(cache=True)means the compiled machine code is cached to disk, eliminating compilation overhead on subsequent runs.Performance Impact by Test Case:
_njit_s_wise_max_2dreplacing NumPy operations.Workload Considerations:
The optimization is particularly beneficial when:
bellman_operatororcompute_greedyare called repeatedly (e.g., in value iteration, policy iteration loops)The line profiler shows that 95-99% of time is spent in
s_wise_max, making it the critical hot path. By optimizing this bottleneck with JIT compilation, the overall runtime improves significantly.✅ Correctness verification report:
🌀 Click to see Generated Regression Tests
To edit these changes
git checkout codeflash/optimize-DiscreteDP.compute_greedy-mjw20yktand push.