⚡️ Speed up function _gridmake2 by 106%
#70
📄 106% (1.06x) speedup for `_gridmake2` in `quantecon/_ce_util.py`
⏱️ Runtime: 2.33 milliseconds → 1.13 milliseconds (best of 5 runs)

📝 Explanation and details
The optimized code achieves a 106% speedup by replacing NumPy's high-level array operations (`np.tile`, `np.repeat`, `np.column_stack`) with Numba JIT-compiled loops that construct the output array directly.

Key Optimizations
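For reference, the NumPy formulation being replaced can be sketched roughly as follows. This is an illustration for the 1-D by 1-D case only; the name `gridmake2_numpy` is hypothetical and the body is an assumption based on the three calls named above, not a copy of the code in `quantecon/_ce_util.py`.

```python
import numpy as np


def gridmake2_numpy(x1, x2):
    """Cartesian product of two 1-D arrays, with x1 varying fastest."""
    # Temporary array: x1 laid end to end, once per element of x2.
    first = np.tile(x1, x2.shape[0])
    # Temporary array: each element of x2 repeated len(x1) times.
    second = np.repeat(x2, x1.shape[0])
    # Final copy of both temporaries into one (n1 * n2, 2) array.
    return np.column_stack([first, second])


grid = gridmake2_numpy(np.array([1.0, 2.0, 3.0]), np.array([10.0, 20.0]))
print(grid)
```

Each of the three calls allocates its own array, so the two temporaries exist only to be copied into the result.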
1. Numba JIT Compilation (@njit)
Two specialized helper functions (`_gridmake2_1d_1d` and `_gridmake2_nd_1d`) are decorated with `@njit`, enabling machine-code compilation. This eliminates Python interpreter overhead and enables CPU-level optimizations.

2. Direct Memory Allocation
Instead of creating intermediate arrays with `np.tile` and `np.repeat`, then combining them with `np.column_stack`, the optimized code allocates the output array directly and fills it in place.

3. Efficient Memory Access Pattern
The loop iterates over `x2` elements, writing contiguous blocks of `x1` values. This provides good cache locality since sequential memory writes are efficient.

Performance Characteristics
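The loop-based construction described above can be sketched as follows for the 1-D by 1-D case. This is a minimal illustration, not the PR's actual helper: the name `gridmake2_loops` is hypothetical, a `float64` output is assumed, and the fallback decorator is there only so the sketch still runs (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run the same loops in plain Python.
    def njit(func):
        return func


@njit
def gridmake2_loops(x1, x2):
    n1 = x1.shape[0]
    n2 = x2.shape[0]
    # Allocate the result once; no intermediate arrays are created.
    out = np.empty((n1 * n2, 2), dtype=np.float64)
    for j in range(n2):
        start = j * n1  # first row of the block belonging to x2[j]
        for i in range(n1):
            out[start + i, 0] = x1[i]  # x1 varies fastest
            out[start + i, 1] = x2[j]  # x2 is constant within a block
    return out


x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([10.0, 20.0])
grid = gridmake2_loops(x1, x2)
```

Rows are written in increasing order, so each output element is touched exactly once.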
Test Results Show:
Why This Works:
- `np.column_stack` internally copies data multiple times to build the output
- `np.tile` and `np.repeat` each allocate temporary arrays

Impact on Workloads
From `function_references`, `_gridmake2` is called by `gridmake()`, which creates cartesian products for computational economics applications. The function appears to be a building block for grid construction.

Since `gridmake` may call `_gridmake2` multiple times for more than two arrays (chained calls in the loop), the savings from the 106% speedup accumulate across calls when constructing large grids, making this optimization particularly valuable for high-dimensional problems common in quantitative economics.

The optimization is most beneficial when `_gridmake2` is called repeatedly with medium-to-large arrays, which is typical in grid-based computational methods.

✅ Correctness verification report:
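As an illustration of the chained calls described above, along with a quick equivalence check, the sketch below extends a grid one array at a time and compares the result with `itertools.product`. The names `extend_grid` and `gridmake_chain` are hypothetical, and the chaining logic is an assumption about how such a loop could look, not the body of `gridmake()`.

```python
import numpy as np
from itertools import product


def extend_grid(grid, x):
    """Append a column for 1-D array x to an existing (n1, k) grid."""
    n1, k = grid.shape
    n2 = x.shape[0]
    out = np.empty((n1 * n2, k + 1), dtype=np.result_type(grid, x))
    for j in range(n2):
        block = slice(j * n1, (j + 1) * n1)
        out[block, :k] = grid  # existing grid repeated once per value of x
        out[block, k] = x[j]   # new column is constant within a block
    return out


def gridmake_chain(*arrays):
    """Build the full product by chaining pairwise extensions."""
    out = np.asarray(arrays[0]).reshape(-1, 1)
    for arr in arrays[1:]:
        out = extend_grid(out, np.asarray(arr))
    return out


a, b, c = [1, 2], [10, 20, 30], [100, 200]
grid = gridmake_chain(a, b, c)

# itertools.product varies its last argument fastest, so reverse the
# argument order and each tuple to get the "first array fastest" ordering.
expected = np.array([t[::-1] for t in product(c, b, a)])
```

With three inputs the pairwise step runs twice, which is why per-call savings add up as the number of dimensions grows.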
To edit these changes, `git checkout codeflash/optimize-_gridmake2-mjvz595g` and push.