@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 25% (0.25x) speedup for TxInput.to_json in electrum/transaction.py

⏱️ Runtime : 780 microseconds → 626 microseconds (best of 39 runs)

📝 Explanation and details

The optimization replaces the BCDataStream-based witness parsing with a direct byte-level parser that eliminates object creation and method call overhead.

Key optimizations:

  1. Inlined compact-size decoding: The original code creates a BCDataStream object and calls methods for each compact-size read. The optimized version directly parses the Bitcoin compact-size encoding using int.from_bytes() and pointer arithmetic, eliminating multiple object method calls.
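For illustration, inlined compact-size decoding along these lines (a hypothetical sketch, not the PR's exact code) might look like:

```python
def read_compact_size(buf: bytes, p: int) -> tuple:
    """Decode a Bitcoin compact-size integer at offset p; return (value, new_offset)."""
    first = buf[p]  # raises IndexError if p is past the end
    if first < 253:
        return first, p + 1                        # single-byte value
    width = {253: 2, 254: 4, 255: 8}[first]        # 0xfd/0xfe/0xff prefixes
    end = p + 1 + width
    if end > len(buf):
        raise ValueError("truncated compact size")
    return int.from_bytes(buf[p + 1:end], "little"), end
```

Because `int.from_bytes()` is a single C-level call, this avoids the per-read method dispatch of a stream wrapper.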

  2. Single-pass parsing: Instead of creating a BCDataStream wrapper and calling read_compact_size() multiple times (once for element count, once per element length), the optimized code processes the witness bytes in a single pass with manual pointer tracking.

  3. Fallback safety: The optimization uses a try-except block to fall back to the original BCDataStream approach if any parsing error occurs, ensuring identical behavior for malformed data.
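Points 2 and 3 combined could be sketched as follows (function and helper names are hypothetical; in the real code the `except` branch re-parses with `BCDataStream` instead of re-raising):

```python
def witness_elements(witness: bytes) -> list:
    """Single-pass witness parse: element count, then (length, data) pairs."""
    def compact(p):
        # inlined compact-size decode (0xfd/0xfe/0xff prefixes)
        first = witness[p]
        if first < 253:
            return first, p + 1
        width = {253: 2, 254: 4, 255: 8}[first]
        end = p + 1 + width
        if end > len(witness):
            raise ValueError("truncated compact size")
        return int.from_bytes(witness[p + 1:end], "little"), end

    try:
        n, p = compact(0)                  # element count
        elements = []
        for _ in range(n):
            length, p = compact(p)         # per-element length
            if p + length > len(witness):
                raise ValueError("truncated witness element")
            elements.append(witness[p:p + length])
            p += length
        return elements
    except Exception:
        # real code: fall back to the original BCDataStream-based parser here
        raise
```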

Performance impact:

  • Line profiler shows the optimized witness_elements() method runs in 1.07ms vs 1.31ms (19% faster)
  • Tests with witness data show dramatic improvements: up to 132% faster for large witness stacks (100+ elements)
  • Simple witness parsing cases show 80-100% speedups
  • Non-witness cases remain virtually unchanged in performance

Why this works:
Bitcoin's compact-size encoding is simple enough that direct parsing with int.from_bytes() is much faster than the generic BCDataStream wrapper. The optimization eliminates per-element object method overhead while maintaining exact behavioral compatibility through the fallback mechanism.
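For reference, the encoding side of the same format — useful when hand-building witness test vectors like those in the generated tests below (a sketch assuming standard Bitcoin compact-size rules):

```python
def write_compact_size(n: int) -> bytes:
    """Encode n as a Bitcoin compact-size integer."""
    if n < 0xfd:
        return bytes([n])                          # 1 byte
    if n <= 0xffff:
        return b"\xfd" + n.to_bytes(2, "little")   # 0xfd + uint16
    if n <= 0xffffffff:
        return b"\xfe" + n.to_bytes(4, "little")   # 0xfe + uint32
    return b"\xff" + n.to_bytes(8, "little")       # 0xff + uint64
```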

This optimization is particularly valuable for transaction processing workloads where witness parsing occurs frequently, as evidenced by the substantial speedups in witness-heavy test cases.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2083 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import binascii

# imports
import pytest
from electrum.transaction import TxInput

# function to test (provided above)
# (Assume the code block above is present and correct, including TxInput, BCDataStream, etc.)

# Helper class for TxOutpoint (minimal implementation for tests)
class TxOutpoint:
    def __init__(self, txid: bytes, out_idx: int):
        self.txid = txid
        self.out_idx = out_idx

# ----------------------
# BASIC TEST CASES
# ----------------------

def test_basic_minimal_input():
    # Test a minimal TxInput with only required fields
    prevout = TxOutpoint(b'\x01'*32, 0)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.74μs -> 1.78μs (2.36% slower)

def test_basic_with_scriptsig():
    # Test TxInput with a non-empty scriptSig
    prevout = TxOutpoint(b'\x02'*32, 1)
    script_sig = b'\x6a\x14'*10  # arbitrary script
    txin = TxInput(prevout=prevout, script_sig=script_sig)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.46μs -> 1.39μs (4.60% faster)

def test_basic_with_coinbase_flag():
    # Test TxInput with coinbase output flag set
    prevout = TxOutpoint(b'\x03'*32, 2)
    txin = TxInput(prevout=prevout, is_coinbase_output=True)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.06μs -> 1.14μs (6.66% slower)

def test_basic_with_nsequence():
    # Test TxInput with custom nsequence
    prevout = TxOutpoint(b'\x04'*32, 3)
    txin = TxInput(prevout=prevout, nsequence=123456)
    codeflash_output = txin.to_json(); result = codeflash_output # 928ns -> 1.02μs (8.93% slower)

# ----------------------
# EDGE TEST CASES
# ----------------------

def test_edge_prevout_hash_all_zero():
    # prevout txid is all zeros
    prevout = TxOutpoint(b'\x00'*32, 0)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 980ns -> 1.05μs (6.58% slower)

def test_edge_prevout_hash_all_ff():
    # prevout txid is all 0xff
    prevout = TxOutpoint(b'\xff'*32, 0)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 974ns -> 1.06μs (7.77% slower)

def test_edge_prevout_n_max():
    # prevout_n is at maximum uint32
    prevout = TxOutpoint(b'\x10'*32, 0xffffffff)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 997ns -> 1.03μs (3.58% slower)

def test_edge_scriptsig_empty_bytes():
    # scriptSig is empty bytes, should be serialized as empty string
    prevout = TxOutpoint(b'\x20'*32, 0)
    txin = TxInput(prevout=prevout, script_sig=b'')
    codeflash_output = txin.to_json(); result = codeflash_output # 1.23μs -> 1.30μs (5.30% slower)

def test_edge_scriptsig_large():
    # scriptSig is large (but <1000 bytes)
    prevout = TxOutpoint(b'\x21'*32, 0)
    script_sig = b'\x6a'*999
    txin = TxInput(prevout=prevout, script_sig=script_sig)
    codeflash_output = txin.to_json(); result = codeflash_output # 2.09μs -> 1.94μs (7.72% faster)

def test_edge_nsequence_zero():
    # nsequence is zero
    prevout = TxOutpoint(b'\x22'*32, 0)
    txin = TxInput(prevout=prevout, nsequence=0)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.04μs -> 994ns (4.73% faster)

def test_edge_nsequence_max():
    # nsequence is maximum uint32
    prevout = TxOutpoint(b'\x23'*32, 0)
    txin = TxInput(prevout=prevout, nsequence=0xffffffff)
    codeflash_output = txin.to_json(); result = codeflash_output # 958ns -> 1.02μs (6.54% slower)

def test_edge_witness_empty():
    # witness is empty bytes (should produce empty witness list)
    prevout = TxOutpoint(b'\x24'*32, 0)
    txin = TxInput(prevout=prevout, witness=b'')
    codeflash_output = txin.to_json(); result = codeflash_output # 2.01μs -> 2.13μs (5.41% slower)

def test_edge_witness_single_element():
    # witness with one element (e.g. [b'\x01\x02\x03'])
    prevout = TxOutpoint(b'\x25'*32, 0)
    # witness encoding: 1 element of length 3: 0x01 (count), 0x03 (len), 0x01 0x02 0x03 (data)
    witness = b'\x01\x03\x01\x02\x03'
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 8.90μs -> 4.08μs (118% faster)

def test_edge_witness_multiple_elements():
    # witness with multiple elements (e.g. [b'\x01', b'\x02\x03'])
    prevout = TxOutpoint(b'\x26'*32, 0)
    # witness encoding: 0x02 (count), 0x01 (len), 0x01, 0x02 (len), 0x02 0x03
    witness = b'\x02\x01\x01\x02\x02\x03'
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 7.58μs -> 3.87μs (95.7% faster)

def test_edge_witness_max_elements():
    # witness with many elements (e.g. 10 elements of 1 byte each)
    prevout = TxOutpoint(b'\x27'*32, 0)
    elements = [bytes([i]) for i in range(10)]
    # build witness encoding: count byte, then for each: length byte, data
    witness = bytes([10]) + b''.join([b'\x01' + bytes([i]) for i in range(10)])
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 10.4μs -> 5.32μs (95.7% faster)

def test_edge_coinbase_true_false():
    # Test both coinbase True and False
    prevout = TxOutpoint(b'\x28'*32, 0)
    txin1 = TxInput(prevout=prevout, is_coinbase_output=True)
    txin2 = TxInput(prevout=prevout, is_coinbase_output=False)
    txin1.to_json()
    txin2.to_json()

def test_edge_prevout_n_negative():
    # Negative prevout_n should be allowed by the class, but test what happens
    prevout = TxOutpoint(b'\x29'*32, -1)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.07μs -> 1.07μs (0.093% slower)

def test_edge_scriptsig_non_ascii():
    # scriptSig with non-ASCII bytes
    prevout = TxOutpoint(b'\x30'*32, 0)
    script_sig = b'\xff\xfe\xfd\xfc'
    txin = TxInput(prevout=prevout, script_sig=script_sig)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.27μs -> 1.20μs (5.15% faster)

def test_edge_witness_invalid_encoding():
    # witness with invalid encoding (count byte > actual elements)
    prevout = TxOutpoint(b'\x31'*32, 0)
    # count is 2, but only 1 element present
    witness = b'\x02\x01\x01'
    txin = TxInput(prevout=prevout, witness=witness)
    with pytest.raises(Exception):
        txin.to_json() # 8.64μs -> 11.5μs (24.8% slower)

def test_edge_witness_zero_elements():
    # witness with zero elements (count=0)
    prevout = TxOutpoint(b'\x32'*32, 0)
    witness = b'\x00'
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 5.53μs -> 3.06μs (81.1% faster)

# ----------------------
# LARGE SCALE TEST CASES
# ----------------------

def test_large_scriptsig():
    # scriptSig of 999 bytes
    prevout = TxOutpoint(b'\x40'*32, 0)
    script_sig = b'\x00'*999
    txin = TxInput(prevout=prevout, script_sig=script_sig)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.98μs -> 2.01μs (1.49% slower)

def test_large_witness_elements():
    # witness with 100 elements, each 8 bytes
    prevout = TxOutpoint(b'\x41'*32, 0)
    n = 100
    elements = [bytes([i]*8) for i in range(n)]
    # encode: count byte, then for each: length byte, data
    witness = bytes([n]) + b''.join([b'\x08' + bytes([i]*8) for i in range(n)])
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 52.2μs -> 22.5μs (132% faster)

def test_large_prevout_n():
    # prevout_n at maximum 32-bit unsigned int
    prevout = TxOutpoint(b'\x42'*32, 0xffffffff)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.10μs -> 1.24μs (11.2% slower)

def test_large_multiple_inputs():
    # Simulate 1000 TxInputs and check serialization
    prevout = TxOutpoint(b'\x43'*32, 0)
    inputs = [
        TxInput(prevout=TxOutpoint(bytes([i%256])*32, i), script_sig=bytes([i%256]*10))
        for i in range(1000)
    ]
    # Just check that all to_json calls succeed and output is correct
    for i, txin in enumerate(inputs):
        codeflash_output = txin.to_json(); result = codeflash_output # 437μs -> 446μs (2.00% slower)

def test_large_witness_element_size():
    # witness with a single element of 999 bytes
    prevout = TxOutpoint(b'\x44'*32, 0)
    witness = b'\x01' + b'\xfd\xe7\x03' + b'\x55'*999  # 0x01 (count), 0xfd e7 03 (compact size 999), 999 bytes
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 9.79μs -> 5.33μs (83.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Optional, Sequence, Union

# imports
import pytest  # used for our unit tests
from electrum.transaction import TxInput

# --- Helper classes for the test suite ---

class SerializationError(Exception):
    pass

class TxOutpoint:
    """Minimal stub for TxOutpoint used in TxInput."""
    def __init__(self, txid: bytes, out_idx: int):
        self.txid = txid
        self.out_idx = out_idx

# --- Unit Tests for TxInput.to_json ---

# ----------- BASIC TEST CASES -----------

def test_basic_minimal_input():
    # Minimal TxInput, no scriptSig, no witness, not coinbase
    txid = bytes.fromhex('00'*32)
    out_idx = 0
    prevout = TxOutpoint(txid, out_idx)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.20μs -> 1.34μs (10.2% slower)

def test_basic_with_script_sig():
    # TxInput with scriptSig
    txid = bytes.fromhex('11'*32)
    prevout = TxOutpoint(txid, 1)
    script_sig = b'\x51\x52\x53'
    txin = TxInput(prevout=prevout, script_sig=script_sig)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.20μs -> 1.32μs (9.44% slower)

def test_basic_coinbase_true():
    # TxInput with coinbase output
    txid = bytes.fromhex('ff'*32)
    prevout = TxOutpoint(txid, 2)
    txin = TxInput(prevout=prevout, is_coinbase_output=True)
    codeflash_output = txin.to_json(); result = codeflash_output # 972ns -> 1.06μs (8.39% slower)

def test_basic_with_witness_single_element():
    # Witness with a single element
    txid = bytes.fromhex('22'*32)
    prevout = TxOutpoint(txid, 3)
    # witness: 1 element, length 3, value b'\x01\x02\x03'
    witness = bytes([1, 3]) + b'\x01\x02\x03'
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 8.32μs -> 4.27μs (94.8% faster)

def test_basic_with_witness_multiple_elements():
    # Witness with two elements
    txid = bytes.fromhex('33'*32)
    prevout = TxOutpoint(txid, 4)
    # witness: 2 elements, first length 2 (b'\xaa\xbb'), second length 1 (b'\xcc')
    witness = bytes([2, 2]) + b'\xaa\xbb' + bytes([1]) + b'\xcc'
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 7.26μs -> 3.96μs (83.2% faster)

def test_basic_with_script_sig_and_witness():
    # Both scriptSig and witness present
    txid = bytes.fromhex('44'*32)
    prevout = TxOutpoint(txid, 5)
    script_sig = b'\x00\xff'
    witness = bytes([1, 1]) + b'\x99'
    txin = TxInput(prevout=prevout, script_sig=script_sig, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 6.83μs -> 3.96μs (72.1% faster)

# ----------- EDGE TEST CASES -----------

def test_edge_prevout_idx_max():
    # prevout index at max 32-bit unsigned int
    txid = bytes.fromhex('55'*32)
    prevout = TxOutpoint(txid, 0xffffffff)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 969ns -> 1.12μs (13.6% slower)

def test_edge_prevout_hash_all_ff():
    # prevout hash is all ff
    txid = bytes.fromhex('ff'*32)
    prevout = TxOutpoint(txid, 0)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.01μs -> 1.05μs (2.96% slower)

def test_edge_script_sig_empty_bytes():
    # scriptSig is empty bytes
    txid = bytes.fromhex('66'*32)
    prevout = TxOutpoint(txid, 6)
    txin = TxInput(prevout=prevout, script_sig=b'')
    codeflash_output = txin.to_json(); result = codeflash_output # 1.23μs -> 1.20μs (2.92% faster)

def test_edge_witness_empty_bytes():
    # witness is empty bytes (should not add 'witness' key)
    txid = bytes.fromhex('77'*32)
    prevout = TxOutpoint(txid, 7)
    txin = TxInput(prevout=prevout, witness=b'')
    codeflash_output = txin.to_json(); result = codeflash_output # 1.87μs -> 2.01μs (7.06% slower)

def test_edge_witness_zero_elements():
    # witness encoding for zero elements
    txid = bytes.fromhex('88'*32)
    prevout = TxOutpoint(txid, 8)
    witness = bytes([0])
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 5.98μs -> 2.98μs (101% faster)

def test_edge_nsequence_min_max():
    # nsequence at minimum and maximum
    txid = bytes.fromhex('99'*32)
    prevout = TxOutpoint(txid, 9)
    txin_min = TxInput(prevout=prevout, nsequence=0)
    txin_max = TxInput(prevout=prevout, nsequence=0xffffffff)
    txin_min.to_json()
    txin_max.to_json()

def test_edge_witness_compact_size_253():
    # witness element count encoded as 253 (should read <H> for count)
    txid = bytes.fromhex('aa'*32)
    prevout = TxOutpoint(txid, 10)
    import struct

    # 253 elements, all empty
    witness = bytes([253]) + struct.pack('<H', 253) + b''.join([bytes([0]) for _ in range(253)])
    txin = TxInput(prevout=prevout, witness=witness)
    # Each element is empty, so hex is ''
    codeflash_output = txin.to_json(); result = codeflash_output # 109μs -> 43.1μs (153% faster)

def test_edge_witness_compact_size_254():
    # witness element count encoded as 254 (should read <I> for count)
    txid = bytes.fromhex('bb'*32)
    prevout = TxOutpoint(txid, 11)
    import struct

    # 5 elements, encoded as 254 + <I> 5
    witness = bytes([254]) + struct.pack('<I', 5) + b''.join([bytes([0]) for _ in range(5)])
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 9.57μs -> 5.18μs (84.6% faster)

def test_edge_witness_compact_size_255():
    # witness element count encoded as 255 (should read <Q> for count)
    txid = bytes.fromhex('cc'*32)
    prevout = TxOutpoint(txid, 12)
    import struct

    # 3 elements, encoded as 255 + <Q> 3
    witness = bytes([255]) + struct.pack('<Q', 3) + b''.join([bytes([0]) for _ in range(3)])
    txin = TxInput(prevout=prevout, witness=witness)
    codeflash_output = txin.to_json(); result = codeflash_output # 8.32μs -> 4.60μs (81.0% faster)




def test_large_scale_prevout_hash_and_idx():
    # Large prevout hash and max index
    txid = bytes.fromhex('ab'*32)
    prevout = TxOutpoint(txid, 0xffffffff)
    txin = TxInput(prevout=prevout)
    codeflash_output = txin.to_json(); result = codeflash_output # 1.71μs -> 1.74μs (1.84% slower)


def test_large_scale_script_sig_empty_and_witness_large():
    # Empty scriptSig and large witness
    txid = bytes.fromhex('de'*32)
    prevout = TxOutpoint(txid, 17)
    witness_bytes = bytes([100]) + b''.join([bytes([1]) + bytes([i % 256]) for i in range(100)])
    txin = TxInput(prevout=prevout, script_sig=b'', witness=witness_bytes)
    codeflash_output = txin.to_json(); result = codeflash_output # 50.8μs -> 21.8μs (132% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from electrum.transaction import TxInput
from electrum.transaction import TxOutpoint

def test_TxInput_to_json():
    TxInput.to_json(TxInput(prevout=TxOutpoint((v1 := b''), 0), script_sig=v1, nsequence=0, witness=v1, is_coinbase_output=False))
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_6p7ovzz5/tmp0ciim7lw/test_concolic_coverage.py::test_TxInput_to_json | 2.32μs | 2.42μs | -4.01% ⚠️ |

To edit these changes, run `git checkout codeflash/optimize-TxInput.to_json-mhxotrm3` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 17:13
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 13, 2025