Null pointer dereference in array.array.tofile via reentrant writer #142884

@jackfromeast

What happened?

array.array.tofile computes its block count (nblocks) once, before the first call to the sink's write method, and never rechecks the array's size or buffer afterwards. A reentrant writer can clear the array during the first callback, which leaves self->ob_item NULL. The loop still runs with the stale count, so the next iteration passes NULL + i*BLOCKSIZE to PyBytes_FromStringAndSize, which crashes in its memcpy.

Proof of Concept:

import array

CHUNK = 64 * 1024
victim = array.array('B', b'\0' * (CHUNK * 2))

class Writer:
    armed = True
    def write(self, data):
        if Writer.armed:
            Writer.armed = False
            victim.clear()
        return 0

victim.tofile(Writer())
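
Because the PoC takes down the interpreter that runs it, it can be convenient to launch it in a child process and inspect the exit status from a surviving parent. The sketch below does that; the helper name run_poc and the 60-second timeout are illustrative choices, not part of the original report. It makes no claim about what a given build returns: a negative return code would indicate the child was killed by a signal, while a patched build may exit differently. One further observation that is inferred rather than stated in the report: array.array.clear() was added in Python 3.13, so on earlier versions the PoC as written probably stops with an AttributeError, which may explain the "Exception" rows in the table below.

```python
import subprocess
import sys

# The PoC from the report, kept verbatim so the child runs exactly that script.
POC = r"""
import array

CHUNK = 64 * 1024
victim = array.array('B', b'\0' * (CHUNK * 2))

class Writer:
    armed = True
    def write(self, data):
        if Writer.armed:
            Writer.armed = False
            victim.clear()
        return 0

victim.tofile(Writer())
"""


def run_poc(python=sys.executable, timeout=60):
    """Run the PoC in a child interpreter and return its exit status.

    `run_poc` is a hypothetical helper for illustration. On POSIX, a
    negative value means the child died from a signal.
    """
    proc = subprocess.run(
        [python, "-c", POC],
        capture_output=True,  # keep the child's traceback/ASAN report out of our output
        timeout=timeout,
    )
    return proc.returncode


if __name__ == "__main__":
    print("child exit status:", run_poc())
```

Passing a different interpreter path as the python argument lets the same harness be pointed at each build under test.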

Affected Versions:

Python Version | Status | Exit Code
Python 3.9.24+ (heads/3.9:111bbc15b26, Oct 28 2025, 16:51:20) | Exception | 1
Python 3.10.19+ (heads/3.10:014261980b1, Oct 28 2025, 16:52:08) [Clang 18.1.3 (1ubuntu1)] | Exception | 1
Python 3.11.14+ (heads/3.11:88f3f5b5f11, Oct 28 2025, 16:53:08) [Clang 18.1.3 (1ubuntu1)] | Exception | 1
Python 3.12.12+ (heads/3.12:8cb2092bd8c, Oct 28 2025, 16:54:14) [Clang 18.1.3 (1ubuntu1)] | Exception | 1
Python 3.13.9+ (heads/3.13:9c8eade20c6, Oct 28 2025, 16:55:18) [Clang 18.1.3 (1ubuntu1)] | ASAN | 1
Python 3.14.0+ (heads/3.14:2e216728038, Oct 28 2025, 16:56:16) [Clang 18.1.3 (1ubuntu1)] | ASAN | 1
Python 3.15.0a1+ (heads/main:f5394c257ce, Oct 28 2025, 19:29:54) [GCC 13.3.0] | ASAN | 1

Vulnerable Code:

static PyObject *
array_array_tofile_impl(arrayobject *self, PyTypeObject *cls, PyObject *f)
/*[clinic end generated code: output=4560c628d9c18bc2 input=5a24da7a7b407b52]*/
{
    Py_ssize_t nbytes = Py_SIZE(self) * self->ob_descr->itemsize;
    /* Write 64K blocks at a time */
    /* XXX Make the block size settable */
    int BLOCKSIZE = 64*1024;
    /* Bug: nblocks is computed once here and never revalidated */
    Py_ssize_t nblocks = (nbytes + BLOCKSIZE - 1) / BLOCKSIZE;
    Py_ssize_t i;

    if (Py_SIZE(self) == 0)
        goto done;


    array_state *state = get_array_state_by_class(cls);
    assert(state != NULL);

    for (i = 0; i < nblocks; i++) {
        /* After a reentrant write() clears the array, self->ob_item is NULL here */
        char* ptr = self->ob_item + i*BLOCKSIZE;
        Py_ssize_t size = BLOCKSIZE;
        PyObject *bytes, *res;

        if (i*BLOCKSIZE + size > nbytes)
            size = nbytes - i*BLOCKSIZE;
        bytes = PyBytes_FromStringAndSize(ptr, size);
        if (bytes == NULL)
            return NULL;
        res = PyObject_CallMethodOneArg(f, state->str_write, bytes);
        Py_DECREF(bytes);
        if (res == NULL)
            return NULL;
        Py_DECREF(res); /* drop write result */
    }

  done:
    Py_RETURN_NONE;
}
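
To illustrate the control-flow problem without touching C, here is a small Python model of the same chunked loop. It is a sketch under stated assumptions, not the CPython fix: tofile_model and ClearingWriter are hypothetical names, a bytearray stands in for the array's buffer, and the model simply re-reads the current length on every pass instead of trusting a count cached before the first callback. Python slicing is memory-safe either way, so the point is only to show where the re-check sits.

```python
BLOCKSIZE = 64 * 1024


def tofile_model(buf, f):
    """Write buf to f in BLOCKSIZE chunks, re-reading len(buf) on every pass.

    Hypothetical model of the loop; the vulnerable C code instead computes
    the block count once, before any callback has run.
    """
    offset = 0
    while offset < len(buf):  # length is re-evaluated after each write()
        f.write(bytes(buf[offset:offset + BLOCKSIZE]))
        offset += BLOCKSIZE


class ClearingWriter:
    """Reentrant sink that empties the buffer during write(), like the PoC."""

    def __init__(self, target):
        self.target = target
        self.calls = 0

    def write(self, data):
        self.calls += 1
        self.target.clear()
        return 0


victim = bytearray(BLOCKSIZE * 2)
writer = ClearingWriter(victim)
tofile_model(victim, writer)
print(writer.calls)  # 1: the loop stops once the buffer is empty
```

An actual patch could take a different route, for example raising an error when the array changes size during the call; the report does not say which approach is intended.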

Sanitizer Output:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==1299008==ERROR: AddressSanitizer: SEGV on unknown address 0x000000010000 (pc 0x745059388a4d bp 0x7ffeb5f943e0 sp 0x7ffeb5f943b8 T0)
==1299008==The signal is caused by a READ memory access.
    #0 0x745059388a4d in __memmove_avx_unaligned_erms ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
    #1 0x5e74a5aadddd in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
    #2 0x5e74a5aadddd in PyBytes_FromStringAndSize Objects/bytesobject.c:162
    #3 0x745058bca48b in array_array_tofile_impl Modules/arraymodule.c:1611
    #4 0x745058bca675 in array_array_tofile Modules/clinic/arraymodule.c.h:472
    #5 0x5e74a5ade3c9 in method_vectorcall_FASTCALL_KEYWORDS_METHOD Objects/descrobject.c:381
    #6 0x5e74a5abee7f in _PyObject_VectorcallTstate Include/internal/pycore_call.h:169
    #7 0x5e74a5abef72 in PyObject_Vectorcall Objects/call.c:327
    #8 0x5e74a5d3d056 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1620
    #9 0x5e74a5d80e54 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:121
    #10 0x5e74a5d81148 in _PyEval_Vector Python/ceval.c:2001
    #11 0x5e74a5d813f8 in PyEval_EvalCode Python/ceval.c:884
    #12 0x5e74a5e78507 in run_eval_code_obj Python/pythonrun.c:1365
    #13 0x5e74a5e78723 in run_mod Python/pythonrun.c:1459
    #14 0x5e74a5e7957a in pyrun_file Python/pythonrun.c:1293
    #15 0x5e74a5e7c220 in _PyRun_SimpleFileObject Python/pythonrun.c:521
    #16 0x5e74a5e7c4f6 in _PyRun_AnyFileObject Python/pythonrun.c:81
    #17 0x5e74a5ecd74d in pymain_run_file_obj Modules/main.c:410
    #18 0x5e74a5ecd9b4 in pymain_run_file Modules/main.c:429
    #19 0x5e74a5ecf1b2 in pymain_run_python Modules/main.c:691
    #20 0x5e74a5ecf842 in Py_RunMain Modules/main.c:772
    #21 0x5e74a5ecfa2e in pymain_main Modules/main.c:802
    #22 0x5e74a5ecfdb3 in Py_BytesMain Modules/main.c:826
    #23 0x5e74a5953645 in main Programs/python.c:15
    #24 0x74505922a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #25 0x74505922a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #26 0x5e74a5953574 in _start (/home/jackfromeast/Desktop/entropy/targets/grammar-afl++-latest/targets/cpython/python+0x2dd574) (BuildId: 202d5dbb945f6d5f5a66ad50e2688d56affd6ecb)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265 in __memmove_avx_unaligned_erms
==1299008==ABORTING
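
The faulting address in the report lines up with the loop arithmetic in the vulnerable code. As a quick check, assuming self->ob_item is NULL (address 0) after the clear, the address read is ob_item + i*BLOCKSIZE, and the reported 0x000000010000 corresponds to the second block:

```python
BLOCKSIZE = 64 * 1024           # block size used by array_array_tofile_impl
fault_address = 0x000000010000  # address from the AddressSanitizer SEGV line
ob_item = 0                     # assumption: pointer is NULL after victim.clear()

# Solve fault_address = ob_item + i * BLOCKSIZE for the loop index i.
i, remainder = divmod(fault_address - ob_item, BLOCKSIZE)
print(i, remainder)  # 1 0
```

An index of 1 with no remainder matches a crash on the iteration immediately after the first write() callback cleared the array.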


Labels: extension-modules (C modules in the Modules dir), type-crash (A hard crash of the interpreter, possibly with a core dump)
