Skip to content

Data race on frozendict reads during construction in the free-threading build #151722

@tonghuaroot

Description

@tonghuaroot

Data race on frozendict reads during construction in the free-threading build

Summary

In the free-threading build (--disable-gil), several frozendict (PEP 814)
read paths access the object's internal state without synchronization, on the
assumption that a frozendict is immutable and therefore safe to read
lock-free. That assumption holds once the object is fully built, but not while
it is being constructed from a user-defined mapping
: the half-built
frozendict is already observable from other threads, and its scalar fields and
entry table are being mutated concurrently. ThreadSanitizer reports a genuine
data race, and a reader can observe the object's length and contents change
underneath it, which violates the immutability the type is supposed to
guarantee.

Affected read paths: len(fd) (frozendict_length), repr(fd)
(frozendict_repr), hash(fd) (frozendict_hash). All three read raw
ma_used and/or walk the entry table via _PyDict_Next with no atomic load and
no critical section, while the corresponding mutable-dict paths
(dict_length, dict_repr) use GET_USED() / Py_BEGIN_CRITICAL_SECTION.

How a half-built frozendict becomes observable

frozendict(mapping) for a non-dict mapping takes the generic path
frozendict_newdict_update_commondict_merge, which calls back into
user Python: PyMapping_Keys(b), then PyObject_GetItem(b, key) per key
(Objects/dictobject.c, the generic branch of dict_merge). dict_new already
calls _PyObject_GC_TRACK(d) before the first item is inserted, so from the
very first callback the half-built frozendict is discoverable via
gc.get_objects(). A __getitem__ callback can take a second reference to it
and hand a live alias to another thread, which then reads the same object while
dict_merge keeps inserting and resizing it.

can_modify_dict() treats a frozendict as mutable only while it is uniquely
referenced (PyUnstable_Object_IsUniquelyReferenced(mp)), i.e. it assumes a
half-built frozendict is private. Once user code has run during construction,
that assumption is false.

ThreadSanitizer evidence (main HEAD 9688d252d330b0b586760a121ee8c8f7215176e8)

Build: ./configure --disable-gil --with-thread-sanitizer && make.

len read racing the construction-time size store:

WARNING: ThreadSanitizer: data race
  Read of size 8 ... by thread T1:
    #0 frozendict_length Objects/dictobject.c
    #1 PyObject_Size / builtin_len
  Previous atomic write of size 8 ... by main thread:
    #0 insertdict Objects/dictobject.c            (STORE_USED, already atomic)
    #1 setitem_take2_lock_held
    #2 dict_merge
    #3 frozendict_vectorcall
SUMMARY: ThreadSanitizer: data race in frozendict_length

repr and hash additionally race the entry table while a concurrent
dictresize reallocates it:

WARNING: ThreadSanitizer: data race
  ... by main thread:
    #0 dictresize Objects/dictobject.c
    #1 insert_combined_dict
    #2 insertdict
    #3 dict_merge -> frozendict_vectorcall
  Previous read of size 8 ... by thread T1:
    #0 _PyDict_Next          (reached from frozendict_repr / frozendict_hash)
SUMMARY: ThreadSanitizer: data race in dictresize

Beyond the sanitizer report, the violation is directly observable: a reader
watching len() of the half-built object sees it grow 0 → 499 → 999 → 1899,
and gc.is_tracked() is True at every step.

The construction (write) side is already atomic — insertdict's size update
uses STORE_USED (FT_ATOMIC_STORE_SSIZE_RELAXED) and the inserts run under
Py_BEGIN_CRITICAL_SECTION(a) on the target in dict_merge. The defect is on
the read side: the custom frozendict read functions deliberately dropped both
the atomic load and the critical section.

Reproduction

import threading, gc

stop = threading.Event()
box = []

def reader():
    while not stop.is_set():
        if box:
            try:
                len(box[0])     # or repr(box[0]) / hash(box[0])
            except Exception:
                pass

class Evil:
    def keys(self):
        return [f"k{i}" for i in range(8192)]
    def __getitem__(self, k):
        if not box:                      # leak the half-built frozendict
            for o in gc.get_objects():
                if isinstance(o, frozendict) and len(o) >= 1:
                    box.append(o)
                    break
        return 1

t = threading.Thread(target=reader, daemon=True)
t.start()
for _ in range(60):
    box.clear()
    frozendict(Evil())
stop.set()
t.join(timeout=2)

Run under a free-threading + TSan build; the race fires for len, repr and
hash.

Fix

frozendict reads should follow the same free-threading conventions the rest of
dictobject.c already uses:

  1. Read scalar ma_used through GET_USED()
    (FT_ATOMIC_LOAD_SSIZE_RELAXED), exactly like dict_length and like
    set_len for frozenset.
  2. Wrap the entry-table walk in frozendict_repr and frozendict_hash in a
    per-object critical section, exactly like dict_repr already wraps
    anydict_repr_impl, and like set_repr wraps set_repr_lock_held (which is
    the tp_repr of both set and frozenset). The construction side already
    holds Py_BEGIN_CRITICAL_SECTION(a) on the target in dict_merge, so the
    reader's critical section serializes against construction and the race goes
    away. The cached ma_hash short-circuit stays lock-free (atomic load),
    matching frozenset_hash.

With this change, ThreadSanitizer is clean for len, repr and hash against
a concurrently-constructing frozendict, and the existing test_dict,
test_free_threading and test_set suites still pass on the free-threading +
TSan build.

A PR implementing the above is ready.

One related path: lock-free frozendict copy (PR gh-141510 / #145920)

copy_lock_held was deliberately made lock-free for frozendict sources in
#145920 ("A frozendict mapping is immutable so a critical section is not needed
to ensure that a copy is consistent"), giving a large speedup for the normal
(finished) case. That path reads the source's ma_used / ma_keys and walks
the entry table, so it has the same latent race during construction
(fd.copy() / dict(fd) / fd | m on a half-built source still trips TSan
after the read-side fix above). Because re-adding a critical section there would
walk back a deliberate, merged performance decision, I left that path as-is and
applied only the scalar-atomic read. The clean way to also close the copy window
without giving up #145920's lock-free copy is to make a half-built frozendict
non-observable in the first place — defer GC-tracking / publication until the
constructor returns, which would make can_modify_dict's unique-reference
assumption actually hold and keep every lock-free read correct. That is a
broader design decision (it would apply to frozenset too, which is also
GC-tracked and observably grows during construction), so I am raising it here
rather than bundling it into the read-path PR. Happy to follow whichever
direction you prefer.

Environment

  • CPython main, free-threading build
    (--disable-gil --with-thread-sanitizer), reproduced on HEAD
    9688d252d330b0b586760a121ee8c8f7215176e8 (3.16.0a0).
  • Same code is present on the 3.15 branch (the frozendict read paths are
    byte-for-byte the same logic).

Linked

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions