Data race on frozendict reads during construction in the free-threading build
Summary
In the free-threading build (--disable-gil), several frozendict (PEP 814)
read paths access the object's internal state without synchronization, on the
assumption that a frozendict is immutable and therefore safe to read
lock-free. That assumption holds once the object is fully built, but not while
it is being constructed from a user-defined mapping: the half-built
frozendict is already observable from other threads, and its scalar fields and
entry table are being mutated concurrently. ThreadSanitizer reports a genuine
data race, and a reader can observe the object's length and contents change
underneath it, which violates the immutability the type is supposed to
guarantee.
Affected read paths: len(fd) (frozendict_length), repr(fd)
(frozendict_repr), hash(fd) (frozendict_hash). All three read raw
ma_used and/or walk the entry table via _PyDict_Next with no atomic load and
no critical section, while the corresponding mutable-dict paths
(dict_length, dict_repr) use GET_USED() / Py_BEGIN_CRITICAL_SECTION.
How a half-built frozendict becomes observable
frozendict(mapping) for a non-dict mapping takes the generic path
frozendict_new → dict_update_common → dict_merge, which calls back into
user Python: PyMapping_Keys(b), then PyObject_GetItem(b, key) per key
(Objects/dictobject.c, the generic branch of dict_merge). dict_new already
calls _PyObject_GC_TRACK(d) before the first item is inserted, so from the
very first callback the half-built frozendict is discoverable via
gc.get_objects(). A __getitem__ callback can take a second reference to it
and hand a live alias to another thread, which then reads the same object while
dict_merge keeps inserting and resizing it.
can_modify_dict() treats a frozendict as mutable only while it is uniquely
referenced (PyUnstable_Object_IsUniquelyReferenced(mp)), i.e. it assumes a
half-built frozendict is private. Once user code has run during construction,
that assumption is false.
ThreadSanitizer evidence (main HEAD 9688d252d330b0b586760a121ee8c8f7215176e8)
Build: ./configure --disable-gil --with-thread-sanitizer && make.
len read racing the construction-time size store:
WARNING: ThreadSanitizer: data race
Read of size 8 ... by thread T1:
#0 frozendict_length Objects/dictobject.c
#1 PyObject_Size / builtin_len
Previous atomic write of size 8 ... by main thread:
#0 insertdict Objects/dictobject.c (STORE_USED, already atomic)
#1 setitem_take2_lock_held
#2 dict_merge
#3 frozendict_vectorcall
SUMMARY: ThreadSanitizer: data race in frozendict_length
repr and hash additionally race the entry table while a concurrent
dictresize reallocates it:
WARNING: ThreadSanitizer: data race
... by main thread:
#0 dictresize Objects/dictobject.c
#1 insert_combined_dict
#2 insertdict
#3 dict_merge -> frozendict_vectorcall
Previous read of size 8 ... by thread T1:
#0 _PyDict_Next (reached from frozendict_repr / frozendict_hash)
SUMMARY: ThreadSanitizer: data race in dictresize
Beyond the sanitizer report, the violation is directly observable: a reader
watching len() of the half-built object sees it grow 0 → 499 → 999 → 1899,
and gc.is_tracked() is True at every step.
The construction (write) side is already atomic — insertdict's size update
uses STORE_USED (FT_ATOMIC_STORE_SSIZE_RELAXED) and the inserts run under
Py_BEGIN_CRITICAL_SECTION(a) on the target in dict_merge. The defect is on
the read side: the custom frozendict read functions deliberately dropped both
the atomic load and the critical section.
Reproduction
import threading, gc
stop = threading.Event()
box = []
def reader():
while not stop.is_set():
if box:
try:
len(box[0]) # or repr(box[0]) / hash(box[0])
except Exception:
pass
class Evil:
def keys(self):
return [f"k{i}" for i in range(8192)]
def __getitem__(self, k):
if not box: # leak the half-built frozendict
for o in gc.get_objects():
if isinstance(o, frozendict) and len(o) >= 1:
box.append(o)
break
return 1
t = threading.Thread(target=reader, daemon=True)
t.start()
for _ in range(60):
box.clear()
frozendict(Evil())
stop.set()
t.join(timeout=2)
Run under a free-threading + TSan build; the race fires for len, repr and
hash.
Fix
frozendict reads should follow the same free-threading conventions the rest of
dictobject.c already uses:
- Read scalar
ma_used through GET_USED()
(FT_ATOMIC_LOAD_SSIZE_RELAXED), exactly like dict_length and like
set_len for frozenset.
- Wrap the entry-table walk in
frozendict_repr and frozendict_hash in a
per-object critical section, exactly like dict_repr already wraps
anydict_repr_impl, and like set_repr wraps set_repr_lock_held (which is
the tp_repr of both set and frozenset). The construction side already
holds Py_BEGIN_CRITICAL_SECTION(a) on the target in dict_merge, so the
reader's critical section serializes against construction and the race goes
away. The cached ma_hash short-circuit stays lock-free (atomic load),
matching frozenset_hash.
With this change, ThreadSanitizer is clean for len, repr and hash against
a concurrently-constructing frozendict, and the existing test_dict,
test_free_threading and test_set suites still pass on the free-threading +
TSan build.
A PR implementing the above is ready.
One related path: lock-free frozendict copy (PR gh-141510 / #145920)
copy_lock_held was deliberately made lock-free for frozendict sources in
#145920 ("A frozendict mapping is immutable so a critical section is not needed
to ensure that a copy is consistent"), giving a large speedup for the normal
(finished) case. That path reads the source's ma_used / ma_keys and walks
the entry table, so it has the same latent race during construction
(fd.copy() / dict(fd) / fd | m on a half-built source still trips TSan
after the read-side fix above). Because re-adding a critical section there would
walk back a deliberate, merged performance decision, I left that path as-is and
applied only the scalar-atomic read. The clean way to also close the copy window
without giving up #145920's lock-free copy is to make a half-built frozendict
non-observable in the first place — defer GC-tracking / publication until the
constructor returns, which would make can_modify_dict's unique-reference
assumption actually hold and keep every lock-free read correct. That is a
broader design decision (it would apply to frozenset too, which is also
GC-tracked and observably grows during construction), so I am raising it here
rather than bundling it into the read-path PR. Happy to follow whichever
direction you prefer.
Environment
- CPython
main, free-threading build
(--disable-gil --with-thread-sanitizer), reproduced on HEAD
9688d252d330b0b586760a121ee8c8f7215176e8 (3.16.0a0).
- Same code is present on the 3.15 branch (the
frozendict read paths are
byte-for-byte the same logic).
Linked
Linked PRs
Data race on
frozendictreads during construction in the free-threading buildSummary
In the free-threading build (
--disable-gil), severalfrozendict(PEP 814)read paths access the object's internal state without synchronization, on the
assumption that a
frozendictis immutable and therefore safe to readlock-free. That assumption holds once the object is fully built, but not while
it is being constructed from a user-defined mapping: the half-built
frozendictis already observable from other threads, and its scalar fields andentry table are being mutated concurrently. ThreadSanitizer reports a genuine
data race, and a reader can observe the object's length and contents change
underneath it, which violates the immutability the type is supposed to
guarantee.
Affected read paths:
len(fd)(frozendict_length),repr(fd)(
frozendict_repr),hash(fd)(frozendict_hash). All three read rawma_usedand/or walk the entry table via_PyDict_Nextwith no atomic load andno critical section, while the corresponding mutable-
dictpaths(
dict_length,dict_repr) useGET_USED()/Py_BEGIN_CRITICAL_SECTION.How a half-built
frozendictbecomes observablefrozendict(mapping)for a non-dictmapping takes the generic pathfrozendict_new→dict_update_common→dict_merge, which calls back intouser Python:
PyMapping_Keys(b), thenPyObject_GetItem(b, key)per key(
Objects/dictobject.c, the generic branch ofdict_merge).dict_newalreadycalls
_PyObject_GC_TRACK(d)before the first item is inserted, so from thevery first callback the half-built
frozendictis discoverable viagc.get_objects(). A__getitem__callback can take a second reference to itand hand a live alias to another thread, which then reads the same object while
dict_mergekeeps inserting and resizing it.can_modify_dict()treats afrozendictas mutable only while it is uniquelyreferenced (
PyUnstable_Object_IsUniquelyReferenced(mp)), i.e. it assumes ahalf-built
frozendictis private. Once user code has run during construction,that assumption is false.
ThreadSanitizer evidence (main HEAD
9688d252d330b0b586760a121ee8c8f7215176e8)Build:
./configure --disable-gil --with-thread-sanitizer && make.lenread racing the construction-time size store:reprandhashadditionally race the entry table while a concurrentdictresizereallocates it:Beyond the sanitizer report, the violation is directly observable: a reader
watching
len()of the half-built object sees it grow0 → 499 → 999 → 1899,and
gc.is_tracked()isTrueat every step.The construction (write) side is already atomic —
insertdict's size updateuses
STORE_USED(FT_ATOMIC_STORE_SSIZE_RELAXED) and the inserts run underPy_BEGIN_CRITICAL_SECTION(a)on the target indict_merge. The defect is onthe read side: the custom
frozendictread functions deliberately dropped boththe atomic load and the critical section.
Reproduction
Run under a free-threading + TSan build; the race fires for
len,reprandhash.Fix
frozendictreads should follow the same free-threading conventions the rest ofdictobject.calready uses:ma_usedthroughGET_USED()(
FT_ATOMIC_LOAD_SSIZE_RELAXED), exactly likedict_lengthand likeset_lenforfrozenset.frozendict_reprandfrozendict_hashin aper-object critical section, exactly like
dict_repralready wrapsanydict_repr_impl, and likeset_reprwrapsset_repr_lock_held(which isthe
tp_reprof bothsetandfrozenset). The construction side alreadyholds
Py_BEGIN_CRITICAL_SECTION(a)on the target indict_merge, so thereader's critical section serializes against construction and the race goes
away. The cached
ma_hashshort-circuit stays lock-free (atomic load),matching
frozenset_hash.With this change, ThreadSanitizer is clean for
len,reprandhashagainsta concurrently-constructing
frozendict, and the existingtest_dict,test_free_threadingandtest_setsuites still pass on the free-threading +TSan build.
A PR implementing the above is ready.
One related path: lock-free
frozendictcopy (PR gh-141510 / #145920)copy_lock_heldwas deliberately made lock-free forfrozendictsources in#145920 ("A frozendict mapping is immutable so a critical section is not needed
to ensure that a copy is consistent"), giving a large speedup for the normal
(finished) case. That path reads the source's
ma_used/ma_keysand walksthe entry table, so it has the same latent race during construction
(
fd.copy()/dict(fd)/fd | mon a half-built source still trips TSanafter the read-side fix above). Because re-adding a critical section there would
walk back a deliberate, merged performance decision, I left that path as-is and
applied only the scalar-atomic read. The clean way to also close the copy window
without giving up #145920's lock-free copy is to make a half-built
frozendictnon-observable in the first place — defer GC-tracking / publication until the
constructor returns, which would make
can_modify_dict's unique-referenceassumption actually hold and keep every lock-free read correct. That is a
broader design decision (it would apply to
frozensettoo, which is alsoGC-tracked and observably grows during construction), so I am raising it here
rather than bundling it into the read-path PR. Happy to follow whichever
direction you prefer.
Environment
main, free-threading build(
--disable-gil --with-thread-sanitizer), reproduced on HEAD9688d252d330b0b586760a121ee8c8f7215176e8(3.16.0a0).frozendictread paths arebyte-for-byte the same logic).
Linked
frozendict).frozendict_lengthoffdict_lengthand replaced the atomicread with a plain
ma_usedload.frozendictcopy path lock-free (copy_lock_held).Linked PRs