fix: prevent Hebrew locale from corrupting subsequent locale formatting #1287

Open
C1-BA-B1-F3 wants to merge 1 commit into python-babel:master from C1-BA-B1-F3:fix/hebrew-locale-corruption

Summary

Fixes #1234

After using the Hebrew locale ('he'), subsequent calls to format_date() with other locales (like 'no', 'fr', 'es') returned Hebrew-formatted text instead of text in the requested locale.

Root Cause

LocaleDataDict.__getitem__ was mutating the cached locale data when resolving aliases. The Hebrew locale's months.stand-alone dict was the same object as root's months.stand-alone dict (due to shallow copying in merge()). When the alias at months.stand-alone.wide was resolved for Hebrew, the resolved Hebrew month names were written back into the shared dict, corrupting the root locale data that all other locales inherit from.
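The aliasing described above can be reproduced without Babel. The following is a minimal sketch, not Babel's actual code: `shallow_merge`, the `root` and `hebrew` dicts, and the placeholder alias string are hypothetical stand-ins chosen only to show how a top-level copy leaves nested dicts shared.

```python
import copy

# Hypothetical stand-in for root locale data; not Babel's real data layout.
root = {"months": {"stand-alone": {"wide": "ALIAS"}}}


def shallow_merge(base, overrides):
    """Copy only the top level; nested dicts stay shared with `base`."""
    merged = dict(base)
    merged.update(overrides)
    return merged


hebrew = shallow_merge(root, {"language": "he"})

# The nested dict is the very same object in both locales.
shared = hebrew["months"]["stand-alone"] is root["months"]["stand-alone"]

# Writing a resolved alias back through the Hebrew view also changes root.
hebrew["months"]["stand-alone"]["wide"] = {10: "אוקטובר"}
root_corrupted = root["months"]["stand-alone"]["wide"] == {10: "אוקטובר"}

# With a deep copy the two locales would stay independent.
safe_root = {"months": {"stand-alone": {"wide": "ALIAS"}}}
safe_hebrew = copy.deepcopy(safe_root)
safe_hebrew["months"]["stand-alone"]["wide"] = {10: "אוקטובר"}
root_intact = safe_root["months"]["stand-alone"]["wide"] == "ALIAS"
```

In this toy model both `shared` and `root_corrupted` end up true, which mirrors the symptom reported in the issue: every locale that falls back to root sees the Hebrew names.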

Reproduction

from babel.dates import format_date
from datetime import datetime

date_obj = datetime(2025, 10, 15)

print(format_date(date_obj, 'LLLL', 'de'))  # Oktober
print(format_date(date_obj, 'LLLL', 'he'))  # אוקטובר
print(format_date(date_obj, 'LLLL', 'no'))  # אוקטובר (BUG: should be 'oktober')
print(format_date(date_obj, 'LLLL', 'fr'))  # אוקטובר (BUG: should be 'octobre')

Fix

Remove the write-back in LocaleDataDict.__getitem__. The resolved values are no longer stored back into the original data dict, preventing mutation of shared/cached locale data.

# Before (buggy):
if val is not orig:
    self._data[key] = val  # Mutates shared cache!
return val

# After (fixed):
return val  # No mutation
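To make the intent of the change concrete, here is a self-contained sketch of a mapping that resolves aliases on read and never writes the result back. It is an illustration under assumptions, not the real LocaleDataDict: the `Alias` and `ReadOnlyResolvingDict` classes below are hypothetical and far simpler than Babel's implementation.

```python
class Alias:
    """Hypothetical alias marker pointing at another key path."""

    def __init__(self, keys):
        self.keys = tuple(keys)

    def resolve(self, base):
        # Walk the key path starting from the base mapping.
        data = base
        for key in self.keys:
            data = data[key]
        return data


class ReadOnlyResolvingDict:
    """Toy mapping that resolves aliases on read and never writes back."""

    def __init__(self, data, base=None):
        self._data = data
        self.base = data if base is None else base

    def __getitem__(self, key):
        val = self._data[key]
        if isinstance(val, Alias):
            val = val.resolve(self.base)
        if isinstance(val, dict):
            # Wrap nested dicts so aliases further down are resolved too.
            val = ReadOnlyResolvingDict(val, base=self.base)
        return val  # self._data is left untouched


raw = {
    "format": {"wide": {10: "October"}},
    "stand-alone": {"wide": Alias(["format", "wide"])},
}
view = ReadOnlyResolvingDict(raw)
resolved = view["stand-alone"]["wide"][10]
```

After the lookup, `raw["stand-alone"]["wide"]` is still the `Alias` object. The trade-off in this sketch is that the alias is resolved again on every access, in exchange for the underlying data never being modified.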

Tests Added

  • test_locale_data_isolation_hebrew: Verifies month names are not corrupted after Hebrew usage
  • test_locale_data_isolation_format_date: Verifies format_date output is correct after Hebrew
  • test_locale_data_cache_not_mutated: Verifies root data integrity after alias resolution

All 7325 existing tests pass with this change.
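The idea behind a cache-integrity test such as test_locale_data_cache_not_mutated can be sketched generically: snapshot the shared data, perform the read, and compare. The helper and the two reader functions below are hypothetical and only illustrate the pattern; they are not the tests added in this PR.

```python
import copy


def assert_not_mutated(shared, action):
    """Run `action` and raise AssertionError if `shared` changed."""
    snapshot = copy.deepcopy(shared)
    result = action()
    assert shared == snapshot, "shared data was mutated"
    return result


def buggy_read(data):
    # Resolves the placeholder and writes it back (the old behaviour).
    data["wide"] = "resolved"
    return data["wide"]


def fixed_read(data):
    # Resolves the placeholder without touching the underlying data.
    return "resolved" if data["wide"] == "ALIAS" else data["wide"]


clean = {"wide": "ALIAS"}
value = assert_not_mutated(clean, lambda: fixed_read(clean))

dirty = {"wide": "ALIAS"}
try:
    assert_not_mutated(dirty, lambda: buggy_read(dirty))
    buggy_detected = False
except AssertionError:
    buggy_detected = True
```

Here the fixed reader passes the check while the write-back variant is caught, which is the property the regression tests are meant to guard.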

Fixes python-babel#1234

The bug: After using Hebrew locale ('he'), subsequent calls to format_date()
with other locales (like 'no', 'fr', 'es') returned Hebrew-formatted text
instead of the requested locale.

Root cause: LocaleDataDict.__getitem__ was mutating the cached locale data
when resolving aliases. Hebrew locale's months.stand-alone dict was the
same object as root's months.stand-alone dict (due to shallow copying in
merge()). When the alias at months.stand-alone.wide was resolved for Hebrew,
the resolved Hebrew month names were written back into the shared dict,
corrupting the root locale data that all other locales inherit from.

Fix: Remove the write-back in LocaleDataDict.__getitem__. The resolved
values are no longer stored back into the original data dict, preventing
mutation of shared/cached locale data.

Added regression tests:
- test_locale_data_isolation_hebrew: Verifies month names are not corrupted
- test_locale_data_isolation_format_date: Verifies format_date output
- test_locale_data_cache_not_mutated: Verifies root data integrity

Development

Successfully merging this pull request may close these issues.

Hebrew Locale Corrupts Subsequent Locale Formatting
