fix(python): preserve original JSON keys in Data shim round-trips #1900

Open
syf2211 wants to merge 1 commit into github:main from syf2211:fix/python-data-key-roundtrip-1138

syf2211 commented Jul 3, 2026

Summary

Fixes Data.from_dict().to_dict() corrupting JSON keys that contain common abbreviations (e.g. userURL, sessionID, OAuthToken).

Motivation

The Data compatibility shim converts inbound JSON keys to snake_case for attribute access, then uses the heuristic _compat_to_json_key to rebuild the JSON keys when serializing back. That converter cannot reconstruct abbreviation-heavy camelCase, so such keys are mutated on round-trip. This breaks consumers that log, cache, or echo unknown or custom event payloads.
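To illustrate why a heuristic converter pair cannot be inverse for these keys, here is a minimal sketch. The functions `to_python_key` and `to_json_key` are hypothetical stand-ins written for this example; they are not the actual `_compat_to_python_key` / `_compat_to_json_key` implementations, which may differ in detail.

```python
import re


def to_python_key(json_key: str) -> str:
    """Hypothetical stand-in: camelCase JSON key -> snake_case attribute name."""
    # Insert "_" before a capitalized word such as "Token" in "OAuthToken".
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", json_key)
    # Insert "_" between a lowercase letter or digit and an uppercase letter.
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()


def to_json_key(python_key: str) -> str:
    """Hypothetical stand-in: snake_case attribute name -> camelCase JSON key."""
    head, *rest = python_key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def round_trip(json_key: str) -> str:
    # Lowercasing discards the original casing, so the reverse
    # conversion can only guess at it.
    return to_json_key(to_python_key(json_key))


for key in ("userName", "userURL", "sessionID", "OAuthToken"):
    print(f"{key} -> {to_python_key(key)} -> {round_trip(key)}")
```

With these stand-ins, `userName` survives the round trip, while `userURL`, `sessionID`, and `OAuthToken` come back with different casing, because the snake_case form no longer records which letters were uppercase.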

Fixes #1138

Changes

  • Store the original JSON key per field in Data._json_keys during from_dict()
  • Prefer the stored JSON key in to_dict(); fall back to _compat_to_json_key for manually constructed Data(**kwargs)
  • Update the Python codegen template in scripts/codegen/python.ts and regenerate session_events.py
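The approach in the list above can be sketched as follows. This is a simplified illustration, not the generated code in session_events.py: the class body and the two converter functions are assumptions made for the example, and only the names `Data`, `_json_keys`, `from_dict`, and `to_dict` come from this PR.

```python
import re
from typing import Any, Dict


def to_python_key(json_key: str) -> str:
    # Hypothetical stand-in for the camelCase -> snake_case converter.
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", json_key)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s).lower()


def to_json_key(python_key: str) -> str:
    # Hypothetical stand-in for the heuristic snake_case -> camelCase converter.
    head, *rest = python_key.split("_")
    return head + "".join(part.capitalize() for part in rest)


class Data:
    """Simplified sketch of a shim that remembers original JSON keys."""

    def __init__(self, **kwargs: Any) -> None:
        # Maps attribute name -> original JSON key; empty for manual construction.
        self._json_keys: Dict[str, str] = {}
        for name, value in kwargs.items():
            setattr(self, name, value)

    @classmethod
    def from_dict(cls, payload: Dict[str, Any]) -> "Data":
        obj = cls()
        for json_key, value in payload.items():
            name = to_python_key(json_key)
            setattr(obj, name, value)
            obj._json_keys[name] = json_key  # remember the exact inbound key
        return obj

    def to_dict(self) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        for name, value in vars(self).items():
            if name.startswith("_"):
                continue  # skip internal bookkeeping such as _json_keys
            # Prefer the stored key; fall back to the heuristic converter.
            out[self._json_keys.get(name, to_json_key(name))] = value
        return out


payload = {"userURL": "https://example.com", "sessionID": 7, "OAuthToken": "t"}
data = Data.from_dict(payload)
print(data.user_url)               # attribute access still uses snake_case
print(data.to_dict() == payload)   # original keys are preserved
print(Data(user_url="x").to_dict())  # manual construction uses the fallback
```

Keeping the mapping per instance means the fallback path for `Data(**kwargs)` is unchanged, so existing callers that construct the shim by hand see the same output as before.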

Tests

  • uv run pytest test_event_forward_compatibility.py -v — 11 passed
  • uv run ruff check test_event_forward_compatibility.py — passed
  • Added test_data_shim_preserves_abbreviation_json_keys_on_round_trip covering the keys cited in the issue

Notes

  • Unknown event types parsed via SessionEvent.from_dict already use RawSessionEventData, which keeps the raw payload verbatim. This fix targets the Data shim used for backward-compatible and manually constructed payloads.
  • Pre-existing key-collision edge case (userURL vs userUrl both mapping to user_url) is unchanged.
  • Reviewed with composer-2.5 subagent: APPROVE
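The key-collision edge case mentioned in the notes can be demonstrated with a small sketch. The converter `to_python_key` is a hypothetical stand-in for the shim's camelCase-to-snake_case step, and `find_collisions` is a helper invented for this example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def to_python_key(json_key: str) -> str:
    # Hypothetical stand-in for the camelCase -> snake_case converter.
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", json_key)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s).lower()


def find_collisions(json_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group JSON keys that map to the same snake_case attribute name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in json_keys:
        groups[to_python_key(key)].append(key)
    # Keep only attribute names claimed by more than one JSON key.
    return {name: keys for name, keys in groups.items() if len(keys) > 1}


print(find_collisions(["userURL", "userUrl", "sessionID"]))
```

Because both `userURL` and `userUrl` become `user_url`, only one value can live on that attribute, and a per-attribute key map can remember only one original spelling.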

Fixes github#1138

The Data compatibility shim converted JSON keys to snake_case for attribute
access but could not reconstruct abbreviation-heavy camelCase keys (userURL,
sessionID, OAuthToken) on to_dict(). Store the original JSON key per field
during from_dict() and prefer it when serializing back.

Includes regression tests for the abbreviation key cases described in the
issue.
syf2211 requested a review from a team as a code owner July 3, 2026 12:09

Development

Successfully merging this pull request may close these issues.

Python codegen: _compat_to_python_key / _compat_to_json_key are not inverses for keys with common abbreviations (URL, ID, IP, XML, OAuth)