Skip to content

fix: schema conflict#1711

Open
nikhilsinhaparseable wants to merge 1 commit into
parseablehq:mainfrom
nikhilsinhaparseable:fix/schema-conflict
Open

fix: schema conflict#1711
nikhilsinhaparseable wants to merge 1 commit into
parseablehq:mainfrom
nikhilsinhaparseable:fix/schema-conflict

Conversation

@nikhilsinhaparseable

@nikhilsinhaparseable nikhilsinhaparseable commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

scenario: field exists in schema level with data type float64

subsequent batch: there are multiple events in the same ingestion batch,
and a field exists with different data types in the same batch

eg.

[{
    "level":false
},
{
    "level":"false"
},
{
    "level":0
}]

the ingestion fails with schema conflict error

fix: apply correct data type to each value
so "level":false becomes level_bool, "level":"false" becomes level_utf8 and "level":0 remains level

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling of conflicting JSON fields so values are only renamed when they don’t fit the existing schema.
    • Preserved compatible values under their original field names, reducing unnecessary field renaming and helping keep event data more consistent.

scenario: field exists in schema `level` with data type `float64`

subsequent batch:  there are multiple events in the same ingestion batch,
and a field exists with different data types in the same batch

eg.
```
[{
    "level":false
},
{
    "level":"false"
},
{
    "level":0
}]
```

the ingestion fails with schema conflict error

fix: apply correct data type to each value
so "level":false becomes level_bool, "level":"false" becomes level_utf8
and "level":0 remains level
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

rename_conflicting_fields_in_json gains two new parameters (existing_schema and schema_version). The remapping loop now preserves the original key/value when the value is compatible with the existing schema field's Arrow type, and renames only incompatible values. The call site and unit tests are updated accordingly.

Changes

Selective field conflict renaming

Layer / File(s) Summary
Updated function signature and compatibility logic
src/event/format/mod.rs, src/event/format/json.rs
rename_conflicting_fields_in_json now takes existing_schema: &HashMap<String, Arc<Field>> and schema_version: SchemaVersion; the remapping loop checks value_compatible_with_type and preserves compatible entries under the original key. Call site in json.rs updated to pass these arguments.
Updated unit tests
src/event/format/mod.rs
Tests now supply existing_schema with body_timestamp typed as Timestamp(ms), use explicitly non-timestamp strings to trigger renaming, and assert that compatible values remain under the original key while incompatible ones are routed to the typed sibling column.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 A field walked in, but was it the right type?
The rabbit checked the schema before raising a gripe.
Compatible values stay put, no fuss, no rename,
Only the misfits get shuffled — the schema's the game.
Hop hop, correctness restored! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning It describes the bug and fix, but misses the template's required structure, rationale, and testing/documentation checklist. Add the template sections: Fixes reference if applicable, a ### Description with rationale and key changes, and the checklist items.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title is concise and relevant to the schema-conflict fix, though it is a bit broad.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/event/format/mod.rs`:
- Around line 558-565: The mixed-type rewrite in the field-mapping logic is
still collapsing incompatible values into a single sibling column. In the
`mod.rs` branch that checks `field_mapping.get(&key)` and
`existing_schema.get(&key)`, stop reusing the batch-level `new_key` for every
incompatible value; instead derive the sibling name per record from the actual
value/type so `false` maps to the bool sibling and `"false"` maps to the utf8
sibling. Keep the compatibility fast-path in place, but change the fallback so
`rename_per_record_type_mismatches` in `rename_per_record_type_mismatches` can
operate on already-separated sibling names rather than trying to recover after a
shared `level_utf8` rewrite.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9daa3c32-9c4e-47f4-af0c-f0c9d4cfe8ef

📥 Commits

Reviewing files that changed from the base of the PR and between 3220c1b and 78e7bb1.

📒 Files selected for processing (2)
  • src/event/format/json.rs
  • src/event/format/mod.rs

Comment thread src/event/format/mod.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant