fix: schema conflict #1711
Scenario: the field `level` already exists in the schema with data type `float64`.
Subsequent batch: a single ingestion batch contains multiple events,
and the same field appears with different data types across those events,
e.g.
```
[
  { "level": false },
  { "level": "false" },
  { "level": 0 }
]
```
Ingestion of this batch fails with a schema conflict error.
Fix: apply the correct data type to each value individually,
so `"level": false` is written to `level_bool`, `"level": "false"` is written to `level_utf8`,
and `"level": 0` stays in `level`.
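The rule described above can be sketched as a small pure function. This is only an illustration, not the code in this PR: `ColumnType`, `JsonValue`, `value_type`, `type_suffix` and `target_column` are hypothetical names, and the sketch assumes every JSON number is inferred as `float64`.

```rust
/// Hypothetical stand-in for the data type already stored in the schema.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Float64,
    Boolean,
    Utf8,
}

/// Hypothetical stand-in for one incoming JSON value.
#[derive(Debug, Clone, PartialEq)]
enum JsonValue {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Type a single value would be inferred as.
/// Assumption: all JSON numbers are treated as float64.
fn value_type(value: &JsonValue) -> ColumnType {
    match value {
        JsonValue::Bool(_) => ColumnType::Boolean,
        JsonValue::Number(_) => ColumnType::Float64,
        JsonValue::Text(_) => ColumnType::Utf8,
    }
}

/// Suffix appended to the field name for a sibling column.
fn type_suffix(ty: ColumnType) -> &'static str {
    match ty {
        ColumnType::Float64 => "float64",
        ColumnType::Boolean => "bool",
        ColumnType::Utf8 => "utf8",
    }
}

/// Column that one value should be written to: the original column when the
/// value matches the stored type, otherwise a sibling named after the
/// value's own type.
fn target_column(key: &str, stored: ColumnType, value: &JsonValue) -> String {
    let actual = value_type(value);
    if actual == stored {
        key.to_string()
    } else {
        format!("{}_{}", key, type_suffix(actual))
    }
}
```

With `level` stored as `float64`, the three values from the example batch resolve to `level_bool`, `level_utf8` and `level` respectively.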
Walkthrough
Changes: Selective field conflict renaming
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/event/format/mod.rs`:
- Around line 558-565: The mixed-type rewrite in the field-mapping logic is
still collapsing incompatible values into a single sibling column. In the
`mod.rs` branch that checks `field_mapping.get(&key)` and
`existing_schema.get(&key)`, stop reusing the batch-level `new_key` for every
incompatible value; instead derive the sibling name per record from the actual
value/type so `false` maps to the bool sibling and `"false"` maps to the utf8
sibling. Keep the compatibility fast-path in place, but change the fallback so
`rename_per_record_type_mismatches` can operate on already-separated sibling
names rather than trying to recover after a shared `level_utf8` rewrite.
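
A minimal sketch of the per-record behaviour this comment asks for, under stated assumptions: `Record`, `JsonValue`, `suffix_for` and `rename_conflicts_per_record` are hypothetical names and do not mirror the actual code in `mod.rs` or `json.rs`. The point is that the sibling name is computed inside the loop from each record's own value, so no batch-level `new_key` is shared between records.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one incoming JSON value.
#[derive(Debug, Clone, PartialEq)]
enum JsonValue {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// One event in the ingestion batch, keyed by field name.
type Record = BTreeMap<String, JsonValue>;

/// Suffix for the type a single value would be inferred as.
/// Assumption: all JSON numbers are treated as float64.
fn suffix_for(value: &JsonValue) -> &'static str {
    match value {
        JsonValue::Bool(_) => "bool",
        JsonValue::Number(_) => "float64",
        JsonValue::Text(_) => "utf8",
    }
}

/// Rename `key` record by record. `stored_suffix` names the type already in
/// the schema; values of that type keep the original key (the fast path).
fn rename_conflicts_per_record(batch: &mut [Record], key: &str, stored_suffix: &str) {
    for record in batch.iter_mut() {
        // Derive the suffix from this record's own value, not from the batch.
        let suffix = match record.get(key) {
            Some(value) => suffix_for(value),
            None => continue, // field absent in this record
        };
        if suffix == stored_suffix {
            continue; // compatible value: leave it in the original column
        }
        if let Some(value) = record.remove(key) {
            record.insert(format!("{}_{}", key, suffix), value);
        }
    }
}
```

Applied to the example batch with `stored_suffix` set to `"float64"`, the first record ends up with `level_bool`, the second with `level_utf8`, and the third keeps `level`.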
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9daa3c32-9c4e-47f4-af0c-f0c9d4cfe8ef
📒 Files selected for processing (2)
- src/event/format/json.rs
- src/event/format/mod.rs
Summary by CodeRabbit