boolbv_index: handle incomplete extern array types in symbol registration#9055
Open
tautschnig wants to merge 2 commits into
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes a boolbv symbol-registration bug when flattening index expressions over incomplete `extern T arr[]` array symbols by avoiding zero-width literal-map entries, and adds a regression test reproducing the kernel `_ctype[]` case.
Changes:
- Skip `boolbv_mapt::get_literals` registration when array width is unknown during index conversion.
- Duplicate the same guard for both the byte-operator and non-byte-operator index paths.
- Add regression `regression/cbmc/incomplete_extern_array1/` to prevent the invariant failure from recurring.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/solvers/flattening/boolbv_index.cpp | Avoids registering array symbols in the literal map when width is unknown, preventing zero-width invariant failures. |
| regression/cbmc/incomplete_extern_array1/test.desc | Adds a regression harness asserting success and absence of prior failure strings. |
| regression/cbmc/incomplete_extern_array1/main.c | Reproduces indexing into an incomplete extern array at multiple indices. |
2e277ce to
232ff03
Compare
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff:
| | develop | #9055 | +/- |
|---|---|---|---|
| Coverage | 80.68% | 80.69% | +0.01% |
| Files | 1714 | 1714 | |
| Lines | 189519 | 189526 | +7 |
| Branches | 73 | 73 | |
| Hits | 152916 | 152946 | +30 |
| Misses | 36603 | 36580 | -23 |
232ff03 to
dd4fe95
Compare
…tion

When boolbv flattens an array index expression where the array is a symbol of unbounded array_typet (e.g. an incomplete declaration like 'extern T arr[]'), it registered that symbol with boolbv_mapt::get_literals using a width of array_width_opt.value_or(0). Passing 0 created a zero-width entry that tripped the size-equals-width invariant in get_literals when the same symbol was, or had been, registered at a non-zero width via its element-typed access path (e.g. T arr[i] returns a T-width value).

Skip the registration only when the symbol is already registered at a different width. The width-0 registration itself must be preserved: it is what lets unbounded-array counterexample traces and string-refinement arrays display their element values (e.g. regression/cbmc/trace-values/unbounded_array); only the conflicting re-registration that trips the invariant is dropped. The shared logic is factored into boolbvt::register_array_symbol so the two index branches cannot drift apart.

This was first surfaced by integration/linux scans of kernel sources that include <linux/ctype.h>, which declares 'extern const unsigned char _ctype[]', an incomplete extern array referenced by the is*() classifier macros that expand to '_ctype[c]'. CBMC aborted at boolbv_map.cpp:68 on every such TU.

Regression test: regression/cbmc/incomplete_extern_array1/ indexes such an array at two distinct indices (forcing the symbol to be reached twice) under --bounds-check, and asserts that re-reading the same index yields the same value, so the test guards against unsound flattening as well as the invariant abort.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
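The guard described in this commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the code in `boolbv_index.cpp`: `toy_literal_mapt` and `toy_register_array_symbol` are hypothetical names, and the map stores plain `int` placeholders where the real literal map stores solver literals. It shows the two behaviours the message asks for: a first registration at width 0 is kept, and a re-registration at a different width is skipped instead of tripping the size-equals-width check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a literal map: each identifier owns a vector of
// placeholder "literals" whose length is the width it was first registered at.
class toy_literal_mapt
{
public:
  // Models the size-equals-width invariant: asking for an identifier at a
  // width other than the stored one is a programming error.
  const std::vector<int> &
  get_literals(const std::string &id, std::size_t width)
  {
    auto it = entries.find(id);
    if(it == entries.end())
      it = entries.emplace(id, std::vector<int>(width, 0)).first;
    assert(it->second.size() == width);
    return it->second;
  }

  // Returns true and stores the registered width if the identifier is known.
  bool lookup_width(const std::string &id, std::size_t &width) const
  {
    const auto it = entries.find(id);
    if(it == entries.end())
      return false;
    width = it->second.size();
    return true;
  }

private:
  std::unordered_map<std::string, std::vector<int>> entries;
};

// Hypothetical guard: register the symbol unless it is already present at a
// different width. Returns whether the map was consulted for registration.
bool toy_register_array_symbol(
  toy_literal_mapt &map,
  const std::string &id,
  std::size_t width)
{
  std::size_t existing = 0;
  if(map.lookup_width(id, existing) && existing != width)
    return false; // conflicting re-registration: skip, do not abort
  map.get_literals(id, width);
  return true;
}
```

In this model, registering `_ctype` at width 8 and then at width 0 leaves the width-8 entry untouched, while a symbol that is only ever seen at width 0 still gets its zero-width entry.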
The incremental SMT2 back-end keyed its set of already-declared functions on the full expression (irept). An incomplete `extern T arr[]` symbol can be reached via two expressions that share the same SSA identifier but differ only in their (sort-equivalent) array-type irep -- e.g. _ctype[c] and _ctype[c + 1] from <linux/ctype.h>. Those are distinct expression keys, so both reached send_function_definition and emitted a second (declare-fun |_ctype#1| () (Array ...)), which z3 rejects with "constant '_ctype#1' (with the given signature) already declared".

Deduplicate by SSA identifier in send_function_definition: if the symbol is already in identifier_table, map the later expression to the existing declaration instead of re-declaring it. A defensive invariant checks that the later expression's type is sort-equivalent to the existing one, so a violated assumption fails loudly here rather than as an opaque downstream solver error.

With this, regression/cbmc/incomplete_extern_array1 also passes under the incremental SMT2 back-end (the cbmc-new-smt-backend CTest profile), so its no-new-smt tag is removed.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
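The deduplication idea in this second commit message can be modelled in a few lines. Again this is a sketch, not the back-end's implementation: `toy_exprt`, `toy_smt_sessiont` and `declare_once` are hypothetical, the sort is represented as a plain string, and the 64-bit index width in the usage note is an assumption made only for illustration. The point is that the table is keyed by identifier, so two expressions that differ only in a type annotation produce a single `declare-fun`, and a mismatching sort fails an assertion locally.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified expression: an SSA identifier, the SMT sort it
// maps to, and a note standing in for the parts of the type irep that may
// differ between sort-equivalent expressions.
struct toy_exprt
{
  std::string identifier;
  std::string sort;
  std::string type_note;
};

class toy_smt_sessiont
{
public:
  // Commands that would have been sent to the solver, in order.
  std::vector<std::string> sent_commands;

  // Declare the symbol once per identifier. A later expression with the same
  // identifier reuses the earlier declaration; its sort must match.
  void declare_once(const toy_exprt &expr)
  {
    const auto it = declared_sorts.find(expr.identifier);
    if(it != declared_sorts.end())
    {
      // Defensive check: fail here rather than inside the solver.
      assert(it->second == expr.sort);
      return;
    }
    declared_sorts.emplace(expr.identifier, expr.sort);
    sent_commands.push_back(
      "(declare-fun |" + expr.identifier + "| () " + expr.sort + ")");
  }

private:
  // Keyed by identifier only, never by the full expression.
  std::unordered_map<std::string, std::string> declared_sorts;
};
```

Calling `declare_once` twice for `_ctype#1` with the same sort, for example `(Array (_ BitVec 64) (_ BitVec 8))`, but different `type_note` values leaves exactly one entry in `sent_commands`.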
dd4fe95 to
bfa0251
Compare
When boolbv flattens an array index expression where the array is a symbol of unbounded `array_typet` (e.g. an incomplete declaration like `extern T arr[]`), it registered that symbol with `boolbv_mapt::get_literals` using a width of `array_width_opt.value_or(0)`. Passing 0 created a zero-width entry that tripped the size-equals-width invariant in `get_literals` when the same symbol was, or had been, registered at a non-zero width via its element-typed access path (e.g. `T arr[i]` returns a T-width value).

This was first surfaced by `integration/linux` scans of kernel sources that include `<linux/ctype.h>`, which declares `extern const unsigned char _ctype[]`, an incomplete extern array referenced by the `is*()` classifier macros that expand to `_ctype[c]`. CBMC aborted at `boolbv_map.cpp:68` on every such TU.

The fix spans both back-ends, hence two commits:
- `boolbv_index`: Skip the registration only when the symbol is already registered at a different width. The width-0 registration itself must be preserved: it is what lets unbounded-array counterexample traces and string-refinement arrays display their element values (e.g. `regression/cbmc/trace-values/unbounded_array`); only the conflicting re-registration that trips the invariant is dropped. The shared guard is factored into `boolbvt::register_array_symbol` so the two index branches cannot drift apart.
- `smt2_incremental`: The incremental SMT2 back-end needs the same symbol handled once. It keyed its set of already-declared functions on the full expression irep, so the same SSA symbol reached via two sort-equivalent array-type ireps (e.g. `_ctype[c]` and `_ctype[c + 1]`) was emitted as two `declare-fun`s, which z3 rejects (`constant '_ctype#1' ... already declared`). `send_function_definition` now deduplicates by SSA identifier, with a defensive invariant that the later expression's type is sort-equivalent to the existing declaration.

Regression test: `regression/cbmc/incomplete_extern_array1/` indexes such an array at two distinct indices (forcing the symbol to be reached twice) under `--bounds-check`, and asserts that re-reading the same index yields the same value, so it guards against unsound flattening as well as the original invariant abort. As a `CORE` test it is additionally run by the `cbmc-new-smt-backend` CTest profile (`--incremental-smt2-solver`) when z3 is on `PATH`, which exercises the second commit.