feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181) by zied-jlassi · Pull Request #182 · NVIDIA/SkillSpector

zied-jlassi · 2026-06-23T21:53:57Z

Implements the proposal in #181: a single canonicalization chokepoint that collapses every spelling of a dangerous callable to one canonical sink id, paired with a shared evasion corpus wired as a fitness gate.

I've opened this as a concrete PR (rather than only waiting on a yes/no in #181) so the approach can be judged on real, tested code — happy to adjust the direction or close if you'd prefer.

⚠️ Depends on / stacked on #180 — please read first

This branch is stacked on #180 (fix: detect builtins.* and importlib.import_module sink evasions): the canonical resolver reuses resolve_dynamic_import_call / _strip_builtins_prefix introduced there. Because main does not yet contain #180, this PR's diff currently includes #180's commit 8bdc3bc (the common.py helpers + its behavioral tests). Once #180 lands I'll rebase onto main so only the canonical layer remains. Suggested merge order: #180 first, then this.

What this PR adds on top of #180

canonical_sink.py — resolve_to_canonical_sink(node, aliases, type_map): one chokepoint reducing bare / alias / from / as, builtins.*, importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"] / <mod>.__dict__["lit"], and importlib / runpy / code sibling machinery to a single canonical sink id, reusing the existing alias / type-map / dynamic-import machinery.
behavioral_ast + behavioral_taint_tracking wired to call it. Reflective subprocess invocations grade AST9-HIGH (parity with reflective os.system); a direct subprocess.* call keeps baseline AST4-MEDIUM; a _covered_by_ast9 guard keeps AST9 and the canonical fallback complementary so they never double-fire. Benign neighbours never canonicalize to a sink.
tests/evasion_corpus/ — shared parametrized corpus as a fitness gate: every spelling → its canonical id and a real production sink rule; FP-neighbours → no sink. Adding a sink becomes "add corpus rows", not "add a detector branch + hope it's complete".

Scope / non-goals

In scope: statically-resolvable spellings. Out of scope (documented, left to the generic reflective-access AST7-LOW rule): genuinely dynamic idioms — non-literal getattr(os, name), variable subscript key, computed attribute string, module name from a variable, local-variable indirection (f = os.system; f(x)). These have no static canonical id and would need def-use / taint tracking; kept explicitly as non-goals so the layer doesn't oversell completeness.

Validation (real analyzers + repo toolchain, isolated container)

evasion corpus: 108/108 pass
full suite: +108 tests, zero regressions — 1000 passed with this diff vs 892 on the base branch; the single pre-existing test_static_patterns::TestRunStaticPatternsAgentSnooping::test_no_same_line_duplicate failure is unrelated and present on the base too
ruff==0.15.2 check + ruff format + mypy all clean

AI disclosure: prepared with AI assistance; every line was reviewed and validated by me, and the results above come from running the repo's real analyzers + test toolchain.

rng1995 · 2026-06-24T08:30:26Z

    return None


+def _strip_builtins_prefix(name: str) -> str:


[Automated SkillSpector Review]

This helper (and the other common.py changes plus the duplicated test_behavioral_ast.py / test_behavioral_taint_tracking.py additions in this PR) come from #180, which this branch is stacked on. Once #180 merges, please rebase onto main so this PR's diff contains only the new canonical-sink layer. No change needed to the code itself.

rng1995

[Automated SkillSpector Review]

Requesting changes — scope/rebase, not the implementation.

To be clear up front: the canonical-sink layer itself is excellent and I have no correctness or security objection to canonical_sink.py. The single-chokepoint design (resolve_to_canonical_sink covering bare/alias/from/as, builtins.*, dynamic import, reflective getattr, subscript __builtins__["exec"]/os.__dict__[...], and importlib/runpy/code sibling machinery with receiver-gated instance methods), the AST9-parity for reflective subprocess, the _covered_by_ast9 anti-double-fire guard, and the shared evasion-corpus fitness gate are all well-reasoned and thoroughly tested.

The blocker is scope/mergeability: this PR is stacked on #180 and its diff currently includes #180's commit 8bdc3bc (the common.py helpers and the test_behavioral_ast.py / test_behavioral_taint_tracking.py additions are duplicated from #180). As written, this can't be merged independently without pulling in / conflicting with #180.

Requested change (matches your own stated plan): land #180 first, then rebase this onto main so the diff contains only the canonical-sink layer (canonical_sink.py, the behavioral_ast/behavioral_taint_tracking wiring, and tests/evasion_corpus/). I'll re-review and approve once the diff is isolated to that.

…ness gate Introduce resolve_to_canonical_sink, a single chokepoint reducing every spelling of a dangerous callable -- bare/alias/from/as, builtins.*, importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"] and <mod>.__dict__["lit"], plus importlib/runpy/code sibling machinery -- to one canonical sink id. Wire it into behavioral_ast and behavioral_taint_tracking so the per-idiom ladders stay simple, with a shared parametrized evasion corpus asserting the invariant as a fitness gate: a future missed spelling fails a shared test instead of shipping as a silent blind spot. Reflective subprocess invocations grade AST9-HIGH for parity with reflective os.system; direct subprocess.* keeps baseline AST4-MEDIUM; AST9 and the canonical fallback stay complementary (the _covered_by_ast9 guard prevents double-firing). Benign neighbours never canonicalize to a sink. Stacked on the builtins/importlib sink-evasion fix (reuses resolve_dynamic_import_call); to be rebased onto main once that lands. 108 evasion-corpus tests pass; full suite +108 with zero regressions (ruff 0.15.2 + mypy clean). Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

zied-jlassi · 2026-06-27T22:11:05Z

Hi @rng1995 — thanks for merging #180 into main! 🙏

As requested, I've rebased this branch onto main so the diff is now isolated to the canonical-sink layer: the #180 commit (8bdc3bc) drops out cleanly (it's now an ancestor of main via the merge), leaving a single commit touching only canonical_sink.py, the two analyzer wirings, and tests/evasion_corpus/.

Re-validated in a clean uv sync --frozen env (ruff 0.15.2): ruff check + ruff format --check clean, evasion corpus 108/108, full suite (excluding integration/provider) 1179 passed / 0 failed, and no new mypy errors in the changed files. Ready for re-review whenever you have a moment — thanks again!

rng1995

The previous stacked-PR scope issue is resolved; this is now a clean, focused diff. Two resolver blockers remain: arbitrary .loader.exec_module(...) calls are falsely classified as dynamic imports, while aliased code.InteractiveInterpreter calls evade detection. Please require importlib receiver provenance, normalize aliases for code-interpreter receivers, and add production-wiring plus corpus coverage for both cases.

rng1995 · 2026-06-30T08:24:45Z

+    """
+    if isinstance(receiver, ast.Attribute):
+        # ``<x>.loader`` — the ModuleSpec.loader protocol attribute.
+        if receiver.attr == "loader":


Blocking false positive: any attribute receiver ending in .loader is accepted without proving it is an importlib ModuleSpec loader. plugin.loader.exec_module(mod) therefore canonicalizes to import and emits AST3 for an arbitrary user object. Require importlib-backed provenance and add this shape to the negative corpus.

rng1995 · 2026-06-30T08:24:45Z

+    """
+    inferred = None
+    if isinstance(receiver, ast.Call):
+        inferred = resolve_dotted_name(receiver.func)


Blocking evasion: direct constructor names are compared before alias normalization. 'import code as c; c.InteractiveInterpreter().runsource(x)' resolves to None, so a dangerous code-exec sink is missed. Thread aliases through sibling-method resolution or apply import-alias normalization, and test module/from-import aliases.

…alize code interpreter Addresses the NVIDIA#182 re-review: two resolver precision bugs. exec_module false positive: the `<x>.loader` branch matched ANY object exposing a `.loader` attribute, so an unrelated `self.widget.loader.exec_module(...)` was wrongly classified as a dynamic import. The base must now have importlib provenance -- a spec from importlib.util.module_from_spec / spec_from_loader carried through the type map, or an importlib-rooted dotted name -- via the new `_expr_is_importlib_rooted` helper. The legitimate `spec.loader.exec_module(mod)` chain still fires. code-interpreter false negative: `_receiver_is_code_interpreter` never alias-normalized the inferred constructor type, so `import code as c; c.InteractiveInterpreter().runsource(...)` evaded detection. The inferred name now runs through `apply_import_aliases`, which also covers `from code import InteractiveInterpreter`. Production wiring: `behavioral_ast` now threads import aliases into `canonical_sibling_method` (the taint analyzer already did via `resolve_to_canonical_sink`), so both detection paths apply the gate. Corpus: +2 positive rows (aliased code interpreter, aliased importlib spec loader) and +1 false-positive neighbour (arbitrary `.loader.exec_module`), all exercised end-to-end through the real analyzers as a red->green oracle. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

zied-jlassi · 2026-06-30T19:42:12Z

Thanks @rng1995 — both resolver blockers are addressed in c1c4141, scoped to the canonical-sink files only.

1. exec_module false positive (importlib provenance now required). The <x>.loader branch previously matched any object exposing a .loader attribute, so self.widget.loader.exec_module(...) was wrongly graded AST3 dynamic-import. A new _expr_is_importlib_rooted helper gates it: the object holding .loader must resolve to importlib machinery — a spec from importlib.util.module_from_spec / spec_from_loader carried through the type map, or an importlib-rooted dotted name (alias-normalized). The legitimate spec.loader.exec_module(mod) chain still fires.

2. code.InteractiveInterpreter false negative (alias normalization). _receiver_is_code_interpreter never alias-normalized the inferred constructor type, so import code as c; c.InteractiveInterpreter().runsource(...) evaded the gate. The inferred name now passes through apply_import_aliases (this also picks up from code import InteractiveInterpreter).

Production wiring. behavioral_ast now threads import aliases into canonical_sibling_method; the taint analyzer already did via resolve_to_canonical_sink, so both detection paths apply the gate.

Corpus coverage (run end-to-end through the real analyzers, not just the resolver):

+ code_runsource_alias — aliased interpreter, must be detected
+ spec_loader_exec_module_alias — aliased importlib spec loader, provenance preserved
+ arbitrary_loader_attr_exec_module (FP neighbour) — must NOT fire

Verified red→green: on the pre-fix code those three rows fail (the FP neighbour produces the AST3 finding you flagged; the aliased interpreter is missed); on c1c4141 all pass.

Validation (clean uv sync --frozen --extra dev, ruff 0.15.2): ruff check / ruff format --check clean, mypy clean on the three touched files, full suite 1186 passed / 0 failed (12 skip, 26 deselected, 6 xfail).

zied-jlassi mentioned this pull request Jun 23, 2026

Proposal: canonical dangerous-callable resolver + shared evasion corpus as a fitness gate #181

Open

rng1995 reviewed Jun 24, 2026

View reviewed changes

rng1995 requested changes Jun 24, 2026

View reviewed changes

zied-jlassi force-pushed the feature/canonical-sink-evasion-corpus branch from ae6c9f1 to 4163d57 Compare June 27, 2026 22:10

rng1995 requested changes Jun 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182
zied-jlassi wants to merge 2 commits into
NVIDIA:mainfrom
zied-jlassi:feature/canonical-sink-evasion-corpus

zied-jlassi commented Jun 23, 2026

Uh oh!

rng1995 Jun 24, 2026

Uh oh!

rng1995 left a comment

Uh oh!

zied-jlassi commented Jun 27, 2026

Uh oh!

rng1995 left a comment

Uh oh!

rng1995 Jun 30, 2026

Uh oh!

rng1995 Jun 30, 2026

Uh oh!

zied-jlassi commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

zied-jlassi commented Jun 23, 2026

⚠️ Depends on / stacked on #180 — please read first

What this PR adds on top of #180

Scope / non-goals

Validation (real analyzers + repo toolchain, isolated container)

Uh oh!

rng1995 Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

zied-jlassi commented Jun 27, 2026

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

rng1995 Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

rng1995 Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

zied-jlassi commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants