feat(search): collapse over-long separators in recalled snippets #819

Open
niemst wants to merge 6 commits into mksglu:next from niemst:feat/collapse-long-separators-in-snippet


niemst commented Jun 12, 2026

What / Why / How

ctx_search echoes indexed content verbatim, so decorative separators (=×80 banners, long --- rules) are recalled in full, spending context tokens on almost no information. extractSnippet now collapses any run of a single non-whitespace character repeated 12 or more times to ccc…×N, where N is the original run length, so the length stays recoverable.

  • Threshold 12 leaves real markup intact: --- rules, ***/___, ~~~ fences, @@ diff hunks.
  • Applied only to the returned snippet, after FTS5 highlight offsets are computed — highlight positions are unaffected.
  • \S-only, so whitespace/newline structure is never touched.
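For reviewers who want to try the rule in isolation, here is a minimal sketch of the collapse described above. It is an illustration only: the names collapseLongRuns, MIN_RUN, and LONG_RUN are hypothetical, and the actual change inside extractSnippet may be written differently.

```typescript
// Hypothetical sketch; names and regex are assumptions, not the PR's actual code.
const MIN_RUN = 12; // threshold stated in the PR description

// (\S) captures one non-whitespace character; \1{11,} requires at least 11 more
// copies of it, so the whole match is a run of 12 or more identical characters.
const LONG_RUN = new RegExp(`(\\S)\\1{${MIN_RUN - 1},}`, "gu");

function collapseLongRuns(snippet: string): string {
  return snippet.replace(LONG_RUN, (run: string, ch: string) => {
    // Divide by ch.length so characters stored as surrogate pairs count once each.
    const count = run.length / ch.length;
    return `${ch.repeat(3)}…×${count}`;
  });
}

console.log(collapseLongRuns("=".repeat(80)));      // ===…×80
console.log(collapseLongRuns("--- short rule ---")); // unchanged
```

Because the pattern only ever matches \S, runs of spaces or newlines are left alone, and anything shorter than 12 characters (---, ~~~, @@) never matches.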

Before

--- [session | … | file:/tmp/report.txt] ---
================================================================================
flow_id=306058 …

After

--- [session | … | file:/tmp/report.txt] ---
===…×80
flow_id=306058 …

Affected platforms

  • All platforms

Recall snippet formatting is platform-agnostic.

Test plan

Added 5 cases to tests/core/search.test.ts (extractSnippet collapses over-long separators): 80-char banner → ===…×80, short markup untouched, lossless ×N, whitespace runs untouched, collapse survives the windowed/truncated path. npm test: 4244 pass (2 pre-existing macOS locale/PWD env failures, unrelated to this change). npm run typecheck: clean.

Bundles not rebuilt on purpose — bundle.yml regenerates them on merge to main, and next's committed bundle is already stale vs source, so a local rebuild would only add unrelated churn.

Checklist

  • Tests added/updated (TDD: red → green)
  • npm test passes (2 unrelated pre-existing env failures)
  • npm run typecheck passes
  • Docs updated if needed (n/a — internal formatter)
  • No Windows path regressions (no path code touched)
  • Targets next branch

github-actions Bot and others added 6 commits June 12, 2026 09:31
Decorative separators (= * 80 banners, --- rules) carried in indexed
content are echoed verbatim on ctx_search recall, spending context
tokens for ~zero information. Collapse a 12+ char single-char run to
ccc…×N (lossless about length). Threshold 12 leaves markdown rules,
fences, and diff hunks untouched. Applied only to the returned snippet,
so FTS5 highlight offsets are unaffected.