refactor(semantic-search): use Cloudflare AI Search #66

Closed
aryasaatvik wants to merge 5 commits into dev from feat/semantic-search-cloudflare-ai-search

Conversation

@aryasaatvik aryasaatvik commented Jun 24, 2026

Owner

Summary

This rewrites the semantic-search plugin around Cloudflare AI Search and removes the previous Vectorize, embedding, chunking, FTS, zvec, sqlite-vec, queue-style index job, and evaluation harness stack.

The new plugin treats AI Search as the retrieval/indexing engine and keeps only one Executor-side projection table, aiSearchItems, to track uploaded tool documents, fingerprints, item ids, and processing status.

Implementation Outline

semanticSearchHttpPlugin({ aiSearch, namespace })
  -> semanticSearchPlugin
     -> pluginStorage: { aiSearchItems }
     -> extension.reindex(executor)
     -> extension.search(executor, { query, namespace, limit })
     -> extension.status()
     -> runtime.toolDiscoveryProvider()
reindexAiSearch
  -> executor.tools.manifest()
  -> diff manifest.indexFingerprint against aiSearchItems rows
  -> executor.tools.schema(address) for changed tools
  -> collect markdown document + 5 custom metadata fields
  -> aiSearch.items.delete(previousItemId) when changed
  -> aiSearch.items.upload(`${path}.md`, document, { metadata })
  -> aiSearchItems.put({ path, key, itemId, fingerprint, status })
  -> delete AI Search items for removed tools
makeAiSearchToolDiscoveryProvider.searchTools
  -> aiSearch.search({ messages, ai_search_options })
  -> read chunks[]
  -> map chunk.item.metadata or aiSearchItems row back to ToolDiscoveryResult
  -> collapse multiple chunks per tool path to the best score
  -> apply optional integration/path prefix filter
  -> return the normal PagedResult<ToolDiscoveryResult>
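
The fingerprint diff at the top of reindexAiSearch can be sketched as a pure planning step. This is a minimal illustration, not the PR's code: the ManifestEntry, TrackedRow, and ReindexPlan shapes and the planReindex name are hypothetical, and only the path, itemId, and fingerprint fields are taken from the outline above.

```typescript
// Hypothetical shapes: only `path`, `itemId`, and `fingerprint` come from the PR outline.
interface ManifestEntry {
  path: string;
  indexFingerprint: string;
}

interface TrackedRow {
  path: string;
  itemId: string;
  fingerprint: string;
}

interface ReindexPlan {
  toUpload: ManifestEntry[]; // new or changed tools
  toReplace: Map<string, string>; // path -> previous itemId to delete after upload
  toRemove: TrackedRow[]; // tracked rows whose tool left the manifest
}

function planReindex(manifest: ManifestEntry[], rows: TrackedRow[]): ReindexPlan {
  const rowsByPath = new Map<string, TrackedRow>();
  for (const row of rows) rowsByPath.set(row.path, row);

  const live = new Set<string>();
  const toUpload: ManifestEntry[] = [];
  const toReplace = new Map<string, string>();

  for (const entry of manifest) {
    live.add(entry.path);
    const row = rowsByPath.get(entry.path);
    // Unchanged fingerprint: nothing to upload for this tool.
    if (row !== undefined && row.fingerprint === entry.indexFingerprint) continue;
    toUpload.push(entry);
    if (row !== undefined) toReplace.set(entry.path, row.itemId);
  }

  // Anything tracked but no longer in the manifest is stale.
  const toRemove = rows.filter((row) => !live.has(row.path));
  return { toUpload, toReplace, toRemove };
}
```

Keeping the plan separate from the side effects would make the diff testable without an AI Search binding; whether the PR structures it this way is not stated.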

Cloudflare Host Wiring

CloudflareEnv.AI_SEARCH
  -> loadConfig(env).aiSearch
  -> makeCloudflarePlugins(secret, analytics, aiSearch, organizationId)
  -> semanticSearchHttpPlugin({ aiSearch, namespace: organizationId })

apps/host-cloudflare/wrangler.jsonc now documents the optional AI_SEARCH binding; operators uncomment it after creating the AI Search instance.

Breaking Changes

  • Removes the old Vectorize/Gemini embedding plugin options.
  • Removes local vector and FTS stores from the semantic-search package exports.
  • Removes the old internal index-run/job/chunk storage collections.
  • Reindex now queues uploads into AI Search instead of performing local embedding and vector upserts.
  • Search results are AI Search chunk results collapsed back to Executor tool rows.

Validation

  • bun run --cwd packages/plugins/semantic-search typecheck
  • bun run --cwd packages/plugins/semantic-search test
  • bun run --cwd apps/host-cloudflare typecheck
  • bun run --cwd apps/host-cloudflare test
  • bun run typecheck
  • bun run test
  • Targeted oxlint with --deny-warnings over semantic-search, host-cloudflare changed files, and touched SDK discovery comments
  • Targeted oxfmt --check over changed files

Root bun run lint and root bun run format:check still scan .scratchpad; both are currently blocked by pre-existing scratchpad files unrelated to this PR.

greptile-apps Bot commented Jun 24, 2026


Greptile Summary

This PR replaces the previous Vectorize + Gemini embedding + FTS5 + chunking pipeline with Cloudflare AI Search as the single retrieval and indexing engine, deleting ~8,600 lines of the old stack and adding ~900 lines of new code. The plugin maintains one D1 projection table (aiSearchItems) for fingerprinting and item-ID tracking, uploads markdown tool documents to AI Search during reindex, and collapses returned chunks back to ToolDiscoveryResult entries for both operator console and runtime tools.search.

  • reindexAiSearch diffs the live manifest against tracked D1 rows by fingerprint, uploads changed documents, best-effort-deletes replaced items, and removes stale rows; upload-before-record ordering ensures upload failures leave the previous item live.
  • makeAiSearchToolDiscoveryProvider queries AI Search with hybrid retrieval + reranking, collapses multiple chunks per tool to the best score, and filters by an optional integration-prefix namespace; the runtime path uses metadata-only results while the extension path cross-references D1 rows to suppress orphaned chunks.
  • Host wiring (app.ts, config.ts, execution.ts, plugins.ts, wrangler.jsonc) consistently threads the optional AI_SEARCH binding through both the HTTP-API and execution stacks.
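
The chunk-collapsing step described in the bullets above can be illustrated with a small sketch. The Chunk and ToolHit shapes and the collapseChunks name are assumptions made for illustration; the PR only states that results arrive as chunks, that several chunks may map to one tool path, and that an optional prefix filter applies.

```typescript
// Hypothetical chunk shape: `path` stands in for the tool path recovered
// from chunk.item.metadata or from a tracked aiSearchItems row.
interface Chunk {
  path: string;
  score: number;
}

interface ToolHit {
  path: string;
  score: number;
}

function collapseChunks(chunks: Chunk[], pathPrefix?: string): ToolHit[] {
  const bestByPath = new Map<string, number>();
  for (const chunk of chunks) {
    // Optional integration/path prefix filter.
    if (pathPrefix !== undefined && !chunk.path.startsWith(pathPrefix)) continue;
    const best = bestByPath.get(chunk.path);
    if (best === undefined || chunk.score > best) bestByPath.set(chunk.path, chunk.score);
  }

  const hits: ToolHit[] = [];
  bestByPath.forEach((score, path) => {
    hits.push({ path, score });
  });
  // Highest score first; ties broken by path so the order is deterministic.
  return hits.sort((a, b) => b.score - a.score || a.path.localeCompare(b.path));
}
```

Because several chunks can collapse into one tool, the number of tools returned can be smaller than the number of chunks requested, which is the under-fetch concern raised in the review.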

Confidence Score: 5/5

Safe to merge; all previously flagged correctness issues have been addressed and no new blocking defects were found.

The upload-before-record and best-effort-delete ordering is sound. The orphan-filtering split between the extension path (D1-backed) and the runtime path (metadata-only) is an intentional, tested trade-off. The two remaining notes — the max_num_results under-fetch when chunk-per-tool ratio is high, and the sequential upload loop's wall-clock risk for large catalogs — are worth addressing in a follow-up but do not block correctness for normal-sized catalogs.
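
A minimal sketch of the upload-before-record ordering, under stated assumptions: RemoteItems and ItemStore are hypothetical stand-ins for the AI Search binding and the aiSearchItems collection, and replaceItem is an illustrative name, not a function from the PR.

```typescript
// Hypothetical minimal interfaces; the real binding and storage APIs may differ.
interface RemoteItems {
  upload(name: string, content: string): Promise<{ id: string }>;
  delete(id: string): Promise<void>;
}

interface ItemRow {
  itemId: string;
  fingerprint: string;
}

interface ItemStore {
  get(path: string): ItemRow | undefined;
  put(path: string, row: ItemRow): void;
}

async function replaceItem(
  remote: RemoteItems,
  store: ItemStore,
  path: string,
  content: string,
  fingerprint: string,
): Promise<void> {
  const previous = store.get(path);

  // 1. Upload first. If this throws, nothing below runs and the previous
  //    item and its tracked row stay as they were.
  const uploaded = await remote.upload(`${path}.md`, content);

  // 2. Record the new item id only after the upload succeeded.
  store.put(path, { itemId: uploaded.id, fingerprint });

  // 3. Best-effort cleanup: a failed delete may leave an orphaned remote item,
  //    but it does not fail the reindex.
  if (previous !== undefined) {
    try {
      await remote.delete(previous.itemId);
    } catch {
      // Intentionally ignored in this sketch.
    }
  }
}
```

The trade-off in this ordering is that a failed delete can leave an extra remote item behind, which is why the operator-console search path cross-references tracked rows to suppress orphaned chunks.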

packages/plugins/semantic-search/src/sdk/ai-search.ts — the searchTools over-fetch strategy and the sequential reindex loop are the areas most likely to need tuning as catalog size grows.
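
One way to reason about the under-fetch concern is to request more chunks than the number of tools wanted. The sketch below is an assumption-laden illustration: overFetchCount, the default multiplier of 4, and the cap of 50 are hypothetical values chosen for the example, not documented Cloudflare limits or values from the PR.

```typescript
// Hypothetical helper: how many chunks to request so that, after collapsing
// to one hit per tool, a page of `limit` tools at `offset` is likely filled.
function overFetchCount(
  limit: number,
  offset: number,
  chunksPerTool = 4, // assumed average chunks per tool document
  cap = 50, // assumed upper bound on results per request
): number {
  const wanted = Math.max(0, offset) + Math.max(1, limit);
  return Math.min(cap, wanted * Math.max(1, chunksPerTool));
}
```

For example, overFetchCount(10, 0) yields 40, while overFetchCount(10, 10) is clipped to the cap of 50, so deep pages can still come back short when the cap is reached.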

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/plugins/semantic-search/src/sdk/ai-search.ts | Core new module wiring Cloudflare AI Search for reindex, status, and tool discovery; sequential upload loop and under-fetch strategy for searchTools are notable design concerns |
| packages/plugins/semantic-search/src/sdk/plugin.ts | Plugin wiring is clean; runtime provider intentionally uses items:undefined (metadata-only path), extension search uses items collection for orphan filtering |
| packages/plugins/semantic-search/src/sdk/documents.ts | Document construction and schema-term extraction look correct; metadata truncation and binary-search byte-limit truncation are sound |
| packages/plugins/semantic-search/src/sdk/ai-search.test.ts | Good coverage of chunk collapsing, namespace filtering, orphan-row filtering, and stale-row removal under remote delete failure |
| apps/host-cloudflare/src/execution.ts | Now passes aiSearch and organizationId to makeCloudflarePlugins, matching the HTTP-stack namespace; previous inconsistency resolved |
| apps/host-cloudflare/wrangler.jsonc | AI Search binding is now commented-out opt-in with a clear comment instructing operators to uncomment only after creating the instance |
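
The binary-search byte-limit truncation mentioned for documents.ts can be sketched as follows. This is an illustration of the general technique, not the PR's implementation; truncateToBytes is a hypothetical name, and the sketch assumes a runtime with a global TextEncoder (Workers, Node, and Bun all provide one).

```typescript
// Returns the longest prefix of `text` whose UTF-8 encoding fits in `maxBytes`,
// cutting only at code point boundaries so surrogate pairs are never split.
function truncateToBytes(text: string, maxBytes: number): string {
  const encoder = new TextEncoder();
  if (encoder.encode(text).length <= maxBytes) return text;

  const codePoints = Array.from(text); // iterates by code point, not UTF-16 unit
  let low = 0; // largest prefix length known to fit (empty prefix is 0 bytes)
  let high = codePoints.length; // smallest prefix length known not to fit

  while (high - low > 1) {
    const mid = Math.floor((low + high) / 2);
    const bytes = encoder.encode(codePoints.slice(0, mid).join("")).length;
    if (bytes <= maxBytes) {
      low = mid;
    } else {
      high = mid;
    }
  }
  return codePoints.slice(0, low).join("");
}
```

Binary search works here because the encoded length grows monotonically with prefix length, so only a logarithmic number of prefixes needs to be encoded.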

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Agent as Agent / Engine
    participant Plugin as semanticSearchPlugin
    participant AiSearch as Cloudflare AI Search
    participant D1 as D1 aiSearchItems

    Note over Plugin,D1: reindexAiSearch
    Plugin->>D1: items.list() load existing rows
    Plugin->>Plugin: executor.tools.manifest()
    loop each changed tool fingerprint mismatch
        Plugin->>Plugin: collectToolSearchDocument()
        Plugin->>AiSearch: items.upload(name, content, metadata)
        AiSearch-->>Plugin: id and key
        Plugin->>D1: items.put path itemId fingerprint status queued
        Plugin->>AiSearch: items.delete(previousItemId) best-effort
    end
    loop stale tools not in live manifest
        Plugin->>D1: items.remove key
        Plugin->>AiSearch: items.delete(itemId) best-effort
    end
    Note over Agent,AiSearch: tools.search runtime
    Agent->>Plugin: searchTools query limit offset
    Plugin->>AiSearch: search messages ai_search_options
    AiSearch-->>Plugin: chunks array
    Plugin->>Plugin: collapse chunks to bestByPath metadata-only
    Plugin-->>Agent: PagedResult ToolDiscoveryResult
    Note over Plugin,D1: extension.search operator console
    Plugin->>D1: items.list build rowsByKey
    Plugin->>AiSearch: search messages ai_search_options
    AiSearch-->>Plugin: chunks array
    Plugin->>Plugin: collapse chunks filter by rowsByKey suppress orphans
    Plugin-->>Plugin: SemanticSearchResultPage

Reviews (5): Last reviewed commit: "fix(semantic-search): persist AI Search ..."

Comment thread packages/plugins/semantic-search/src/sdk/documents.ts
Comment thread packages/plugins/semantic-search/src/sdk/plugin.ts
Comment thread apps/host-cloudflare/src/execution.ts Outdated
Comment thread apps/host-cloudflare/wrangler.jsonc
Comment thread packages/plugins/semantic-search/src/sdk/ai-search.ts Outdated
Comment thread packages/plugins/semantic-search/src/sdk/ai-search.ts
Comment thread packages/plugins/semantic-search/src/sdk/ai-search.ts
@aryasaatvik

Owner Author

Superseded by the backend-contract stack: #67 introduces the semantic search backend contract, and #68 adds Cloudflare AI Search as a backend without removing the local/vector implementations.
