feat(knowledge): add-source conflict dialog and directory source/relativePath model#16189
Conversation
9e8884f to
7e6c86b
Compare
7e6c86b to
006d18d
Compare
zhangjiadi225
left a comment
There was a problem hiding this comment.
Multi-agent review: 6 parallel module reviewers + 1 adversarial verifier (every finding stress-tested). CI is green. The change is well-structured — lock ordering (cancel outside / purge inside), vector-before-row deletion, best-effort file delete, the additive IPC contract, and the path -> relativePath rename are all sound. The items below should be addressed before merge.
Blockers
-
B1 —
replacepurges only one of several same-name roots (inline onKnowledgeWorkflowService.ts).resolveKnowledgeAddConflictskeeps only the first existing root per(type, conflictKey), but keep-all routinely creates duplicates (tworeport.pdffrom different folders; two notes with the same first line). Areplacethen leaves a stale duplicate, violating the documented contract. -
B2 — directory
raw/prefix drifts on every reindex (inline onprepare.ts).collectReservedTopLevelNamescounts the container's ownrelativePathas reserved against itself, so re-expansion dedupsdocs->docs_1->docs_2..., churning on-disk material and the display title with no user action. Fix: passexcludeItemIdfor the container. -
B3 — raw NUL (0x00) byte in
addConflicts.ts:30. The conflict-map-key separator is a literal 0x00 control byte inside the template literal, instead of the escape sequence backslash-zero. The accompanying comment is correct; only the literal is wrong. Consequences: (1) git classifies the whole file as binary (verified —git diff --numstatreports dashes), so it has no reviewable diff or blame and cannot receive inline review comments; (2) it is fragile — a formatter, editor, or copy-paste can silently drop the byte, after which the separator becomes empty andtoConflictMapKeyloses its anti-aliasing guarantee. Fix: replace the raw byte with the escape sequence backslash-zero (or backslash-u-0000).
Warning
- W1 — dotted directory names mangled by extension-based dedup (inline on
directory.ts). On collision a folderreport.v2becomesreport_1.v2instead ofreport.v2_1. Cosmetic, but compounds with B2 (each reindex forces a self-collision).
Notices
- N2 (inline) — conflict-dialog buttons are all
variant=outline; per DESIGN.md the primary action should useemphasisand the destructiveReplaceshould usedestructive. - N3 (inline) — empty-content notes share an empty conflict key, so two blank notes are treated as same-name.
- Nit (no inline anchor — outside the diff):
useAddKnowledgeItems.test.ts:35still constructs a directory input with the removedpathfield; stale leftover of the rename, worth cleaning up.
| case 'directory': | ||
| return getKnowledgePathBasename(data.source || '') | ||
| case 'note': | ||
| return getKnowledgeNoteFirstLine(data.content || '') |
There was a problem hiding this comment.
[A2 · Notice] For a note, getKnowledgeItemConflictKey returns the first non-empty line of data.content, which is the empty string for blank content. The runtime add schema does not reject this — RuntimeNoteItemDataSchema.content is z.string().max(...) with no .min(1) (unlike url/source/relativePath). So two empty-content notes produce the same empty conflict key and are treated as same-name. Rare (notes come from picked files) but reachable — consider a unique fallback key, or skipping conflict detection, when the note key is empty.
| <Button | ||
| variant="outline" | ||
| onClick={() => onResolve('rename')} | ||
| loading={pendingResolution === 'rename'} | ||
| disabled={isResolving}> | ||
| {t('knowledge.data_source.add_dialog.conflict_dialog.keep_all')} | ||
| </Button> | ||
| <Button | ||
| variant="outline" | ||
| onClick={() => onResolve('replace')} | ||
| loading={pendingResolution === 'replace'} | ||
| disabled={isResolving}> | ||
| {t('knowledge.data_source.add_dialog.conflict_dialog.replace')} | ||
| </Button> | ||
| <Button variant="outline" onClick={onCancel} disabled={isResolving}> | ||
| {t('common.cancel')} | ||
| </Button> |
There was a problem hiding this comment.
[C7 · Notice] All three footer buttons use variant="outline", flattening the action hierarchy in this new dialog — and Replace is destructive (it purges existing items). DESIGN.md prescribes emphasis for the primary action of a dialog and destructive for destructive ones, and the Button primitive supports both. Consider emphasis for the recommended action and destructive for Replace.
| private async purgeConflictingExistingItems(base: KnowledgeBase, itemsToAdd: KnowledgeAddItemInput[]): Promise<void> { | ||
| const currentRoots = await knowledgeItemService.getRootItemsByBaseId(base.id) | ||
| const { conflictingExistingRootIds } = resolveKnowledgeAddConflicts(itemsToAdd, currentRoots) | ||
| if (conflictingExistingRootIds.length === 0) { | ||
| return | ||
| } | ||
| const subtreeItems = await knowledgeItemService.getSubtreeItems(base.id, conflictingExistingRootIds, { |
There was a problem hiding this comment.
[A1 · Blocker] This replace purge (and the cancelActiveKnowledgeSubtreeJobs call at lines 90-93) drive off conflictingExistingRootIds, but resolveKnowledgeAddConflicts (addConflicts.ts) collects only the FIRST existing root per (type, conflictKey): existingByKey is filled under an if (!existingByKey.has(mapKey)) guard, and conflictingExistingRootIds is populated solely from that single existing.id. The default rename/keep-all flow routinely creates several roots sharing one conflict key — e.g. /a/report.pdf + /b/report.pdf (both source basename report.pdf), or two notes with the same first line (see KnowledgeService.integration.test.ts:284). A replace then purges only ONE of them and adds the incoming source, so a stale duplicate survives — violating the documented replace contract (Conflicting existing items are purged).
Fix: in resolveKnowledgeAddConflicts, collect ALL matching existing-root ids per key (e.g. Map<string, KnowledgeItem[]>) and return all of them; add a test with two existing same-key roots asserting both are purged on replace.
(Note: addConflicts.ts itself cannot take an inline comment — it is committed as a binary file due to a stray NUL byte; see the review summary.)
|
Thanks for the thorough multi-agent review — all seven items are addressed in Fixed
Tests |
zhangjiadi225
left a comment
There was a problem hiding this comment.
All review feedback addressed in 7ee96b2 — the two purge/prefix-drift blockers, the raw NUL byte (file is no longer binary), the dotted-directory dedup, the dialog button hierarchy, and the empty-note conflict key are all fixed, each with a targeted regression test. CI is green. LGTM.
fcf00d0 to
649cde0
Compare
…tivePath model Rebuilt on top of the directory field-rename PR. Bundles the knowledge add-source work and repurposes `relativePath` from the rename: - same-name conflict dialog on add (keep-all / replace / cancel) - open the OS picker directly for file & directory sources - store directory children under the deduped `raw/<name>` prefix (with `_N` suffix on collision), not the owner UUID - model directory items as `source` (original folder) + `relativePath` (deduped `raw/` prefix written back on first expansion); expansion now scans `source`, the prefix is pinned via updateDirectoryRelativePath - item titles show the deduped `relativePath` basename, else `source` - delete removes the whole `raw/<prefix>` subtree and prunes empty shells Squashed rebuild of the former feat/knowledge-add-conflict-dialog history onto the rename PR; final tree is identical to the validated combined state. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
- purge every existing root sharing a conflict key on replace, not just the first — keep-all can leave several roots under one name (report.pdf, report_1.pdf), and replace must overwrite the whole group - exclude a directory from its own reserved-name set on reindex so its pinned relativePath prefix is not self-deduped to `_1` every run - replace the raw NUL conflict-map-key separator with a `\0` escape so the source file stays UTF-8 text instead of a binary blob - keep a dotted directory prefix intact when deduping (`report.v2_1`, not `report_1.v2`) via a splitExtension flag on nextFreeKnowledgeRelativePath - never treat blank-content notes as the same name: an empty detection key is skipped in detection and always kept - conflict dialog: use emphasis (keep-all) and destructive (replace) button variants to signal the safe vs lossy action Tests pin each behavioral fix: group purge, blank-note non-collision, dotted directory dedup at the call site, and container self-exclusion on reindex. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
…ll same-source siblings Add-source conflict detection keyed file/directory off the source basename, so N kept copies of `test.md` (stored test.md / test_2.md / test_3.md) collapsed onto one detection key. `replace` then purged every sibling instead of the single copy the incoming source overwrites. Key file/directory off `relativePath` (the deduped raw/ name) when present, else the source basename: an add-input has no relativePath so detection still fires on the source basename, while an existing item keys off its unique deduped name, so replace targets only relativePath `test.md` and leaves test_2.md / test_3.md intact. This also aligns the key with getKnowledgeItemDisplayTitle. url/note are unchanged — their deduped name is a post-index snapshot absent at add-time, so they must keep keying off the raw url / first line. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: eeee0717 <chentao020717Work@outlook.com>
649cde0 to
05ae1a0
Compare
What this PR does
Before this PR, the v2 knowledge base "add data source" flow added items immediately with no same-name handling, picked file/folder sources through in-app dropzone/folder panels, and modeled directory items with a redundant
pathfield duplicatingsource.After this PR:
data.url) against existing root items plus earlier items in the same submit batch, then a Finder-style dialog offers one decision for the whole batch: 保留全部 (keep all, auto_Nrename) / 替换 (replace — the incoming source wins) / 取消 (abort the batch).source(the original folder, re-scanned on expansion) +relativePath(the dedupedraw/<name>prefix, written back on first expansion viaupdateDirectoryRelativePath); expanded children are stored under that deduped prefix (with_Non collision) instead of the owner UUID. Item titles show the dedupedrelativePathbasename, else thesourcebasename.raw/<prefix>subtree and prunes the empty shells left behind.CleanShot.2026-06-18.at.20.14.37.mp4
Fixes #
N/A — from internal QA / design feedback on the v2 knowledge base, no public issue.
Why we need it and why it was done in this way
Adding a source whose name already exists previously failed silently or collided on disk; this gives the user an explicit, Finder-like choice. Detection key and display name are deliberately separated so the dialog shows the list title the user recognizes while matching on a stable key (url keys off the raw URL because the post-index snapshot title is absent at add-time).
The following tradeoffs were made:
addItemsmutation lock (not the async delete queue) — otherwise the new file would be_N-renamed before the delete lands and 替换 would degrade to 保留全部.addItems(baseId, inputs, conflictStrategy = 'rename')folds detection into the existing call (default'rename'preserves internal callers like restore/migrator); no new IPC channel.sourceand only the dedupedraw/prefix inrelativePath, mirroring the file source/relativePath split, so re-expansion always re-scans the real folder.The following alternatives were considered:
Breaking changes
None for users. Data-model note: the directory item
datashape changes from{ source, path }to{ source, relativePath? };knowledge_item.datais a JSON column, so there is no schema/DDL migration. The v2 knowledge base is still pre-release.Special notes for your reviewer
pathtorelativePath#16188 (the purepath→relativePathfield rename). GitHub does not allow a cross-fork PR base to point at a fork branch, so this PR's base ismainand its diff currently also includes refactor(knowledge): rename directory item fieldpathtorelativePath#16188's rename. Please review/merge refactor(knowledge): rename directory item fieldpathtorelativePath#16188 first; I will then rebase this branch ontomain, after which this PR's diff collapses to just the conflict-dialog feature.typecheck:node/typecheck:web, oxlint (type-aware) + biome, main & renderer Vitest for the touched modules, plus manual runtime checks of the add flow (file/directory native pickers, all three conflict choices, directory expansion and delete).Checklist
mainfor active development,v1for v1 maintenance fixesdatacolumn, no migration)Release note