fix(visualize): first-session quick wins — UUID names, theme repaint, search navigation, honest stats#3029
Vasilije1990 wants to merge 7 commits into
Top five impact-per-effort fixes from a six-persona first-time-user
review of the visualization (each finding independently reproduced):
- Never display UUIDs/hashes as node names: identifier-shaped values
are skipped when deriving display names, falling back to a readable
'Unnamed <Type> (id8)' placeholder, and unnamed nodes are excluded
from Key-mode label landmarks. On the review corpus this removed 104
raw-UUID labels — reviewers' single biggest trust killer.
- Theme toggle now repaints the canvas immediately (new
window._requestGraphRedraw hook) instead of leaving 95% of the
screen in the old theme until the next pan, and the chosen theme
persists across reloads via localStorage.
- Search is navigable: a match counter shows in the box, Enter jumps
to the best match and cycles onward (Shift+Enter backwards, Escape
clears). Previously highlighting was display-only with no way to
reach what you found.
- Legend can no longer contradict the canvas: type swatches sample the
color actually drawn on a node of that type instead of a second,
drifted palette.
- Honest stats and controls: 'Unknown' provenance no longer counts
('1 node sets' -> '0 node sets'), 'types' renamed to 'node kinds'
with a tooltip (it contradicted the 'Types (49)' column header), and
color-by modes for provenance the dataset doesn't carry (node set /
user) are disabled with an explanatory tooltip instead of rendering
a monochrome graph.
All five verified live with Playwright against the 542-node demo
graph; 39 visualization unit tests pass (4 new).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
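The "honest stats" item in the commit message above can be illustrated with a short Python sketch. This is not the code in `preprocessor.py` or the UI; the helper names `pluralize` and `count_node_sets` and the `node_set` key are assumptions made only for this example.

```python
def pluralize(count, singular, plural=None):
    """Return e.g. '1 pipeline' or '0 node sets' with the matching noun form."""
    noun = singular if count == 1 else (plural or singular + "s")
    return f"{count} {noun}"


def count_node_sets(nodes, unknown_label="Unknown"):
    """Count distinct node-set values, ignoring missing and 'Unknown' provenance."""
    seen = set()
    for node in nodes:
        value = node.get("node_set")  # hypothetical key, assumed for this sketch
        if value and value != unknown_label:
            seen.add(value)
    return len(seen)


# A graph whose only provenance value is the 'Unknown' placeholder.
nodes = [{"id": "a", "node_set": "Unknown"}, {"id": "b"}]
print(pluralize(count_node_sets(nodes), "node set"))  # 0 node sets
```

Filtering the placeholder before counting is what turns '1 node sets' into '0 node sets'; the pluralization helper handles the '1 pipeline' case separately.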
Walkthrough
Visualization module enhancements add identifier-detection logic to prevent UUID/hash strings from being used as node display names, introduce stateful search navigation with match cycling and smooth pan/zoom, exclude "Unknown" provenance values from stats and legend, and persist the theme preference to localStorage while syncing graph redraws across theme changes.
Changes
Visualization UX Enhancements
Actionable comments posted: 3
🧹 Nitpick comments (1)
cognee/tests/unit/modules/visualization/test_preprocessor.py (1)
466-500: ⚡ Quick win
Add a short docstring to `TestUnnamedNodeFallbacks`.
This new test class is undocumented; a one-line intent docstring keeps the contract explicit for future maintenance.
Reference: PEP 257 – Docstring Conventions: https://peps.python.org/pep-0257/
As per coding guidelines, undocumented function definitions and class definitions in the project's Python code are assumed incomplete.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@cognee/tests/unit/modules/visualization/test_preprocessor.py` around lines 466 - 500, Add a one-line docstring to the TestUnnamedNodeFallbacks test class explaining its intent (e.g., that it verifies fallback naming behavior for unnamed nodes such as UUID/hash detection and label priority), by placing a short string literal immediately under the class definition for TestUnnamedNodeFallbacks so the class is documented per PEP 257.
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cognee/modules/visualization/preprocessor.py`:
- Around line 931-933: The current is_unnamed calculation wrongly treats any
name starting with "Unnamed " as synthetic; update the check in the
node_info["is_unnamed"] assignment (the line using looks_like_identifier and
node_info["name"].startswith("Unnamed ")) to only mark synthetic placeholders,
not user-provided labels—e.g., replace the prefix test with a strict
pattern/constant check that matches known synthetic forms (like the exact
placeholder or a numeric-suffixed placeholder such as "Unnamed 1") or a regex
that enforces the whole-string format; keep looks_like_identifier as-is and
ensure label_priority logic later (where label_priority is used) now preserves
legitimate names that merely begin with "Unnamed ".
In `@cognee/modules/visualization/views/story_view.js`:
- Around line 858-868: The schema-bridge handler (window._highlightSchemaType)
updates searchMatches/searchQuery but does not sync the derived navigation
state; update the handler so after it sets searchMatches/searchQuery it rebuilds
searchMatchList from searchMatches, sets searchMatchIdx to a valid index (e.g. 0
when matches exist, -1 when none), and then calls updateSearchCount() so the
counter text and Enter-cycling stay consistent; ensure the same sync logic is
applied to the other schema-driven update block later in the file (the other
occurrence around the 893-916 region).
- Line 884: The sort is using b.importance/a.importance but the UI weighting
uses importance_weight; update the comparator for searchMatchList.sort (the
anonymous function handling a and b) to order by (b.importance_weight ||
b.importance || 0) - (a.importance_weight || a.importance || 0) so entries with
importance_weight are prioritized and datasets that only have importance still
sort consistently; ensure the comparator references the same property names
(importance_weight, importance) used elsewhere (e.g., radius/labels) and keeps
the fallback to 0.
---
Nitpick comments:
In `@cognee/tests/unit/modules/visualization/test_preprocessor.py`:
- Around line 466-500: Add a one-line docstring to the TestUnnamedNodeFallbacks
test class explaining its intent (e.g., that it verifies fallback naming
behavior for unnamed nodes such as UUID/hash detection and label priority), by
placing a short string literal immediately under the class definition for
TestUnnamedNodeFallbacks so the class is documented per PEP 257.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 07d72433-c7a8-4a23-b963-fb8945f6c3ca
📒 Files selected for processing (5)
- cognee/modules/visualization/preprocessor.py
- cognee/modules/visualization/template.html
- cognee/modules/visualization/views/story_view.js
- cognee/modules/visualization/views/ui_chrome.js
- cognee/tests/unit/modules/visualization/test_preprocessor.py
    node_info["is_unnamed"] = looks_like_identifier(raw_name) or node_info["name"].startswith(
        "Unnamed "
    )
Prefix-based is_unnamed detection can hide valid labels.
node_info["name"].startswith("Unnamed ") conflates real user content (e.g. a legitimate title beginning with “Unnamed ”) with synthetic placeholders. Those nodes then lose label_priority in Line 1028+, which is a user-visible correctness regression in Key mode.
Proposed fix
-def derive_node_name(node_info, node_id):
+def derive_node_name(node_info, node_id):
@@
- if name and not looks_like_identifier(name):
- return name
+ if name and not looks_like_identifier(name):
+ return name, False
@@
- return normalized[:120]
+ return normalized[:120], False
@@
- return f"Unnamed {node_type} ({str(node_id)[:8]})"
+ return f"Unnamed {node_type} ({str(node_id)[:8]})", True
- node_info["name"] = derive_node_name(node_info, node_id)
+ derived_name, is_placeholder = derive_node_name(node_info, node_id)
+ node_info["name"] = derived_name
@@
- node_info["is_unnamed"] = looks_like_identifier(raw_name) or node_info["name"].startswith(
- "Unnamed "
- )
+ node_info["is_unnamed"] = looks_like_identifier(raw_name) or is_placeholder
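To make the proposed fix above concrete, here is a minimal, self-contained Python sketch of a name-derivation function that reports whether it produced a placeholder. It is an illustration under stated assumptions, not the code in `preprocessor.py`: the regular expressions used for "identifier-shaped", the `type` key, and the exact fallback keys are assumed for the example.

```python
import re

# Assumed identifier shapes: canonical UUIDs and long hex digests (e.g. content hashes).
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.IGNORECASE
)
_HEX_RE = re.compile(r"^[0-9a-f]{32,}$", re.IGNORECASE)


def looks_like_identifier(value):
    """Heuristic stand-in: True for UUID-shaped or long hex strings."""
    if not isinstance(value, str):
        return False
    text = value.strip()
    return bool(_UUID_RE.match(text) or _HEX_RE.match(text))


def derive_node_name(node_info, node_id):
    """Return (display_name, is_placeholder) instead of a bare name."""
    for key in ("name", "title", "text", "summary"):
        candidate = node_info.get(key)
        if not isinstance(candidate, str) or not candidate.strip():
            continue
        if looks_like_identifier(candidate):
            continue  # skip UUIDs/hashes and keep walking the fallback chain
        normalized = " ".join(candidate.split())
        return normalized[:120], False
    node_type = node_info.get("type", "Node")  # hypothetical key for this sketch
    return f"Unnamed {node_type} ({str(node_id)[:8]})", True


# A real label that merely starts with "Unnamed " is not flagged as synthetic.
name, is_placeholder = derive_node_name({"name": "Unnamed Road"}, "abc")
print(name, is_placeholder)
```

Because the flag comes from the branch that actually built the placeholder, no string-prefix test is needed and user-provided names keep their label priority.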
var searchCountEl=document.getElementById("search-count");
var searchMatchList=[]; // matches ordered by importance, for Enter-cycling
var searchMatchIdx=-1;

function updateSearchCount(){
  if(!searchCountEl)return;
  if(!searchQuery){searchCountEl.textContent="";return}
  var n=searchMatchList.length;
  if(!n){searchCountEl.textContent="no matches";return}
  searchCountEl.textContent=searchMatchIdx>=0?((searchMatchIdx%n)+1)+" / "+n:n+(n===1?" match":" matches");
}
Schema-triggered highlighting does not sync the new search navigation state.
After this search-state refactor, the schema bridge path (window._highlightSchemaType, later in file) updates searchMatches/searchQuery but not searchMatchList, searchMatchIdx, or the counter. That leaves stale count text and inconsistent Enter navigation after schema-driven highlights.
Suggested patch
function updateSearchCount(){
if(!searchCountEl)return;
if(!searchQuery){searchCountEl.textContent="";return}
var n=searchMatchList.length;
if(!n){searchCountEl.textContent="no matches";return}
searchCountEl.textContent=searchMatchIdx>=0?((searchMatchIdx%n)+1)+" / "+n:n+(n===1?" match":" matches");
}
+
+function rebuildSearchMatchListFromSet(){
+ searchMatchList=[];
+ nodes.forEach(function(n){ if(searchMatches.has(n.id))searchMatchList.push(n); });
+ searchMatchList.sort(function(a,b){
+ var bi=(typeof b.importance_weight==="number")?b.importance_weight:(b.importance||0);
+ var ai=(typeof a.importance_weight==="number")?a.importance_weight:(a.importance||0);
+ return bi-ai;
+ });
+ searchMatchIdx=-1;
+ updateSearchCount();
+}
window._highlightSchemaType=function(typeName){
searchQuery="__schema_type__";
searchMatches.clear();
nodes.forEach(function(n){if(nodeMatchesSchemaType(n,typeName))searchMatches.add(n.id)});
+ rebuildSearchMatchListFromSet();
draw();
};
Also applies to: 893-916
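The point of this comment is that the match set, the ordered match list, the cursor, and the counter text must change together. The following Python model is a sketch of that invariant, written in Python only for illustration (the real code is JavaScript). The class and method names are hypothetical, and the wrap-around stepping rule is an assumption; only the counter strings mirror `updateSearchCount()` as shown in the diff.

```python
class SearchNavigator:
    """Models counter text and Enter-cycling over an ordered match list."""

    def __init__(self):
        self.query = ""
        self.matches = []  # ordered best-first
        self.index = -1    # -1 means no match is focused yet

    def set_matches(self, query, matches):
        """Single entry point: replace matches and reset the cursor together."""
        self.query = query
        self.matches = list(matches)
        self.index = -1

    def step(self, backwards=False):
        """Enter (forward) or Shift+Enter (backwards), wrapping around."""
        n = len(self.matches)
        if n == 0:
            return None
        if self.index < 0:
            self.index = n - 1 if backwards else 0
        else:
            self.index = (self.index + (-1 if backwards else 1)) % n
        return self.matches[self.index]

    def counter_text(self):
        """Same strings as updateSearchCount() in the reviewed diff."""
        n = len(self.matches)
        if not self.query:
            return ""
        if n == 0:
            return "no matches"
        if self.index >= 0:
            return f"{self.index % n + 1} / {n}"
        return f"{n} match" if n == 1 else f"{n} matches"


nav = SearchNavigator()
nav.set_matches("__schema_type__", ["alice", "white rabbit", "queen"])
print(nav.counter_text())  # 3 matches
nav.step()
print(nav.counter_text())  # 1 / 3
```

Routing both the text search and the schema-driven highlight through one `set_matches`-style function is what prevents a stale counter. Note that Python's `%` never returns a negative result for a positive divisor, whereas a JavaScript port of the backwards step would need `(idx - 1 + n) % n`.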
      searchMatchList.push(n);
    }
  });
  searchMatchList.sort(function(a,b){return (b.importance||0)-(a.importance||0)});
Use importance_weight when ordering search matches.
Line 884 sorts by n.importance, but this file’s weighting logic (e.g., radius/labels) is based on importance_weight. On datasets without importance, Enter-cycling becomes effectively unsorted.
Suggested patch
- searchMatchList.sort(function(a,b){return (b.importance||0)-(a.importance||0)});
+ searchMatchList.sort(function(a,b){
+ var bi=(typeof b.importance_weight==="number")?b.importance_weight:(b.importance||0);
+ var ai=(typeof a.importance_weight==="number")?a.importance_weight:(a.importance||0);
+ return bi-ai;
+ });
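The suggested patch for this comment tests `typeof ... === "number"`, while the agent prompt describes an `||` chain. The two are not equivalent when `importance_weight` is exactly 0. The Python sketch below models both rules to show the difference; it is an illustration only (the real comparator is JavaScript), and the sample nodes are made up for the example.

```python
def rank_key_typed(node):
    """Mirror of the patch: use importance_weight whenever it is a number."""
    weight = node.get("importance_weight")
    if isinstance(weight, (int, float)) and not isinstance(weight, bool):
        return weight
    return node.get("importance") or 0


def rank_key_or_chain(node):
    """Mirror of `importance_weight || importance || 0`: falsy values fall through."""
    return node.get("importance_weight") or node.get("importance") or 0


# Illustrative data: node "a" has an explicit weight of 0 plus a high importance.
nodes = [
    {"id": "a", "importance_weight": 0, "importance": 0.9},
    {"id": "b", "importance_weight": 0.4},
    {"id": "c", "importance": 0.2},
]

typed = [n["id"] for n in sorted(nodes, key=rank_key_typed, reverse=True)]
chained = [n["id"] for n in sorted(nodes, key=rank_key_or_chain, reverse=True)]
print(typed)    # ['b', 'c', 'a']
print(chained)  # ['a', 'b', 'c']
```

Which behavior is wanted depends on whether a weight of 0 is meaningful in the dataset; the type check treats it as a real value, the `||` chain treats it as missing.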
A new third tab implementing the memory-map product brief: a visual explanation of how cognee memory is created, searched, and expanded, with documents as the central anchor rather than a generic node-link graph.
Columns left-to-right: Documents (rectangles split into chunk_index-ordered chunk cells) -> Extracted Memory (entities grouped by semantic type in balanced sub-columns, low-importance members collapsed behind '+K more' pills) -> Summaries (TextSummary cards positioned at the mean of their source chunks) -> Global Context (compact bucket cards, with an explicit empty state when no GlobalContextSummary exists yet). Provenance edges (contains, made_from, summarized_in) are the loud structure; semantic entity-entity arcs render quieter.
Spatial stability by construction: the layout is computed ONCE from the final graph state, deterministically from sorted data identity; the timeline never re-runs layout. Scrubbing to an earlier event only ghosts elements created later (class toggles, positions untouched), so shared elements stay visually identical across every timeline step.
Timeline rail: ingestion/cognify run events derived from t_created gap-clustering of node timestamps (no relational queries in v1), merged chronologically with optional search events. Searches enter via a documented injection hook, cognee_network_visualization(..., search_events=[...]), and render as overlays on the unchanged layout: everything dims, retrieved chunks/entities/summaries/context get halos, provenance trails connect them, and the detail panel shows the query, answer, retrieved-by-stage counts, and chunk previews.
Detail panel covers every selection type: document metadata, chunk text preview with entity/summary chips, entity relations, group membership, collapsed-pill member lists, run-event stage breakdowns, and search metadata. Both themes supported via --mm-* CSS variables.
The payload (__MEMORY_DATA__) carries structure only (ids, ordering, grouping, timeline) and resolves node/link details through the already-embedded story-view data, avoiding a second 2MB embed.
Verified end-to-end with Playwright on a 542-node demo graph: column order, chunk-cell ordering, byte-identical layout across reloads and timeline scrubs, search overlay via the real injection hook, both themes, and Graph/Schema tabs unaffected. 65 visualization unit tests pass (26 new for the payload builder and assembly wiring).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
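The "gap-clustering of node timestamps" described in this commit message can be sketched as follows. This is a minimal illustration of the general technique, not the project's implementation: the function name, the 60-second threshold, and the assumption that `t_created` is a number of seconds are all hypothetical.

```python
def cluster_runs(timestamps, gap_seconds=60.0):
    """Group timestamps into runs; a new run starts when the gap exceeds gap_seconds."""
    runs = []
    for ts in sorted(timestamps):
        if runs and ts - runs[-1][-1] <= gap_seconds:
            runs[-1].append(ts)  # close enough to the previous node: same run
        else:
            runs.append([ts])    # large gap (or first node): start a new run
    return [{"start": r[0], "end": r[-1], "count": len(r)} for r in runs]


# Assumed t_created values in seconds: two bursts separated by a long pause.
print(cluster_runs([10, 0, 20, 500, 505]))
```

Each resulting run would correspond to one event on the timeline rail; because only node timestamps are read, nothing new has to be persisted.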
Found while verifying the populated global-context path (the demo previously only exercised the empty state): after running improve(build_global_context_index=True) the status line omitted the 14 context summaries the band was showing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…-appear on the Memory timeline
Closes the loop between backend operations and the Memory tab. Searches run through the full backend (with a session) and rated answers now appear on the timeline automatically, with no hand-built event payloads.
- cognee/modules/visualization/session_events.py: best-effort collector that enumerates the user's recent sessions from the lifecycle table, reads QA entries from the session cache, and maps them to timeline events: every entry becomes a 'search' event (query, answer, and the used_graph_element_ids the retrievers recorded), and every RATED entry additionally becomes an 'improve' event carrying the rating, feedback text, whether apply_feedback_weights has run (memify_metadata), and the same element ids (the exact set the reinforcement touched). Collection never fails the render: unavailable cache/table degrades to an empty list.
- visualize_graph(include_session_events=True) wires collection in by default, with session_ids/user overrides.
- Memory tab: 'improve' rail items (green, star rating) and a reinforcement overlay on the same stable layout. Elements glow by feedback valence (green = weights up for ratings >=4, amber = weights down for <=2), with provenance trails and an Improve panel (rating, effect, applied status, feedback text, touched-by-stage counts). Entity panels now show feedback/importance weights with reinforced/weakened tags when drifted from the 0.5 default.
Verified end-to-end against real backend operations: two cognee.search() calls in a session (retrievers recorded 13+12 node ids), a FeedbackEntry(score=5), improve() applying the weights, then a plain visualize_graph(). The timeline showed 5 auto-captured searches plus the improve event (applied=true), the overlay lit 14 reinforced elements with 8 trails, and the rated entity's panel read 'Feedback weight 0.550 · reinforced'.
Fixed en route: SessionRecord.user_id is UUID-typed; binding a string raised "'str' object has no attribute 'hex'" inside the best-effort guard.
71 visualization tests pass (6 new for the mapper).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
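The mapping rule described above (every QA entry yields a search event, rated entries also yield an improve event, and collection never fails the render) can be sketched in Python. This is a hedged illustration, not the contents of `session_events.py`: the entry keys `question`, `answer`, `rating`, and `feedback_text`, the event dictionary shape, and deriving `applied` from the truthiness of `memify_metadata` are all assumptions; only `used_graph_element_ids`, `memify_metadata`, and the rating thresholds come from the text.

```python
def valence(rating):
    """Ratings >= 4 reinforce (weights up), <= 2 weaken (weights down)."""
    if rating >= 4:
        return "up"
    if rating <= 2:
        return "down"
    return "neutral"


def map_qa_entries(entries):
    """Map cached QA entries to timeline events."""
    events = []
    for entry in entries:
        ids = list(entry.get("used_graph_element_ids") or [])
        events.append({
            "kind": "search",
            "query": entry.get("question"),  # hypothetical key
            "answer": entry.get("answer"),   # hypothetical key
            "element_ids": ids,
        })
        rating = entry.get("rating")         # hypothetical key
        if rating is not None:
            events.append({
                "kind": "improve",
                "rating": rating,
                "feedback": entry.get("feedback_text"),         # hypothetical key
                "applied": bool(entry.get("memify_metadata")),  # assumed derivation
                "valence": valence(rating),
                "element_ids": ids,  # same ids the search recorded
            })
    return events


def collect_events(load_entries):
    """Best-effort wrapper: any failure degrades to an empty list."""
    try:
        return map_qa_entries(load_entries())
    except Exception:
        return []
```

Wrapping the whole collection in one guard keeps the visualization renderable when the cache or table is unavailable, at the cost of hiding errors such as the UUID-binding bug mentioned above, which is why that bug surfaced only during end-to-end verification.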
…ry map with operation bench (#3033)

## What

A new third **Memory** tab implementing the document-grounded memory visualization brief: a visual explanation of how cognee memory is created, searched, and *improved*, with documents as the central anchor, a timeline as the navigation model, and searches/feedback as overlays on a spatially stable layout.

> **Stacked on #3029** (first-session quick wins): merge that first; this PR then shows only the three Memory-tab commits.

### The memory map

- Column bands: **Documents** (rectangles split into `chunk_index`-ordered chunk cells; chunks are provenance cells, not free nodes) → **Extracted memory** (entities grouped by semantic type in balanced sub-columns, low-importance members collapsed behind "+K more" pills) → **Summaries** (positioned at the mean of their source chunks) → **Global context** (bucket cards + root, or an explicit empty state until `improve(build_global_context_index=True)` runs; both states verified on real data).
- Provenance edges (`contains`, `made_from`, `summarized_in`) are the loud structure; semantic arcs render quieter.
- Detail panel for every selection: documents, chunks (text preview + entity/summary chips), entities (relations, **feedback/importance weights with reinforced/weakened tags**), groups, pills, summaries, context buckets, run events, search events.

### Spatial stability by construction

The layout is computed **once** from the final graph state, deterministically from sorted data identity. Timeline scrubbing only toggles ghost classes on later-created elements; positions were verified byte-identical across reloads and every timeline step.

### Timeline

Ingestion/cognify/memify runs derived from `t_created` gap-clustering of node timestamps, labeled by `source_pipeline`, with zero new persistence; an `improve(build_global_context_index=True)` run appeared on the rail with no code changes.

### Operation bench: backend searches & feedback auto-captured

`visualize_graph(include_session_events=True)` (default) enumerates recent sessions from the lifecycle table and maps cached QA entries to timeline events:

- every entry → a **search** event (query, answer, the `used_graph_element_ids` the retrievers recorded) rendered as a retrieval spotlight: dim + halos + provenance trails, layout untouched;
- every **rated** entry → an **improve** event (rating, feedback text, whether `apply_feedback_weights` has run) rendered as a reinforcement overlay: green halos for weights-up, amber for weights-down, on exactly the elements the feedback touched.

Collection is best-effort and can never fail a render. An explicit `search_events=` hook remains for custom pipelines.

## Verified end-to-end on real backend operations

Two `cognee.search()` calls in a session (13+12 recorded element ids) → `FeedbackEntry(score=5)` → `improve()` → plain `visualize_graph()`: the timeline showed 5 auto-captured searches + the improve event (`applied: yes`), the reinforcement overlay lit 14 elements with 8 provenance trails, and the rated entity's panel read `Feedback weight 0.550 · reinforced`. Playwright-verified throughout (both themes, Graph/Schema tabs unaffected, byte-identical layout stability).

## Testing

71 visualization unit tests pass (32 new across the payload builder, assembly wiring, and the session-event mapper); `node --check` and pre-commit clean.

## Notes for review

- Column order is Documents-first (left), not the brief's Global-Context-first; flagged during validation; a reorder is layout-constant changes if preferred.
- `edge_ids` overlay matching uses the both-endpoints fallback on graphs whose links lack `edge_object_id`.
- Fixed en route: `SessionRecord.user_id` is UUID-typed; string binds fail inside best-effort guards ("'str' object has no attribute 'hex'").

🤖 Generated with [Claude Code](https://claude.com/claude-code)
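The "spatial stability by construction" idea in this description, where positions depend only on sorted data identity and timeline scrubbing changes visibility rather than layout, can be sketched as below. It is a toy model under stated assumptions: the function names, the input shape (a mapping of document id to chunk dictionaries), and the grid coordinates are hypothetical and far simpler than the real Memory tab.

```python
def compute_layout(documents):
    """Assign positions once, from sorted ids only (no randomness, no timeline input)."""
    positions = {}
    y = 0
    for doc_id in sorted(documents):
        positions[doc_id] = (0, y)  # column 0: documents
        chunks = sorted(documents[doc_id], key=lambda c: c["chunk_index"])
        for offset, chunk in enumerate(chunks):
            positions[chunk["id"]] = (1, y + offset)  # column 1: chunk cells
        y += max(len(chunks), 1)
    return positions


def ghosted_ids(created_at, cursor):
    """Elements created after the timeline cursor are ghosted; positions are untouched."""
    return {node_id for node_id, t in created_at.items() if t > cursor}


docs = {
    "d2": [{"id": "c3", "chunk_index": 0}],
    "d1": [{"id": "c2", "chunk_index": 1}, {"id": "c1", "chunk_index": 0}],
}
layout = compute_layout(docs)
print(layout["c2"])                            # (1, 1)
print(ghosted_ids({"c1": 1.0, "c2": 5.0}, 3))  # {'c2'}
```

Because `compute_layout` never receives the timeline cursor, scrubbing cannot move anything; the same input in any insertion order yields the same positions.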
What
The top five impact-per-effort fixes from a six-persona first-time-user review of the visualization (every finding driven against the real rendered HTML with Playwright and independently reproduced before landing here). Follows on from #3026 (merged).
1. Never display UUIDs as names (reviewers' #1 trust killer, flagged independently by 5 of 6 personas)
`preprocessor.py`: identifier-shaped values (UUIDs, content hashes) are skipped when deriving display names. The fallback chain continues through title/text/summary, ending at a readable `Unnamed <Type> (id8)` placeholder, and unnamed nodes are excluded from Key-mode label landmarks (a raw UUID was being promoted as a top-3 landmark label next to `alice` and `white rabbit`). On the 542-node review corpus this removed 104 raw-UUID labels.
2. Theme toggle repaints the canvas, theme persists
Clicking "Dark mode" flipped every DOM panel but left the canvas (95% of the screen) in the old theme until the next pan forced a redraw, and the choice was lost on reload.
`story_view.js` exposes `window._requestGraphRedraw`; the toggle (`ui_chrome.js`) requests a frame and persists the theme in localStorage, applied before first paint. Verified: canvas center pixel is dark 300ms after toggle; reload keeps the theme.
3. Search is navigable
Search was display-only: highlighting with no match count and a dead Enter key. Now: live match counter in the box, Enter jumps to the best match and cycles onward (`1 / 18`), Shift+Enter backwards, Escape clears.
4. Legend cannot contradict the canvas
Two divergent palettes existed (`story_view.js colorByType` vs `preprocessor.py _TYPE_COLOR_MAP`), so legend swatches didn't match drawn nodes. Type swatches now sample the color actually drawn on a node of that type, so they are consistent by construction.
5. Honest stats and controls
- `1 node sets` → `0 node sets`, with correct pluralization throughout (`1 pipeline`).
- `5 types` → `5 node kinds` + tooltip (it visibly contradicted the "Types (49)" column header).
Testing
- `node --check` on edited JS; pre-commit clean.
Source
Six-persona Playwright-driven review (46 raw observations → 22 verified findings, 0 rejected). Remaining medium-effort items (entity-column clustering, schema initial fit, panel pinning, ops rail overlay) are documented for follow-up.
🤖 Generated with Claude Code