Make Rescan non-destructive — it silently wipes usage history#138
OtoGodfrey wants to merge 1 commit into
The "Rescan" button (and the POST /api/rescan handler) did a full rebuild: it `unlink()`ed usage.db and re-scanned from the JSONL transcripts on disk.

That is data-loss by design. Claude Code prunes old session transcripts on a rolling basis (cleanupPeriodDays), so on any given day the projects dir only holds roughly the last few weeks. usage.db, being append-only, is the *only* place older history survives. Deleting it and rebuilding from disk therefore permanently discards every day older than the current on-disk retention window, with no warning and no backup. A user who clicks "Rescan" once (the tooltip even suggests it "if costs seem wrong") loses months of history instantly.

This happened to me: a one-month-old install with history back to mid-Feb was reduced to the last ~30 days the moment Rescan ran.

Fix:
- Remove the Rescan button, its click handler, and its CSS.
- Keep POST /api/rescan but make it append-only: it now calls scanner.scan() WITHOUT unlinking the DB. scanner.scan() tracks processed files and only adds new turns, so a refresh never destroys history.

All 103 existing tests pass.
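To make the difference between the two behaviours concrete, here is a minimal sketch of a destructive rebuild next to an append-only scan that tracks processed files. The schema, the `processed_files` table, the function names, and the sample dates are all assumptions for illustration; the real `scanner.scan()` and `usage.db` layout are not shown in this PR.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS turns (session TEXT, day TEXT, tokens INTEGER);
CREATE TABLE IF NOT EXISTS processed_files (path TEXT PRIMARY KEY, lines_read INTEGER);
"""

def scan(db_path: Path, projects_dir: Path) -> int:
    """Append-only scan: ingest only lines not yet recorded for each file."""
    added = 0
    conn = sqlite3.connect(str(db_path))
    try:
        conn.executescript(SCHEMA)
        for jsonl in sorted(projects_dir.glob("*.jsonl")):
            row = conn.execute(
                "SELECT lines_read FROM processed_files WHERE path = ?",
                (str(jsonl),),
            ).fetchone()
            already = row[0] if row else 0
            lines = jsonl.read_text().splitlines()
            for line in lines[already:]:
                turn = json.loads(line)
                conn.execute(
                    "INSERT INTO turns VALUES (?, ?, ?)",
                    (jsonl.stem, turn["day"], turn["tokens"]),
                )
                added += 1
            conn.execute(
                "INSERT OR REPLACE INTO processed_files VALUES (?, ?)",
                (str(jsonl), len(lines)),
            )
        conn.commit()
    finally:
        conn.close()
    return added

def destructive_rescan(db_path: Path, projects_dir: Path) -> int:
    """Old behaviour: delete the DB, then rebuild from what is still on disk."""
    db_path.unlink(missing_ok=True)
    return scan(db_path, projects_dir)

def days(db_path: Path) -> list:
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute("SELECT DISTINCT day FROM turns ORDER BY day")
        return [r[0] for r in rows]
    finally:
        conn.close()

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        projects = root / "projects"
        projects.mkdir()
        db = root / "usage.db"
        old = projects / "old.jsonl"
        old.write_text(json.dumps({"day": "2025-02-15", "tokens": 10}) + "\n")
        scan(db, projects)
        old.unlink()  # simulate the transcript being pruned from disk
        new = projects / "new.jsonl"
        new.write_text(json.dumps({"day": "2025-04-01", "tokens": 5}) + "\n")
        scan(db, projects)                # incremental: old day survives
        after_incremental = days(db)
        destructive_rescan(db, projects)  # rebuild: old day is gone
        return after_incremental, days(db)
```

In this sketch the pruned day remains in the database after the incremental scan and disappears only when the database is unlinked first, which is the failure mode described above.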
The Rescan button's POST /api/rescan handler unlink()ed usage.db and rebuilt from the JSONL transcripts on disk. That is data-loss by design: Claude Code prunes old transcripts on a rolling basis (cleanupPeriodDays), and usage.db, being append-only, is the only durable record of history older than the on-disk retention window. Clicking Rescan once wiped every day older than that window, with no warning and no backup.

Fix: /api/rescan now runs an incremental scanner.scan() WITHOUT unlinking the DB (scan dedupes via the message_id index and only adds new turns, so history is preserved).

Unlike PR #138 we keep the button: it's the only in-session way to ingest new turns, since auto-refresh only re-reads /api/data. Reworded its tooltip to describe the additive behaviour.

Tests:
- test_api_rescan_is_non_destructive: seeds history with no on-disk JSONL, posts /api/rescan, asserts the rows survive (fails under the old wipe).
- test_fable_and_mythos_have_explicit_entries / _date_suffix / substring: lock in Fable/Mythos pricing (#136/#137 had no regression guard).

Co-Authored-By: OtoGodfrey <OtoGodfrey@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
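The dedupe-by-`message_id` idea and the shape of the regression test can be sketched as follows. This is an illustrative assumption, not the project's code: the table layout, the unique index, the `INSERT OR IGNORE` strategy, and the helper names (`make_db`, `ingest`, `row_count`) are hypothetical, and the real test goes through the HTTP handler rather than calling an ingest function directly.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    # Hypothetical schema: one row per turn, keyed by message_id.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE turns (message_id TEXT, day TEXT, cost REAL)")
    conn.execute("CREATE UNIQUE INDEX idx_turns_message_id ON turns (message_id)")
    return conn

def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM turns").fetchone()[0]

def ingest(conn: sqlite3.Connection, turns) -> int:
    """Insert turns, skipping any message_id already stored. Returns rows added."""
    before = row_count(conn)
    conn.executemany("INSERT OR IGNORE INTO turns VALUES (?, ?, ?)", turns)
    conn.commit()
    return row_count(conn) - before

def test_rescan_is_non_destructive():
    conn = make_db()
    # Seed history whose transcript no longer exists on disk.
    ingest(conn, [("m1", "2025-02-15", 0.10)])
    # A rescan that finds nothing on disk must add nothing and remove nothing.
    assert ingest(conn, []) == 0
    assert row_count(conn) == 1
    # Re-reading an already ingested turn alongside a new one adds only the new one.
    assert ingest(conn, [("m1", "2025-02-15", 0.10), ("m2", "2025-04-01", 0.05)]) == 1
    assert row_count(conn) == 2
    conn.close()
```

A test of this shape fails under a wipe-and-rebuild implementation because the seeded row has no transcript to be rebuilt from.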
Thank you for catching this, @OtoGodfrey — and for the detailed write-up. You're exactly right: I've landed the core of your fix on Also added a regression test ( Ships in v1.2.6. Since the safety fix is now on |
Landed in v1.2.6 (just released) — the safety fix is on |
The problem
The Rescan button (and its `POST /api/rescan` handler) does a full rebuild: it `unlink()`s `usage.db` and re-scans from the JSONL transcripts on disk.

That is silent, unrecoverable data loss. Claude Code prunes old session transcripts on a rolling basis (`cleanupPeriodDays`), so on any given day `~/.claude/projects` only holds roughly the last few weeks. `usage.db` is append-only, which makes it the only place older history survives. Deleting it and rebuilding from disk therefore permanently discards every day older than the current on-disk retention window, with no warning and no backup. The tooltip actively invites the click ("Use if data looks stale or costs seem wrong").

Repro
`usage.db` accumulates history (e.g. a few months).

The fix
Keep `POST /api/rescan`, but make it append-only: it now calls `scanner.scan()` without unlinking the DB. `scanner.scan()` already tracks processed files and only inserts new turns, so a refresh updates the DB without ever destroying history.

Net diff is small (+6 / −23). All 103 existing tests pass (including `test_api_rescan_returns_json`).

Note
I kept `/api/rescan` as a safe incremental refresh rather than removing the endpoint outright, since it's the more conservative change. If you'd rather drop the endpoint entirely (the button was its only caller), or instead keep a destructive rebuild but back the DB up first, happy to adjust.

🤖 Generated with Claude Code
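For the alternative mentioned in the note (keep a destructive rebuild but back the DB up first), a minimal sketch might look like this. It is not part of this PR: `backup_db`, `rebuild_with_backup`, the `rebuild` callback, and the backup file naming are hypothetical. Only `sqlite3.Connection.backup` is a real standard-library call (Python 3.7+).

```python
import sqlite3
import time
from pathlib import Path

def backup_db(db_path: Path) -> Path:
    """Copy the live database to a timestamped sibling file."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = db_path.with_name(f"{db_path.name}.{stamp}.bak")
    src = sqlite3.connect(str(db_path))
    dst = sqlite3.connect(str(dest))
    try:
        src.backup(dst)  # consistent copy even if the DB is in use
    finally:
        dst.close()
        src.close()
    return dest

def rebuild_with_backup(db_path: Path, rebuild):
    """Back up, then delete and rebuild. `rebuild` is a caller-supplied function."""
    backup = backup_db(db_path) if db_path.exists() else None
    db_path.unlink(missing_ok=True)
    rebuild(db_path)
    return backup
```

The backup preserves the old history on disk, but the rebuilt database would still lack it, so the dashboard would show the truncated window unless the backup is restored or merged.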