
Make Rescan non-destructive — it silently wipes usage history#138

Closed
OtoGodfrey wants to merge 1 commit into
phuryn:mainfrom
OtoGodfrey:fix/remove-destructive-rescan

Conversation

@OtoGodfrey

Contributor

The problem

The Rescan button (and its POST /api/rescan handler) does a full rebuild: it unlink()s usage.db and re-scans from the JSONL transcripts on disk.

That is silent, unrecoverable data loss. Claude Code prunes old session transcripts on a rolling basis (cleanupPeriodDays), so on any given day ~/.claude/projects only holds roughly the last few weeks. usage.db is append-only, which makes it the only place older history survives. Deleting it and rebuilding from disk therefore permanently discards every day older than the current on-disk retention window — with no warning and no backup. The tooltip actively invites the click ("Use if data looks stale or costs seem wrong").

Repro

  1. Run the tool for a while so usage.db accumulates history (e.g. a few months).
  2. Claude Code, meanwhile, deletes transcripts older than its retention window.
  3. Click Rescan.
  4. All history older than what's still on disk is gone. In my case a one-month-old install with data back to mid-February was reduced to the last ~30 days the instant Rescan ran.

The fix

  • Remove the Rescan button, its click handler, and its CSS.
  • Keep POST /api/rescan, but make it append-only: it now calls scanner.scan() without unlinking the DB. scanner.scan() already tracks processed files and only inserts new turns, so a refresh updates the DB without ever destroying history.
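For illustration, a minimal sketch of what an append-only scan looks like. This is not the project's actual `scanner.scan()`: the `turns` table, its columns, and the JSONL record shape are assumptions made up for the example, and deduplication here relies on a `message_id` primary key rather than on per-file tracking.

```python
import json
import sqlite3
from pathlib import Path


def scan(db_path: Path, projects_dir: Path) -> int:
    """Hypothetical incremental scan: insert only turns not yet in the DB.

    Returns the number of newly inserted rows. The DB file is never deleted,
    so rows whose source transcripts have been pruned stay where they are.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema; the real usage.db layout may differ.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS turns ("
            "message_id TEXT PRIMARY KEY, day TEXT, tokens INTEGER)"
        )
        before = conn.total_changes
        for jsonl in sorted(projects_dir.glob("**/*.jsonl")):
            with jsonl.open(encoding="utf-8") as fh:
                for line in fh:
                    line = line.strip()
                    if not line:
                        continue
                    rec = json.loads(line)
                    # INSERT OR IGNORE skips rows whose message_id already exists.
                    conn.execute(
                        "INSERT OR IGNORE INTO turns VALUES (?, ?, ?)",
                        (rec["message_id"], rec["day"], rec["tokens"]),
                    )
        conn.commit()
        return conn.total_changes - before
    finally:
        conn.close()
```

Calling `scan()` twice in a row on unchanged transcripts adds nothing the second time, and deleting a transcript between calls leaves its rows in the DB.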

Net diff is small (+6 / −23). All 103 existing tests pass (including test_api_rescan_returns_json).

Note

I kept /api/rescan as a safe incremental refresh rather than removing the endpoint outright, since it's the more conservative change. If you'd rather drop the endpoint entirely (the button was its only caller), or instead keep a destructive rebuild but back the DB up first, happy to adjust.
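The "back the DB up first" alternative could be sketched roughly as below, using the standard library's SQLite online backup API. The helper name `backup_db` and the timestamped file naming are hypothetical; nothing like this exists in the PR.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def backup_db(db_path: Path) -> Path:
    """Copy db_path to a timestamped sibling file and return the copy's path.

    Hypothetical helper: a destructive rebuild would call this before
    unlinking the original database.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    backup_path = db_path.with_name(f"{db_path.stem}.{stamp}.bak{db_path.suffix}")
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies the whole database into the target connection.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return backup_path
```

A backup keeps the old rows recoverable, but they would still be missing from the rebuilt DB, which is why the append-only endpoint is the more conservative option.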

🤖 Generated with Claude Code

The "Rescan" button (and the POST /api/rescan handler) did a full rebuild:
it `unlink()`ed usage.db and re-scanned from the JSONL transcripts on disk.

That is data-loss by design. Claude Code prunes old session transcripts on
a rolling basis (cleanupPeriodDays), so on any given day the projects dir
only holds roughly the last few weeks. usage.db, being append-only, is the
*only* place older history survives. Deleting it and rebuilding from disk
therefore permanently discards every day older than the current on-disk
retention window — with no warning and no backup. A user who clicks "Rescan"
once (the tooltip even suggests it "if costs seem wrong") loses months of
history instantly.

This happened to me: a one-month-old install with history back to mid-Feb
was reduced to the last ~30 days the moment Rescan ran.

Fix:
- Remove the Rescan button, its click handler, and its CSS.
- Keep POST /api/rescan but make it append-only: it now calls
  scanner.scan() WITHOUT unlinking the DB. scanner.scan() tracks processed
  files and only adds new turns, so a refresh never destroys history.

All 103 existing tests pass.
phuryn added a commit that referenced this pull request Jun 15, 2026
The Rescan button's POST /api/rescan handler unlink()ed usage.db and
rebuilt from the JSONL transcripts on disk. That is data-loss by design:
Claude Code prunes old transcripts on a rolling basis (cleanupPeriodDays),
and usage.db — being append-only — is the only durable record of history
older than the on-disk retention window. Clicking Rescan once wiped every
day older than that window, with no warning and no backup.

Fix: /api/rescan now runs an incremental scanner.scan() WITHOUT unlinking
the DB (scan dedupes via the message_id index and only adds new turns, so
history is preserved). Unlike PR #138 we keep the button — it's the only
in-session way to ingest new turns, since auto-refresh only re-reads
/api/data. Reworded its tooltip to describe the additive behaviour.

Tests:
- test_api_rescan_is_non_destructive: seeds history with no on-disk JSONL,
  posts /api/rescan, asserts the rows survive (fails under the old wipe).
- test_fable_and_mythos_have_explicit_entries / _date_suffix / substring:
  lock in Fable/Mythos pricing (#136/#137 had no regression guard).

Co-Authored-By: OtoGodfrey <OtoGodfrey@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@phuryn

phuryn commented Jun 15, 2026

Owner

Thank you for catching this, @OtoGodfrey — and for the detailed write-up. You're exactly right: usage.db is append-only and the only durable record once Claude Code prunes transcripts past cleanupPeriodDays, so the wipe-and-rebuild in /api/rescan was silent data loss. Sorry it cost you history.

I've landed the core of your fix on DEV (97e1d91), with credit to you via a Co-Authored-By trailer. One deliberate divergence: I kept the button rather than removing it. Auto-refresh only re-reads /api/data and never scans, so the button is the only in-session way to ingest new turns — without it you'd have to restart the server to see fresh usage. Instead I made the endpoint itself safe (incremental scanner.scan(), no unlink()) and reworded the tooltip to describe the additive behaviour, so the button keeps its useful job while losing its destructive one.

Also added a regression test (test_api_rescan_is_non_destructive) that seeds history with no on-disk JSONL and asserts it survives a rescan — it fails against the old code, so this won't regress.
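The shape of such a regression check can be sketched as follows. This is not the code of `test_api_rescan_is_non_destructive`: it calls stand-in functions directly instead of posting to `/api/rescan`, and the `turns` schema and all function names are assumptions for the example.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

SCHEMA = "CREATE TABLE IF NOT EXISTS turns (message_id TEXT PRIMARY KEY, day TEXT)"


def rescan_incremental(db_path: Path, projects_dir: Path) -> None:
    """Stand-in for the fixed handler: add new turns, never delete the DB."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    for jsonl in projects_dir.glob("**/*.jsonl"):
        for line in jsonl.read_text(encoding="utf-8").splitlines():
            if line.strip():
                rec = json.loads(line)
                conn.execute(
                    "INSERT OR IGNORE INTO turns VALUES (?, ?)",
                    (rec["message_id"], rec["day"]),
                )
    conn.commit()
    conn.close()


def rescan_destructive(db_path: Path, projects_dir: Path) -> None:
    """Stand-in for the old handler: unlink the DB, then rebuild from disk."""
    db_path.unlink(missing_ok=True)
    rescan_incremental(db_path, projects_dir)


def history_survives(rescan) -> bool:
    """Seed three days of history with no JSONL on disk, rescan, check rows."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "usage.db"
        projects_dir = Path(tmp) / "projects"
        projects_dir.mkdir()  # deliberately empty: transcripts already pruned
        conn = sqlite3.connect(db_path)
        conn.execute(SCHEMA)
        conn.executemany(
            "INSERT INTO turns VALUES (?, ?)",
            [("m1", "2026-02-15"), ("m2", "2026-02-16"), ("m3", "2026-02-17")],
        )
        conn.commit()
        conn.close()
        rescan(db_path, projects_dir)
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT COUNT(*) FROM turns").fetchone()[0]
        conn.close()
        return rows == 3
```

`history_survives(rescan_incremental)` holds, while `history_survives(rescan_destructive)` does not, which mirrors the property that the test fails against the old wipe-and-rebuild behaviour.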

Ships in v1.2.6. Since the safety fix is now on DEV (in slightly modified form), I'll let the maintainer decide whether to close this — but the substance of your report is fixed and credited. 🙏

@phuryn

phuryn commented Jun 15, 2026

Owner

Landed in v1.2.6 (just released) — the safety fix is on main and shipped, credited to you via a Co-Authored-By trailer on 97e1d91. As noted above, the only divergence is that we kept the Rescan button and made the endpoint itself non-destructive, rather than removing the button. Closing since the substance of your report is fixed. Thanks again for catching this, @OtoGodfrey! 🙏

@phuryn phuryn closed this Jun 15, 2026

Labels

None yet

Projects

None yet


2 participants