integrate hashvault #3602

Open
cody-littley wants to merge 10 commits into main from cjl/hashvault-integration
@cody-littley

Contributor

Describe your changes and provide context

Wire in the hash vault. This prevents the app hash from changing for a particular block without human intervention.
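
To make the guarantee concrete, here is a minimal in-memory sketch of the behavior described above: the first hash recorded for a height wins, replaying the same hash is accepted, and a different hash for the same height is rejected. The names `CommitToHash` and `ErrHashMismatch` appear in this PR's snippets, but the signature, the `memVault` type, and the replay semantics shown here are assumptions for illustration, not the PR's Pebble-backed implementation.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sync"
)

// ErrHashMismatch stands in for hashvault.ErrHashMismatch (assumed shape).
var ErrHashMismatch = errors.New("hash mismatch")

// memVault is a hypothetical in-memory vault used only for illustration.
type memVault struct {
	mu     sync.Mutex
	hashes map[int64][]byte
}

func newMemVault() *memVault {
	return &memVault{hashes: make(map[int64][]byte)}
}

// CommitToHash records hash for height the first time the height is seen.
// Re-committing the same hash succeeds; a different hash is a mismatch.
func (v *memVault) CommitToHash(height int64, hash []byte) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	if prev, ok := v.hashes[height]; ok {
		if !bytes.Equal(prev, hash) {
			return fmt.Errorf("height %d: vaulted %X, got %X: %w",
				height, prev, hash, ErrHashMismatch)
		}
		return nil
	}
	// Copy the slice so later mutation by the caller cannot alter the record.
	v.hashes[height] = append([]byte(nil), hash...)
	return nil
}

func main() {
	v := newMemVault()
	fmt.Println(v.CommitToHash(10, []byte{0xAA})) // first commit for height 10
	fmt.Println(v.CommitToHash(10, []byte{0xAA})) // replay after a restart
	err := v.CommitToHash(10, []byte{0xBB})       // conflicting hash
	fmt.Println(errors.Is(err, ErrHashMismatch))
}
```

A persistent vault would need to survive restarts, which is the case the check exists for; the map here only demonstrates the comparison logic.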

@github-actions

github-actions Bot commented Jun 16, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Jun 23, 2026, 8:04 PM |

@codecov

codecov Bot commented Jun 16, 2026

Codecov Report

❌ Patch coverage is 40.84507% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.20%. Comparing base (c1eab8e) to head (9191743).
⚠️ Report is 18 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sei-tendermint/internal/p2p/giga_router.go | 26.66% | 28 Missing and 5 partials ⚠️ |
| sei-db/state_db/sc/hashvault/noop_hashvault.go | 0.00% | 8 Missing ⚠️ |
| sei-tendermint/node/setup.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3602      +/-   ##
==========================================
- Coverage   59.02%   58.20%   -0.82%     
==========================================
  Files        2215     2146      -69     
  Lines      182530   174529    -8001     
==========================================
- Hits       107741   101590    -6151     
+ Misses      65092    63900    -1192     
+ Partials     9697     9039     -658     

| Flag | Coverage Δ |
| --- | --- |
| sei-chain-pr | 71.36% <40.35%> (?) |
| sei-db | 70.41% <ø> (ø) |
| sei-db-state-db | ? |
| sei-db-state-db-pr | 79.92% <42.85%> (?) |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| sei-db/state_db/sc/hashvault/pebble_hashvault.go | 82.64% <100.00%> (+0.74%) ⬆️ |
| sei-tendermint/config/config.go | 76.09% <100.00%> (+2.58%) ⬆️ |
| sei-tendermint/config/toml.go | 62.33% <ø> (+7.33%) ⬆️ |
| sei-tendermint/node/setup.go | 68.91% <0.00%> (-0.23%) ⬇️ |
| sei-db/state_db/sc/hashvault/noop_hashvault.go | 0.00% <0.00%> (ø) |
| sei-tendermint/internal/p2p/giga_router.go | 60.25% <26.66%> (-7.80%) ⬇️ |

... and 74 files with indirect coverage changes

@cody-littley cody-littley marked this pull request as ready for review June 17, 2026 13:18
@cody-littley cody-littley requested a review from wen-coding June 17, 2026 13:19
@cursor

cursor Bot commented Jun 17, 2026

PR Summary

High Risk
Changes validator safety on the block execution path: misconfiguration or disabling the vault removes equivocation protection and could enable slashing; hash mismatch halts the node by design.

Overview
Autobahn GigaRouter now owns a HashVault that records each height’s app hash before app.Commit() and before app hashes are pushed for voting, and panics on hash mismatch or vault I/O failure (shutdown cancellation is handled without panic).

The vault is Pebble-backed under PersistentStateDir/hashvault by default; in-memory Autobahn or hash-vault-disabled-unsafe=true use a new NoopHashVault with loud unsafe warnings. Vault pruning follows the app’s RetainHeight like the data layer.

Operators get hash-vault-disabled-unsafe in Tendermint config and generated config.toml. Pebble hash-mismatch logs now include the vault data directory and recovery/slashing guidance.
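
The selection and pruning rules in this overview could look roughly like the sketch below. Everything here is illustrative: the `HashVault` interface shape, `selectVault`, and the map-backed stand-in for the Pebble vault are assumptions, and pruning is assumed to drop every height strictly below the retain height.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
)

// HashVault is an assumed minimal interface, not the PR's actual definition.
type HashVault interface {
	CommitToHash(height int64, hash []byte) error
	Prune(retainHeight int64) error
}

// noopVault accepts everything and records nothing (no equivocation guard).
type noopVault struct{}

func (noopVault) CommitToHash(int64, []byte) error { return nil }
func (noopVault) Prune(int64) error                { return nil }

// mapVault is a hypothetical stand-in for the persistent, Pebble-backed vault.
type mapVault struct{ hashes map[int64][]byte }

func (v *mapVault) CommitToHash(height int64, hash []byte) error {
	if prev, ok := v.hashes[height]; ok && !bytes.Equal(prev, hash) {
		return errors.New("hash mismatch")
	}
	v.hashes[height] = append([]byte(nil), hash...)
	return nil
}

// Prune removes every entry below retainHeight, keeping retainHeight itself.
func (v *mapVault) Prune(retainHeight int64) error {
	for h := range v.hashes {
		if h < retainHeight {
			delete(v.hashes, h)
		}
	}
	return nil
}

// selectVault mirrors the described rule: in-memory mode or the unsafe
// config flag yields the no-op vault, with a loud warning.
func selectVault(inMemory, disabledUnsafe bool) HashVault {
	if inMemory || disabledUnsafe {
		log.Println("WARNING: hash vault disabled; no equivocation protection")
		return noopVault{}
	}
	return &mapVault{hashes: make(map[int64][]byte)}
}

func main() {
	v := selectVault(false, false)
	for h := int64(1); h <= 5; h++ {
		_ = v.CommitToHash(h, []byte{byte(h)})
	}
	_ = v.Prune(4)
	fmt.Println(len(v.(*mapVault).hashes)) // prints 2 (heights 4 and 5 remain)
}
```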

Reviewed by Cursor Bugbot for commit 9191743. Bugbot is set up for automated code reviews on this repo.

Comment thread sei-tendermint/internal/consensus/replay.go Outdated
Comment thread sei-tendermint/internal/state/execution.go Outdated

// Commit this block's hash to the equivocation guard before saving state. See commitHashToVault.
// A returned error is a benign shutdown cancellation; genuine faults panic inside the call.
if err := commitHashToVault(ctx, blockExec.hashVault, block.Height, block.Hash()); err != nil {

Contributor

I think the intent is to vault AppHash? block.Hash() is the Tendermint block header hash.

I think you should do:

if err := commitHashToVault(ctx, blockExec.hashVault, block.Height, state.AppHash); err != nil {
    return state, err
}

instead.

Contributor Author

Change made.

Contributor Author

Side note: this file is now reverted. We decided not to integrate the hashvault into the Cosmos pathways.

h.eventBus,
sm.NopMetrics(),
h.consensusPolicy,
h.hashVault)

Contributor

This only covers the non-Autobahn part. Is the intent to do the Autobahn part later? I think you would need to:

  1. In the GigaRouter struct, add the vault field:
type GigaRouter struct {
    // ... existing fields ...
    hashVault hashvault.HashVault
}
  2. In NewGigaRouter, accept it as a parameter and set the field.

  3. In executeBlock, vault resp.AppHash (the FinalizeBlock output, not the Autobahn header hash) right after FinalizeBlock succeeds, before Commit:

resp, err := app.FinalizeBlock(ctx, ...)
if err != nil { ... }

// Seal the computed AppHash before commit; same equivocation guard as the Cosmos path.
if err := commitHashToVault(ctx, r.hashVault, int64(b.GlobalNumber), resp.AppHash); err != nil {
    return nil, err
}

commitResp, err := app.Commit(ctx)

  4. In buildGigaRouter in setup.go, thread the vault from makeNode through to NewGigaRouter.
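
The ordering in the steps above (finalize, then vault, then commit) can be exercised in isolation with a fake application that records its calls. This is a sketch under stated assumptions: `fakeApp`, `vaultCommit`, `errVaultDown`, and this simplified `executeBlock` are hypothetical and do not match the real ABCI or GigaRouter signatures. It returns the vault error to the caller, as in the suggestion above, rather than panicking.

```go
package main

import (
	"errors"
	"fmt"
)

// errVaultDown simulates a vault failure (hypothetical, for the demo only).
var errVaultDown = errors.New("vault unavailable")

// fakeApp records the order in which its methods are called.
type fakeApp struct{ calls []string }

func (a *fakeApp) FinalizeBlock(height int64) ([]byte, error) {
	a.calls = append(a.calls, "finalize")
	return []byte{byte(height)}, nil // pretend this is the computed app hash
}

func (a *fakeApp) Commit() error {
	a.calls = append(a.calls, "commit")
	return nil
}

// vaultCommit abstracts "record this app hash for this height".
type vaultCommit func(height int64, appHash []byte) error

// executeBlock vaults the app hash after FinalizeBlock and before Commit,
// so a vault failure prevents the state from being committed.
func executeBlock(app *fakeApp, commitToVault vaultCommit, height int64) error {
	appHash, err := app.FinalizeBlock(height)
	if err != nil {
		return fmt.Errorf("finalize block %d: %w", height, err)
	}
	if err := commitToVault(height, appHash); err != nil {
		return fmt.Errorf("vault app hash at height %d: %w", height, err)
	}
	return app.Commit()
}

func main() {
	app := &fakeApp{}
	failing := func(int64, []byte) error { return errVaultDown }
	err := executeBlock(app, failing, 7)
	fmt.Println(err)
	fmt.Println(app.calls) // Commit is never reached when the vault fails
}
```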

Contributor Author

Good point, I think I did this backwards. I've reverted the Cosmos changes and re-integrated with Autobahn.

return fmt.Errorf("hashvault CommitToHash aborted at height %d: %w", height, err)
}
if errors.Is(err, hashvault.ErrHashMismatch) {
logger.Error("FATAL: HashVault detected a block-hash mismatch — the node has equivocated. "+

Contributor

Chatted with Greg today: since re-execution of a block should only happen during restart, it's fine to just panic here. I think you can print the previous and current hashes, then hint that if the operator is really, really sure, they can remove the directory to proceed, but warn that this may lead to slashing.

Contributor Author

Code now logs both hashes, as well as instructions about how to bypass the problem.

"Halting. DO NOT RESTART WITHOUT HUMAN INTERVENTION.",
"height", height, "hash", fmt.Sprintf("%X", hash), "err", err)
} else {
logger.Error("FATAL: HashVault could not commit the block hash (operational error, not a "+

Contributor

Could a transient I/O error land here?

What's the behavior if the on-disk file is corrupted? In that case, maybe we should just ask a human to remove it, or just proceed? I just don't want one corrupted bit on disk to take down the whole validator. What do you think about logging an error and proceeding, but not overwriting the corrupted entry?

Contributor Author

Yes, and this is by design. If we can't read and write to the hashvault, we have no idea if the hash we are providing will cause slashing. The only way we can be 100% confident is if the hash vault is able to read/write the file system and we find no conflict in the data.

The challenge with random file system errors is that there isn't a clear way to recover from them automatically. If the files are corrupted, the correct solution is probably going to be for a human operator to delete the hashvault archive. But if the error is from a full disk or bad file permissions, the recovery pathway will look very different.

Let me know if you'd like to discuss.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 5429566.

Comment thread sei-tendermint/internal/state/execution.go Outdated