integrate hashvault#3602
Conversation
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3602 +/- ##
==========================================
- Coverage 59.02% 58.20% -0.82%
==========================================
Files 2215 2146 -69
Lines 182530 174529 -8001
==========================================
- Hits 107741 101590 -6151
+ Misses 65092 63900 -1192
+ Partials 9697 9039 -658
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
PR SummaryHigh Risk Overview The vault is Pebble-backed under Operators get Reviewed by Cursor Bugbot for commit 9191743. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
|
||
| // Commit this block's hash to the equivocation guard before saving state. See commitHashToVault. | ||
| // A returned error is a benign shutdown cancellation; genuine faults panic inside the call. | ||
| if err := commitHashToVault(ctx, blockExec.hashVault, block.Height, block.Hash()); err != nil { |
There was a problem hiding this comment.
I think the intent is to vault AppHash? block.Hash() is the Tendermint block header hash.
I think you should do:
if err := commitHashToVault(ctx, blockExec.hashVault, block.Height, state.AppHash); err != nil {
return state, err
}
instead.
There was a problem hiding this comment.
Side note, this file is now reverted. We decided not to integrate the hashvault into cosmos pathways.
| h.eventBus, | ||
| sm.NopMetrics(), | ||
| h.consensusPolicy, | ||
| h.hashVault) |
There was a problem hiding this comment.
This only covers non-Autobahn part. Is the intent to do Autobahn part later? I think you would need to:
- GigaRouter struct — add the vault field:
type GigaRouter struct {
// ... existing fields ...
hashVault hashvault.HashVault
}
-
NewGigaRouter — accept it as a parameter and set the field.
-
executeBlock — vault
resp.AppHash(the FinalizeBlock output, not the Autobahn header hash) right after FinalizeBlock succeeds, before Commit:
resp, err := app.FinalizeBlock(ctx, ...)
if err != nil { ... }
// Seal the computed AppHash before commit — same equivocation guard as the Cosmos path.
if err := commitHashToVault(ctx, r.hashVault, int64(b.GlobalNumber), resp.AppHash); err != nil {
return nil, err
}
commitResp, err := app.Commit(ctx)
buildGigaRouterin setup.go — thread the vault from makeNode through to NewGigaRouter.
There was a problem hiding this comment.
Good point, I think I did this backwards. I've reverted the cosmos changes and re-integrated with autobanhn.
| return fmt.Errorf("hashvault CommitToHash aborted at height %d: %w", height, err) | ||
| } | ||
| if errors.Is(err, hashvault.ErrHashMismatch) { | ||
| logger.Error("FATAL: HashVault detected a block-hash mismatch — the node has equivocated. "+ |
There was a problem hiding this comment.
Chatted with Greg today, since re-execution of a block should only happen during restart, it's fine to just panic here. I think you can print previous and current hash, then hint if the human is really really sure, he can remove directory to proceed, but warn that this may lead to slashing etc
There was a problem hiding this comment.
Code now logs both hashes, as well as instructions about how to bypass the problem.
| "Halting. DO NOT RESTART WITHOUT HUMAN INTERVENTION.", | ||
| "height", height, "hash", fmt.Sprintf("%X", hash), "err", err) | ||
| } else { | ||
| logger.Error("FATAL: HashVault could not commit the block hash (operational error, not a "+ |
There was a problem hiding this comment.
Could a transient I/O error land here?
What's the behavior if the on-disk file is corrupted? If the on-disk file is corrupted maybe we should just ask human to remove or just proceed? I just don't want one corrupted bit on disk to take down the whole validator. What do you feel, log an error and proceed, but don't overwrite the corrupted entry?
There was a problem hiding this comment.
Yes, and this is by design. If we can't read and write to the hashvault, we have no idea if the hash we are providing will cause slashing. The only way we can be 100% confident is if the hash vault is able to read/write the file system and we find no conflict in the data.
The challenge with random file system errors is that there isn't a clear way to recover from them automatically. If the files are corrupted, the correct solution is probably going to be for a human operator to delete the hashvault archive. But if the error is from a full disk or bad file permissions, the recovery pathway will look very different.
Let me know if you'd like to discuss.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5429566. Configure here.

Describe your changes and provide context
Wire in the hash vault. This prevents the app hash from changing for a particular block without human intervention.