
feat(scan): add OWASP AST05 agent identity integrity check #77

Open
rhiannalitchfield wants to merge 2 commits into pegasi-ai:main from rhiannalitchfield:feat/agent-identity-integrity-check

Conversation

@rhiannalitchfield

Purpose

Implements the check requested in #60. Agent identity files (SOUL.md, MEMORY.md, IDENTITY.md) are a known vector for memory poisoning attacks (OWASP AST05). This adds a scan check that detects unauthorized modifications to those files.

What Changed

  • Added AGENT_IDENTITY_INTEGRITY check to SecurityScanner.ts following the existing check pattern
  • On first scan: SHA-256 hashes of any found identity files are saved to ~/.openclaw/reins/agent-identity-hashes.json as a baseline; returns WARN
  • On subsequent scans: compares current hashes against baseline; returns FAIL if any file was modified or a new identity file appeared unexpectedly
  • Searches in ~/.openclaw/ and the current working directory
  • Fixed a pre-existing package.json syntax error where a winston key was accidentally nested inside the keywords array, preventing npm install from running
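
The baseline-then-compare flow in the list above can be sketched as follows. This is a minimal illustration, not the code in SecurityScanner.ts: the names `hashIdentityFiles` and `scanIdentityFiles` are hypothetical, the search directories and baseline path are passed in as parameters, and plain `node:fs` calls stand in for whatever file helpers the project uses. It covers only the three outcomes listed above and does not handle a corrupted baseline file.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

type CheckStatus = "PASS" | "WARN" | "FAIL";
type HashMap = Record<string, string>;

const IDENTITY_FILES = ["SOUL.md", "MEMORY.md", "IDENTITY.md"];

// Hypothetical helper: SHA-256 every identity file found in the given
// directories, keyed by absolute path. Missing or unreadable files are skipped.
function hashIdentityFiles(searchDirs: string[]): HashMap {
  const hashes: HashMap = {};
  for (const dir of searchDirs) {
    for (const name of IDENTITY_FILES) {
      const filePath = path.resolve(dir, name);
      try {
        const contents = fs.readFileSync(filePath);
        hashes[filePath] = createHash("sha256").update(contents).digest("hex");
      } catch {
        continue; // file absent or unreadable
      }
    }
  }
  return hashes;
}

// Hypothetical stand-in for the check: PASS when nothing is found, WARN when
// the baseline is first written, FAIL on a new or modified file.
function scanIdentityFiles(searchDirs: string[], baselinePath: string): CheckStatus {
  const current = hashIdentityFiles(searchDirs);
  if (Object.keys(current).length === 0) return "PASS";

  if (!fs.existsSync(baselinePath)) {
    fs.mkdirSync(path.dirname(baselinePath), { recursive: true });
    fs.writeFileSync(baselinePath, JSON.stringify(current, null, 2));
    return "WARN";
  }

  const baseline: HashMap = JSON.parse(fs.readFileSync(baselinePath, "utf8"));
  // A file missing from the baseline has an undefined hash, so it also counts as changed.
  const changed = Object.entries(current).some(([file, hash]) => baseline[file] !== hash);
  return changed ? "FAIL" : "PASS";
}
```

Calling `scanIdentityFiles([openclawHome, process.cwd()], baselinePath)` with the two locations named above would mirror the described search order; both argument names are placeholders.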

Testing

  1. Create a SOUL.md or MEMORY.md in ~/.openclaw/
  2. Run reins scan - should see WARN with "baseline created"
  3. Modify the file
  4. Run reins scan again - should see FAIL with the filename flagged
  5. Run reins scan with no identity files present - should see PASS

Closes #60

Add AGENT_IDENTITY_INTEGRITY scan check that detects unauthorized
modifications to agent identity files (SOUL.md, MEMORY.md, IDENTITY.md).

On first scan, SHA-256 hashes of any found identity files are saved to
~/.openclaw/reins/agent-identity-hashes.json as a baseline. Subsequent
scans compare current hashes against the baseline and FAIL if any file
has changed or a new identity file has appeared unexpectedly.

Also fix pre-existing package.json syntax error where a winston
dependency key was accidentally nested inside the keywords array.

Closes pegasi-ai#60
@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Contributor

Greptile Summary

This PR adds the AGENT_IDENTITY_INTEGRITY scan check to detect memory-poisoning attacks on agent identity files (SOUL.md, MEMORY.md, IDENTITY.md) by SHA-256 hashing them on first run and comparing on subsequent runs, and also fixes a pre-existing package.json syntax error.

  • New check (checkAgentIdentityIntegrity): Searches ~/.openclaw/ and cwd for identity files, creates a hash baseline on first scan (WARN), and reports FAIL on any hash mismatch or new unexpected file on later scans.
  • package.json fix: Removes the malformed "winston": "^3.11.0" entry that was erroneously embedded inside the keywords array, restoring valid JSON so npm install works.
  • Logic gap: Deleted identity files are not detected — the tamper loop only iterates currently-present files, so removing a tracked file after baseline silently returns PASS.

Confidence Score: 3/5

The new integrity check can be bypassed by deleting a tracked identity file, which is the exact threat this check is meant to catch.

The tamper-detection loop only iterates files that currently exist on disk; any file recorded in the baseline that is subsequently deleted simply disappears from the comparison and the check returns PASS. For a security scan whose purpose is to catch unauthorized modifications — including deletion — to agent identity files, this is a meaningful gap in the core logic that should be closed before the feature is shipped.

src/core/SecurityScanner.ts — specifically the tamper-comparison loop in checkAgentIdentityIntegrity (lines 689–705)

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/core/SecurityScanner.ts | Adds AGENT_IDENTITY_INTEGRITY check that baselines SHA-256 hashes of identity files; has a logic gap where deleted baseline files are not flagged, and an inconsistent write path for the baseline JSON. |
| package.json | Fixes a pre-existing syntax error where a winston dependency was accidentally nested as a string inside the keywords array; no functional concerns. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[checkAgentIdentityIntegrity] --> B{Search openclawHome + cwd\nfor SOUL.md / MEMORY.md / IDENTITY.md}
    B -- none found --> C[PASS: no identity files]
    B -- files found --> D[Hash each file with SHA-256\nsilently skip unreadable files]
    D --> E{Baseline JSON exists\nat identityHashesPath?}
    E -- no / corrupted --> F[ensureDir via this.identityHashesPath\nwriteJson via getPreferredDataPath ⚠️]
    F --> G[WARN: baseline created]
    E -- yes --> H{Compare currentHashes\nvs baseline entries}
    H -- new file or hash mismatch --> I[FAIL: unexpected modification]
    H -- all match --> J[PASS: N files match baseline]
    H -. deleted file in baseline\nnot in currentHashes .-> K[⚠️ silently ignored → PASS]
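
The "no / corrupted" branch in the flowchart implies that a baseline file which cannot be parsed is treated the same as a missing one. A minimal sketch of such a reader is below; `readBaseline` is a hypothetical name, and the shape validation (a flat object of string hashes) is an assumption based on the baseline being a path-to-hash map.

```typescript
import * as fs from "node:fs";

type HashMap = Record<string, string>;

// Hypothetical helper: returns the parsed baseline, or null when the file is
// missing, is not valid JSON, or is not a flat object of string values.
function readBaseline(baselinePath: string): HashMap | null {
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(baselinePath, "utf8"));
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return null;
    }
    for (const value of Object.values(parsed)) {
      if (typeof value !== "string") return null;
    }
    return parsed as HashMap;
  } catch {
    return null; // missing file or malformed JSON
  }
}
```

A caller would take the baseline-creation path (WARN) whenever this returns null and the comparison path otherwise.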

Reviews (1): Last reviewed commit: "feat(scan): add OWASP AST05 agent identi..." | Re-trigger Greptile

Comment on lines +689 to +705
const tampered: string[] = [];
for (const [filePath, currentHash] of Object.entries(currentHashes)) {
  const baselineHash = baseline[filePath];
  if (!baselineHash) {
    tampered.push(`${path.basename(filePath)} (new)`);
  } else if (baselineHash !== currentHash) {
    tampered.push(path.basename(filePath));
  }
}

if (tampered.length > 0) {
  return this.fail(
    'AGENT_IDENTITY_INTEGRITY',
    `unexpected modification to identity file(s): ${tampered.join(', ')}`,
    REMEDIATIONS.agentIdentity
  );
}

P1 Deleted identity files are never flagged as tampered

The tamper loop iterates Object.entries(currentHashes), which only contains files that currently exist on disk. A file that was tracked in the baseline but has since been deleted (or had its permissions changed so readFile throws) will simply be absent from currentHashes and never compared. Concretely: if SOUL.md is the only file in the baseline and is then deleted, foundFiles will be empty and the method returns PASS ("no agent identity files found"), so the check silently reports clean even though a tracked identity file has disappeared. If other identity files remain, the loop runs but never visits the deleted entry, and the result is still PASS. To close this gap the loop should also walk Object.keys(baseline) and flag any key not present in currentHashes.
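
One way to express the suggested second pass is shown below as a standalone function. This is a sketch, not the PR's code: `findTamperedFiles` is a hypothetical name, and the "(deleted)" suffix is taken from the follow-up commit message rather than from the implementation itself.

```typescript
import * as path from "node:path";

type HashMap = Record<string, string>;

// Hypothetical helper: compares current hashes against the baseline in both
// directions and returns a label for every discrepancy.
function findTamperedFiles(baseline: HashMap, currentHashes: HashMap): string[] {
  const tampered: string[] = [];

  // Pass 1: files on disk that are new or whose hash differs from the baseline.
  for (const [filePath, currentHash] of Object.entries(currentHashes)) {
    const baselineHash = baseline[filePath];
    if (!baselineHash) {
      tampered.push(`${path.basename(filePath)} (new)`);
    } else if (baselineHash !== currentHash) {
      tampered.push(path.basename(filePath));
    }
  }

  // Pass 2: files recorded in the baseline that are no longer present or readable.
  for (const filePath of Object.keys(baseline)) {
    if (!(filePath in currentHashes)) {
      tampered.push(`${path.basename(filePath)} (deleted)`);
    }
  }

  return tampered;
}
```

For this to catch the single-file case described above, the comparison would presumably have to run before (or instead of) the early "no agent identity files found" return whenever a baseline already exists.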

Comment thread src/core/SecurityScanner.ts Outdated
Comment on lines +680 to +681
await fs.ensureDir(path.dirname(this.identityHashesPath));
await fs.writeJson(getPreferredDataPath('agent-identity-hashes.json'), currentHashes, { spaces: 2 });

P2 Baseline write uses a different path expression than the read

ensureDir is called on path.dirname(this.identityHashesPath) but writeJson targets getPreferredDataPath(...) directly. These resolve to the same path when the file doesn't yet exist (the only moment this branch runs), but it's fragile: the directory is created for one path and the file is written to a separately-computed one. Using this.identityHashesPath for the write makes the intent explicit and removes the implicit coupling.

Suggested change
- await fs.ensureDir(path.dirname(this.identityHashesPath));
- await fs.writeJson(getPreferredDataPath('agent-identity-hashes.json'), currentHashes, { spaces: 2 });
+ await fs.ensureDir(path.dirname(this.identityHashesPath));
+ await fs.writeJson(this.identityHashesPath, currentHashes, { spaces: 2 });

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Address two issues flagged in code review:

- Deleted identity files were silently ignored because the tamper loop
  only iterated currently-present files. Add a second loop over baseline
  keys to catch files that existed at baseline time but are now absent,
  flagging them as (deleted).

- Baseline was written via getPreferredDataPath() directly instead of
  this.identityHashesPath. Use the instance field consistently.

Also remove the unused getPreferredDataPath import.

Development

Successfully merging this pull request may close these issues.

Add OWASP AST05: SOUL.md / MEMORY.md integrity check

1 participant