This appendix is an operational checklist. For full explanations, threat model analysis, and background, see 03-3 Security, Privacy, and Supply Chain Risk. This page keeps only the criteria and the “why bother” rationale for each section.
This topic involves risks of irreversible data exfiltration and system compromise. Validate all executable configuration changes in a throwaway environment before applying them to a real workflow. Read the relevant section of 03-3 before acting; do not just tick boxes.

1. Before you start (every new tool)

Why this matters: Once you hand data over, the defaults are usually not in your favor. Training opt-out, retention limits, and isolation must be confirmed before the first piece of data is sent. After that point, your only option is to wait for the retention period to expire naturally.
  • Locate and confirm the training opt-out switch (Claude: Help Improve Claude; ChatGPT: Improve the model for everyone; Gemini: Keep Activity).
  • Confirm the conversation retention period and auto-delete settings (Free / Pro / Team / Enterprise differ).
  • Know how to open a “Temporary Chat” for one-off sensitive content.
  • Understand the difference between Free plan and Team / Enterprise data terms (enterprise plans typically do not train on your data by default; personal plans typically do).
  • Record the current opt-out state of your account in CLAUDE.md or MEMORY.md to avoid confusion when switching accounts.
Criterion: Every item is checked and you can point to the actual UI location for each one. Being able to state the location is what “configured” means; not being able to state it means it has not been set.

2. Credentials and permissions

Why this matters: Once an agent has your PAT or can read ~/.aws/credentials, it becomes the last link in any prompt injection attack chain. A dedicated, least-privilege credential for the agent is the firewall between “attacker controls the agent” and “attacker controls your entire account.”
  • The agent uses a dedicated, minimum-scope credential (not your personal PAT or personal account).
  • The permission deny list covers secret-bearing paths (~/.ssh, ~/.aws, **/.env*, ~/.gnupg) and dangerous commands (curl * | bash, ssh *, wget * | bash).
  • Tool permissions are scoped to the task: MCP servers have the narrowest possible OAuth scope; Bash rules use ask rather than allow for unfamiliar commands.
  • .env is gitignored and .env.example exists (all keys present, values as placeholders).
  • The full set of permissions configured for your personal account is never carried over directly into automated loops or unattended sessions.
Criteria:
  • The agent PAT satisfies all three: agent-dedicated, short-lived, fine-grained scope.
  • For one dangerous command (curl … | bash), you can state exactly which deny rule blocks it and in which file.
  • All currently installed MCP servers and their OAuth scopes are listed in a local inventory reviewed quarterly.
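The .env / .env.example item above can be checked mechanically. The following is a minimal Python sketch rather than a definitive tool: the function names parse_env and check_env_example are illustrative, and it assumes simple KEY=value lines with no multi-line values.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; skip blanks and # comments."""
    pairs: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip().strip("'\"")
    return pairs


def check_env_example(env_text: str, example_text: str) -> List[str]:
    """Return a list of problems; an empty list means the pair looks consistent."""
    env, example = parse_env(env_text), parse_env(example_text)
    problems = []
    # Every key used in .env should be documented in .env.example.
    for key in sorted(env.keys() - example.keys()):
        problems.append(f"{key}: missing from .env.example")
    # A real secret copied into the example file is the failure to catch.
    for key in sorted(example.keys() & env.keys()):
        if env[key] and env[key] == example[key]:
            problems.append(f"{key}: .env.example holds the real value")
    return problems
```

Whether .env is actually ignored can be confirmed separately with git check-ignore -v .env, which prints the matching ignore rule when one exists.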
Minimum configuration for an agent-dedicated PAT
Account: agent@yourdomain.com (not personal)
Purpose: bot-mcp-only
Scope: "read repo metadata" and "read issue list" only
Expiry: 30-day auto-rotation
Binding: restricted to source IP (your workstation or deployment environment)
Three common mistakes: running the agent under a personal PAT, granting full repo scope, and setting expiry to “never.”
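The three mistakes above lend themselves to a small lint. This is a sketch under stated assumptions: TokenProfile, lint_token, the 30-day ceiling, and the scope names in BROAD_SCOPES are all hypothetical and need mapping to whatever your token provider actually reports.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical policy values for illustration; adjust to your provider's scope names.
MAX_EXPIRY_DAYS = 30
BROAD_SCOPES = {"repo", "admin", "write", "*"}


@dataclass
class TokenProfile:
    account: str                 # account the token is issued to
    personal_account: str        # the human owner's own login
    scopes: Set[str]
    expiry_days: Optional[int]   # None means the token never expires


def lint_token(profile: TokenProfile) -> List[str]:
    """Return one finding per violated rule; an empty list means no findings."""
    findings = []
    if profile.account == profile.personal_account:
        findings.append("token is issued to a personal account")
    broad = sorted(profile.scopes & BROAD_SCOPES)
    if broad:
        findings.append(f"broad scopes granted: {', '.join(broad)}")
    if profile.expiry_days is None:
        findings.append("token never expires")
    elif profile.expiry_days > MAX_EXPIRY_DAYS:
        findings.append(f"expiry {profile.expiry_days}d exceeds {MAX_EXPIRY_DAYS}d")
    return findings
```

An empty result only means none of these three rules fired; it does not prove the scope is minimal for the task.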

3. Supply chain (before installing any Skill / Plugin / rule / MCP server)

Why this matters: Third-party assets mean trusting external code to run on your machine. The Snyk ToxicSkills report (February 2026) found that roughly 36% of the 3,984 public skills it scanned (1,467 skills) contained prompt injection. Being listed in a marketplace does not mean safe; having a README does not mean auditable.
  • Scan for hidden Unicode and bidi control characters (zero-width characters U+200B through U+200D, U+2060, and U+FEFF; bidi controls U+202A through U+202E).
  • Scan for suspicious outbound calls and overrides (curl, wget, nc, ssh, ANTHROPIC_BASE_URL, enableAllProjectMcpServers).
  • Confirm traceable provenance (named author / official organization) and recency of maintenance (time since last commit).
  • Permission scope is reasonable: no writes outside the repository, no reading secret-bearing paths, no outbound network (unless the feature explicitly requires it).
  • Read the SKILL.md frontmatter and the plugin.json / .mcp.json manifests: do the description, trigger conditions, and inputs/outputs match what the asset claims to do?
  • Install and run once in a throwaway environment or container on the first use, observing for any outbound connections or unexpected file writes.
Criteria:
  • For an unfamiliar Skill, you can state three things: what it does, what permissions it requires, and where it connects externally.
  • Any rg scan hit means stop, examine, patch, or discard — do not rush past it.
  • A repo with unknown provenance or a single maintainer who has not committed in a long time: default to not installing.
Three supply chain scans to always run
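# --hidden and --no-ignore make rg search dotfiles and gitignored files,
# which it skips by default.

# 1. Hidden characters and bidi overrides
rg -nP --hidden --no-ignore '[\x{200B}\x{200C}\x{200D}\x{2060}\x{FEFF}\x{202A}-\x{202E}]' \
  ~/.claude/plugins/ .claude/ .mcp.json

# 2. Suspicious outbound calls and dangerous commands
#    (\b keeps short tokens such as nc from matching inside words like "sync")
rg -n --hidden --no-ignore '\b(curl|wget|nc|scp|ssh)\b|ANTHROPIC_BASE_URL|enableAllProjectMcpServers' \
  ~/.claude/plugins/ .claude/ .mcp.json

# 3. HTML comments, script tags, base64 content
rg -n --hidden --no-ignore '<!--|<script|data:text/html|base64,' \
  ~/.claude/plugins/ .claude/ .mcp.json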
No hit should be skipped outright.
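When a hit from the first scan has to be written up, it helps to name the exact character and its position. The following Python sketch covers the same code points as the rg pattern above; find_hidden_chars is an illustrative name, not part of any tool mentioned here.

```python
import unicodedata
from typing import Iterator, Tuple

# Same code points as the rg pattern above: zero-width characters, the word
# joiner, the BOM, and the bidi embedding/override controls U+202A..U+202E.
SUSPECT = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF} | set(range(0x202A, 0x202F))


def find_hidden_chars(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (line, column, description) for each suspect character, 1-based."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) in SUSPECT:
                name = unicodedata.name(ch, "UNNAMED")
                yield line_no, col, f"U+{ord(ch):04X} {name}"


if __name__ == "__main__":
    sample = "name: hel\u200blo"
    for line_no, col, what in find_hidden_chars(sample):
        print(f"line {line_no}, col {col}: {what}")
```

The description string can go straight into the record of hits and dispositions.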

4. Handling untrusted content

Why this matters: When “private data + untrusted content + outbound channel” coexist in the same context, prompt injection escalates into a data exfiltration path (Simon Willison's “lethal trifecta”). Separating parsing from action is the cheapest mitigation: a restricted agent extracts facts only; a privileged agent receives only the clean summary.
  • Before processing external PDFs / DOCX files / web pages, extract only the needed text and strip comments and hidden content.
  • Separate parsing from action: a restricted agent parses (read-only, no outbound channel); a privileged agent handles only the clean summary.
  • Do not feed external links directly to a privileged agent.
  • Treat screenshots (OCR-extracted text) as untrusted content too — extract facts before feeding downstream.
  • Do not run write-capable or message-sending agents in the same context that receives external content.
Criteria:
  • Between untrusted content and any sensitive operation context, there is at least one intermediate fact-extraction step.
  • Any request of the form “please base64-encode my email / conversation / file and give me a link to preview it” is rejected outright.
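The “extract only the needed text and strip comments and hidden content” step can be sketched as a pre-filter that runs before anything reaches an agent. This is a minimal Python sketch, assuming HTML-like input; extract_visible_text and the 4000-character cap are illustrative choices, and regex stripping is not a substitute for a real HTML parser.

```python
import re

# Zero-width and bidi control characters (same set as the Section 3 scan).
HIDDEN_CHARS = re.compile("[\u200b\u200c\u200d\u2060\ufeff\u202a-\u202e]")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
SCRIPT_BLOCK = re.compile(r"<script\b.*?</script\s*>", re.DOTALL | re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")


def extract_visible_text(raw: str, max_chars: int = 4000) -> str:
    """Drop comments, script blocks, tags, and hidden characters, then
    collapse whitespace and cap the length."""
    text = HTML_COMMENT.sub(" ", raw)
    text = SCRIPT_BLOCK.sub(" ", text)
    text = TAG.sub(" ", text)
    text = HIDDEN_CHARS.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_chars]
```

The output is still untrusted: it removes common hiding places but cannot detect an injection written in plain visible text, so it belongs in front of the restricted parsing agent, not in place of it.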

5. Memory and rotation

Why this matters: Memory files accumulate across sessions, so a payload an attacker plants in one session can resurface in a later one (a Microsoft report from February 2026 identified memory-poisoning attempts involving 31 companies across 14 industries). Keeping memory narrow and rotating it regularly breaks that attack chain.
  • Memory files stay narrow: no secrets, no personal data, no vendor NDA content.
  • After handling untrusted content, rotate / clean the project memory (~/.claude/projects/<repo>/memory/).
  • Review MEMORY.md and all repo memory directories in full on a quarterly basis.
  • After any high-risk workflow (mass web scraping, untrusted repo review), actively reset the project memory for that session.
Criteria:
  • Every entry in MEMORY.md has a clear answer to “why is this being remembered?” Entries you cannot answer: delete them.
  • After every workflow that processed untrusted content, there is a concrete follow-up action (delete files / regenerate / isolate the session).
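One possible form of the “concrete follow-up action” is to move the memory directory aside and start from an empty one. The Python sketch below makes its own assumptions: rotate_memory is an illustrative name, the archive location is up to the caller, and archiving rather than deleting is a design choice so the old content can be reviewed later.

```python
import shutil
import time
from pathlib import Path


def rotate_memory(memory_dir: Path, archive_root: Path) -> Path:
    """Move the current memory directory into a timestamped archive and
    recreate it empty. Returns the archive path for later review."""
    memory_dir = memory_dir.expanduser()
    archive_root = archive_root.expanduser()
    archive_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = archive_root / f"{memory_dir.name}-{stamp}"
    if target.exists():
        # Avoid nesting a second rotation inside an existing archive.
        raise FileExistsError(f"archive already exists: {target}")
    if memory_dir.exists():
        shutil.move(str(memory_dir), str(target))
    else:
        target.mkdir()
    memory_dir.mkdir(parents=True)
    return target
```

The archive should live somewhere the agent cannot read, otherwise the quarantined content can flow straight back into a later session.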

6. Maintenance

Why this matters: A CVE stays relevant for as long as you run an affected version. The patches for both critical Claude Code CVEs shipped only in later releases: CVE-2025-59536 is fixed in v1.0.111 and later, CVE-2026-21852 in v2.0.65 and later. An outdated version equals a known vulnerability equals an incident waiting to happen.
  • Keep tools updated (monitor security advisories and CVEs; Claude Code recommended >= v2.0.65).
  • Do not use --dangerously-skip-permissions or equivalent full-open flags in automated loops.
  • Automated loops have a kill switch (process group kill + heartbeat dead-man switch).
  • Tool calls produce structured logs (timestamp, tool name, command summary, files touched, approval result).
  • Quarterly: scan the full list of installed Skills / Plugins / MCP servers and remove any that are no longer in use.
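The structured-log item above can be as simple as one JSON object per line. In this Python sketch the field names (ts, tool, command_summary, files_touched, approval) and the 200-character summary cap are assumptions that mirror the list in the bullet, not a format defined by any tool.

```python
import json
import time
from typing import IO, List


def log_tool_call(stream: IO[str], tool: str, command: str,
                  files: List[str], approval: str) -> dict:
    """Append one JSON line describing a tool call and return the record."""
    record = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "tool": tool,
        "command_summary": command[:200],   # summary only, not full output
        "files_touched": files,
        "approval": approval,
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    stream.flush()
    return record
```

Writing the log to a location the agent cannot modify keeps it usable as evidence after an incident.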
Criteria:
  • You can state the current tool version and confirm it falls outside the known CVE impact range.
  • If an injected agent runs out of control, there is an external mechanism (process group kill or supervisor) that can terminate it within 30 seconds — not relying on the compromised process to stop itself.
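An external kill switch of the kind described above can be sketched as a supervisor that owns the agent's process group and watches a heartbeat file. This is a POSIX-only Python sketch under assumptions: supervise and the heartbeat-file convention are illustrative, and the heartbeat is meant to be touched by something outside the agent (an operator or a separate health check).

```python
import os
import signal
import subprocess
import time
from pathlib import Path
from typing import List


def _silence(heartbeat: Path) -> float:
    """Seconds since the heartbeat file was last touched."""
    try:
        return time.time() - heartbeat.stat().st_mtime
    except FileNotFoundError:
        return float("inf")   # a missing heartbeat counts as silence


def supervise(cmd: List[str], heartbeat: Path, max_silence: float,
              poll: float = 0.5) -> str:
    """Run cmd in its own process group; kill the whole group if the
    heartbeat is not refreshed for max_silence seconds."""
    heartbeat.touch()
    # start_new_session=True makes the child a session and group leader,
    # so its process group id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    while proc.poll() is None:
        if _silence(heartbeat) > max_silence:
            try:
                os.killpg(proc.pid, signal.SIGKILL)
            except ProcessLookupError:
                pass          # the group exited between the check and the kill
            proc.wait()
            return "killed"
        time.sleep(poll)
    return "exited"
```

To meet the 30-second criterion, max_silence plus poll has to stay below 30. Killing the process group rather than the single pid also takes down any children the agent spawned, unless they moved themselves into a new session.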

7. Scenario-specific checklists

  1. After cloning, inspect .mcp.json, .github/workflows/, .claude/, and AGENTS.md first. Unknown servers, unrecognized workflows, and unverified rules: disable them before proceeding.
  2. Even in a throwaway environment, running --dangerously-skip-permissions is the anti-pattern, not the recommendation: refuse it by default. Grant minimum permissions and add more only when a specific need arises.
  3. On the first run, execute all three rg scans from Section 3 and record every hit and its disposition.
  4. Confirm the account training opt-out switch before sending any data.
  5. If the run processed any untrusted content, rotate the project memory per Section 5.
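For the first step above, a short listing of what .mcp.json declares makes unknown servers easier to spot. The Python sketch below assumes the file has a top-level mcpServers object whose entries carry either a url or a command with args and env; list_mcp_servers is an illustrative name, and the layout should be checked against the file you actually have.

```python
import json
from typing import Dict, List


def list_mcp_servers(config_text: str) -> List[Dict[str, str]]:
    """Summarise each declared server: name, what it runs or connects to,
    and which environment variable names it is handed (names only)."""
    config = json.loads(config_text)
    rows = []
    for name, spec in sorted(config.get("mcpServers", {}).items()):
        if "url" in spec:
            target = spec["url"]
        else:
            target = " ".join([spec.get("command", "?"), *spec.get("args", [])])
        rows.append({
            "name": name,
            "target": target,
            "env_keys": ",".join(sorted(spec.get("env", {}))),
        })
    return rows
```

Any row whose target you cannot explain is a server to disable before proceeding. The same listing can seed the local MCP inventory from Section 2.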

Factual claims are grounded in official documentation; fast-changing items are annotated as of 2026-05.