One-sentence definitions for all terms used in the book. English originals are retained alongside Chinese equivalents. Claims are grounded in official documentation; fast-changing items are annotated as of 2026-05.
Part I Foundations and Core Concepts
token: The smallest unit the model processes (a subword), not a character and not a word; affects cost and context occupancy. Chinese text typically runs 1 to 2 tokens per character, denser than English.
tokenization: The process of splitting input text into a token sequence; different models use different tokenizers, and rough estimates should use official tooling.
context window: The maximum number of tokens a model can process in one pass (input plus output); when the window fills, the oldest content is pushed out.
context rot: The phenomenon of declining model quality as the context approaches capacity or passes the midpoint; Chroma (2025) observed this across 18 models [1].
lost in the middle: In long contexts, the model pays significantly less attention to information in the middle than at the beginning or end (Liu, et al., 2023) [2].
context engineering: Managing everything that enters the context window (instructions, knowledge, history, tool results) to improve output quality. Anthropic defines it as “finding the minimum high-signal token set that maximizes goal achievement within a limited attention budget” [3]. See 01-4.
hallucination: Content the model produces that looks plausible but is incorrect; arises because generative models optimize for “does it look right” rather than “is it right.”
temperature: A sampling parameter that scales the sharpness of the probability distribution. Approaching 0 converges output; approaching 1 increases divergence. Lower for reproducibility, higher for brainstorming.
top-p (nucleus sampling): Sampling only from the candidate token pool whose cumulative probability reaches p, dynamically trimming the long tail.
system prompt: The instruction layer set by the platform or developer, with higher priority than user input; the technical foundation for rules files.
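The temperature and top-p entries above can be illustrated with a minimal sketch. It assumes the model exposes raw scores (logits) per candidate token; the function names and the three example scores are hypothetical, and real inference stacks implement this differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the most probable tokens until their cumulative mass reaches p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # everything after this point is the trimmed long tail
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.2)  # close to always picking token 0
flat = softmax_with_temperature(logits, 1.0)   # probability spread more evenly
nucleus = top_p_filter(flat, 0.85)             # candidate pool after trimming
```

In this example the lowest-scoring token falls outside the nucleus, so sampling would choose only between the first two candidates.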
tool use / function calling: The mechanism by which the model decides to call an external tool and feeds the result back into the context under the tool role; the model only “requests,” and the harness “executes.”
agentic loop: The autonomous cycle of observe, decide, act, observe again; the essential difference between an agent and a simple conversation.
harness: The execution shell of an agent, responsible for tool execution, permission boundaries, state persistence, loop termination, observability, and kill switches. The model sets the capability ceiling; the harness determines how much of it you actually get. See 01-6.
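The agentic loop and the harness's responsibility for tool execution, permission boundaries, and loop termination can be sketched as follows. This is an illustration under stated assumptions, not any particular product's implementation: `run_agent`, `scripted_decide`, and the action dictionary shape are all hypothetical, and `scripted_decide` stands in for a real model call.

```python
def run_agent(decide, tools, goal, max_steps=5):
    """Observe, decide, act, observe again, until the model finishes or the harness stops it."""
    observations = [goal]
    for step in range(max_steps):
        action = decide(observations)  # the model only requests an action
        if action["type"] == "finish":
            return {"status": "done", "answer": action["answer"], "steps": step}
        tool = tools.get(action["tool"])
        if tool is None:  # permission boundary: unknown tools are never executed
            observations.append(f"error: tool {action['tool']!r} is not allowed")
            continue
        observations.append(tool(action["input"]))  # the harness executes
    # Loop termination is the harness's job, not the model's.
    return {"status": "stopped", "answer": None, "steps": max_steps}

def scripted_decide(observations):
    """Hypothetical stand-in for a model call."""
    if len(observations) == 1:
        return {"type": "tool", "tool": "add", "input": (2, 3)}
    return {"type": "finish", "answer": observations[-1]}

allowed_tools = {"add": lambda pair: pair[0] + pair[1]}
result = run_agent(scripted_decide, allowed_tools, "add 2 and 3")
```

The `max_steps` cap is the simplest form of loop termination; a model that never emits a finish action is stopped by the harness rather than trusted to stop itself.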
augmented LLM: Anthropic’s term for an LLM enhanced with retrieval, tools, memory, and similar capabilities; the harness revolves around it.
kill switch: A mechanism triggered externally to terminate the agent process; does not rely on the compromised process to terminate itself.
heartbeat dead-man switch: A mechanism where a long-running task writes a heartbeat at regular intervals and an external supervisor kills the entire process group when the heartbeat stops.
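A minimal sketch of the heartbeat dead-man switch, assuming a POSIX system, a heartbeat stored as a file whose modification time is the last beat, and a supervisor that runs as a separate process. The function names, the file-based heartbeat, and the timeout are illustrative assumptions.

```python
import os
import signal
import time

def heartbeat_is_stale(last_beat, now, timeout):
    """True when more than `timeout` seconds have passed since the last beat."""
    return (now - last_beat) > timeout

def write_heartbeat(path):
    """Called by the long-running task at regular intervals."""
    with open(path, "w") as handle:
        handle.write(str(time.time()))

def supervise_once(path, pgid, timeout, now=None, kill=None):
    """One supervisor check; returns True if the process group was killed."""
    now = time.time() if now is None else now
    kill = os.killpg if kill is None else kill  # POSIX only
    try:
        last_beat = os.path.getmtime(path)
    except FileNotFoundError:
        last_beat = 0.0  # a heartbeat that was never written counts as stopped
    if heartbeat_is_stale(last_beat, now, timeout):
        kill(pgid, signal.SIGKILL)  # kill the whole group, not just one process
        return True
    return False
```

The supervisor never asks the task to shut itself down; it acts on the absence of a signal, which is what lets it work even when the supervised process is hung or compromised.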
prompt engineering: The design practice of writing tasks as specifications the model can execute; techniques include structured scaffolding, XML tags, few-shot examples, and Automated Prompt Optimization (APO). See 01-3.
few-shot: Placing 2 to 3 examples in the prompt to calibrate abstract style or format requirements into templates the model can align to.
APO (Automated Prompt Optimization): Automated prompt iteration driven by an evaluation function; representative implementations include OPRO (Yang, et al., 2023) [4] and DSPy (Khattab, et al., 2023) [5].
workflow: A system where LLMs and tools are pre-arranged along programmatic paths; you define the steps and branches, and the model only acts within each node. See 01-5.
agent: A system where the LLM autonomously decides the flow and tool usage; high flexibility, lower controllability.
DAG (directed acyclic graph): A directed graph of steps; parallelization opportunities surface naturally once drawn (nodes with no path between them can run concurrently).
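How parallelization opportunities surface from a DAG can be shown with Python's standard `graphlib` module. The step names below are hypothetical; the sketch groups steps into batches in which no step depends on another.

```python
from graphlib import TopologicalSorter

def parallel_batches(dependencies):
    """Group steps into batches; steps in the same batch have no path between them."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step whose prerequisites are done
        batches.append(ready)
        sorter.done(*ready)
    return batches

# Each key maps a step to the set of steps it depends on (hypothetical pipeline).
steps = {
    "lint": {"fetch"},
    "test": {"fetch"},
    "build": {"lint", "test"},
}
batches = parallel_batches(steps)
```

Here `lint` and `test` land in the same batch because neither depends on the other, so they can run concurrently once `fetch` completes.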
Vibe Coding: Letting AI write code by feel with low structure; this Playbook advocates replacing blind acceptance with configuration and verification.
Part II Configuration and Mechanisms
CLAUDE.md: Claude’s memory and rules file, divided into four layers: managed, user, project, and local; the managed layer cannot be disabled by individuals. See 04-1.
@path import syntax: Using @path/to/file inside CLAUDE.md to import other files; up to 4 levels of recursion; commonly used with @AGENTS.md so Claude and other tools share the same baseline.
AGENTS.md: An open standard for cross-tool rules files, governed by the Agentic AI Foundation under the Linux Foundation; natively supported by more than twenty tools, adopted by over sixty thousand projects (as of 2026-06).
Rules (.claude/rules/*.md): Modular rules files; setting paths in frontmatter binds them to a glob, loading them only when matching files are touched, avoiding unnecessary context occupancy. See 04-2.
Skill (SKILL.md): A reusable process that exists as a directory and is loaded on demand; whether the model auto-invokes it depends on how well the description is written. See 04-4.
progressive disclosure: Skills move expensive content into references/ and scripts/, keeping the main file to essential entry points; reduces context cost per invocation.
Hook: An event-driven deterministic automation mechanism (PreToolUse, PostToolUse, Stop, SessionStart, and others — 30 events as of 2026-06) that upgrades “the model remembers” to “the system guarantees.” See 04-6.
exit 2 intercept semantics: A hook returning exit 2 in the PreToolUse phase is equivalent to rejecting that tool call; the model sees the rejection feedback and adjusts its plan.
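The exit 2 semantics can be sketched as a small PreToolUse hook script. The sketch assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields and that stderr is what the model sees as rejection feedback; confirm both against the official hook reference before relying on them. The deny list and function names are hypothetical.

```python
import json
import sys

BLOCKED_FRAGMENTS = ("rm -rf", "git push --force")  # hypothetical deny list

def evaluate(payload):
    """Return (exit_code, message) for one tool call; exit code 2 means reject."""
    if payload.get("tool_name") != "Bash":
        return 0, ""  # this sketch only inspects shell commands
    command = payload.get("tool_input", {}).get("command", "")
    for fragment in BLOCKED_FRAGMENTS:
        if fragment in command:
            return 2, f"Blocked: command contains {fragment!r}"
    return 0, ""

def main():
    """Entry point for use as a hook: read the payload, report, and exit."""
    code, message = evaluate(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)  # assumed to be the feedback the model sees
    sys.exit(code)

# A real hook script would end by calling main().
```

Because the decision is made by a script rather than by the model, the rejection holds regardless of what the model remembers or intends.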
Subagent: A subtask execution container with its own independent context window; since v2.1, structured as built-in (Explore, Plan, general-purpose) plus custom layers. See 04-10.
isolation: worktree: Subagents physically isolate the filesystem via git worktree, enabling parallel subtasks without conflicts.
Plugin: A unit that packages skills, agents, hooks, MCP and LSP configuration, and background monitoring; distributed via marketplace or git URL, with namespace isolation to prevent name collisions. See 04-8.
Marketplace (plugin marketplace): Two public marketplaces — official claude-plugins-official and community claude-community — with /plugin install as the installation path.
harness layer composition: Case studies such as OpenClaw establish that Identity (SOUL.md), Memory (MEMORY.md), Heartbeat, Tool conventions, Bootstrap, and User Markdown files in a workspace collectively assemble agent personality and behavior. See 05-1.
MCP (Model Context Protocol): An open standard for connecting external tools and data sources to the model; a single server exposes three interface types — tools, resources, and prompts. See 04-9.
MCP transport: The transport protocol between client and server; Claude Code supports stdio (local process), http (including the streamable-http alias), sse (deprecated), and ws.
MCP scope: The configuration scope of an MCP server — local (personal, not shared), project (versioned, requires user approval), and user (all personal projects).
CLI-first: Prefer CLI for any operation that can be done via CLI, with MCP as fallback; CLI has no schema preload cost, its stdout is trimmable, and it can be chained with shell pipes. See 04-10.
Part III Risk and Evaluation
prompt injection: An attack where untrusted external content is mistakenly executed as instructions; a design constraint at the application layer, not a model bug. See 03-3.
Willison triple threat: When private data, untrusted content, and an outbound channel coexist, prompt injection escalates from “making the model say strange things” to a data exfiltration vector.
memory poisoning: Planting a payload in memory across sessions that triggers when a future session assembles its context; keeping memory narrow and rotating it regularly are mitigations.
supply chain risk: Security risks introduced by third-party Skills, plugins, rules, or hooks; the Snyk ToxicSkills report (2026-02) found approximately 36% of 3,984 public skills contained prompt injection [6].
provenance: Whether a component’s origin is named and traceable, used as a trust signal.
hidden unicode / bidi attack: Invisible payloads hidden in zero-width characters (U+200B and similar) and bidirectional control codes (U+202A through U+202E); invisible to humans, visible to models.
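A scanner for the hidden unicode / bidi attack described above can be sketched in a few lines. The zero-width set beyond U+200B and the isolate controls U+2066 through U+2069 are additions assumed here for illustration; the glossary itself names only U+200B "and similar" and U+202A through U+202E.

```python
import unicodedata

# U+200B comes from the glossary entry; the other zero-width code points are assumed additions.
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}
# U+202A..U+202E come from the glossary entry; U+2066..U+2069 are assumed additions.
BIDI_CONTROLS = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))

def find_hidden_characters(text):
    """Return (index, code point, name) for each invisible or bidi control character."""
    findings = []
    for index, char in enumerate(text):
        code = ord(char)
        if code in ZERO_WIDTH or code in BIDI_CONTROLS:
            name = unicodedata.name(char, "UNKNOWN")
            findings.append((index, f"U+{code:04X}", name))
    return findings

sample = "safe\u200btext\u202e"  # renders like "safetext" in many viewers
findings = find_hidden_characters(sample)
```

Running a check like this over third-party Skills, rules, and hooks before loading them makes the payload visible to the human reviewer as well as to the model.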
saturation check: A 100% pass rate on a benchmark means the tests are too easy and lack discriminative power; add edge cases until at least one model tier fails. See 03-4.
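The saturation check reduces to a small predicate over benchmark results. In this sketch the tier names and the result layout (a list of pass/fail booleans per model tier) are hypothetical.

```python
def pass_rate(outcomes):
    """Fraction of passing cases for one model tier."""
    if not outcomes:
        raise ValueError("no test cases recorded")
    return sum(outcomes) / len(outcomes)

def is_saturated(results_by_tier):
    """True when every tier passes every case, so the benchmark cannot tell tiers apart."""
    return all(pass_rate(outcomes) == 1.0 for outcomes in results_by_tier.values())

# Hypothetical results before and after adding one edge case.
before = {"small": [True, True, True], "large": [True, True, True]}
after = {"small": [True, True, True, False], "large": [True, True, True, True]}

needs_harder_cases = is_saturated(before)    # all tiers at 100%
discriminates_now = not is_saturated(after)  # at least one tier fails
```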
blind trust: The verifiability blind spot of accepting something because it looks reasonable; verifying all verifiable claims is the antidote.
Part IV Customization and Automation (supplementary concepts)
path-scoped rule: A rules file that uses frontmatter paths: ["src/**/*.ts"] so that it loads only when files matching the glob are touched. See 04-2.
@-mention explicit invocation: Using @subagent-name in the main conversation to directly invoke a Subagent rather than relying on the model to auto-trigger based on the description.
context: fork: A Skill executing its entire procedure inside a Subagent context; the Subagent uses skills: to preload procedures into its own context.
heartbeat scheduling: The timed trigger mechanism used by harnesses such as OpenClaw; the agent is woken at scheduled points to run a turn, simulating “standing watch in the background.”
soul-md / memory-md: The personality and memory Markdown files in the OpenClaw workspace; the agent reads its own “soul” before starting work each session. See 05-1.
Part V Evaluation and Case Studies (supplementary concepts)
NemoClaw / always-on agent: NVIDIA’s security shell designed for resident agents; corresponds to a hardened mode covering enterprise-grade SLAs, shared team infrastructure, and unified auditing. See 05-3.
harness visualization: OpenClaw breaks down personality, memory, heartbeat, and tool conventions into readable Markdown files so users can directly inspect and modify agent design.
Cross-unit shared terms
config layer: The configuration hierarchy of user-global, project, and local layers; higher layers do not necessarily have higher precedence, because each tool’s own rules determine priority. See 02-1.
least privilege: Agents and tools receive only the permissions actually required for the task; anything beyond that is revoked.
credential boundary: The isolation boundary between agent-specific PATs, personal PATs, and machine accounts.
pre-flight check: A quick verification before execution (permissions, training opt-out, secret path scan) to avoid cleanup after the fact.
hybrid verification: Write general concepts directly; always look up fast-changing facts (config file names, versions, mechanisms) in official documentation.
provenance annotation: Marking source, author, date, and confidence level alongside references, examples, and links.
Corresponding units
- Foundations overview: 01-1 Why use AI tools “correctly”
- LLM and agent mechanics: 01-2 How LLMs and Agents work
- Context engineering detail: 01-4 Context Engineering
- Configuration layer model: 02-1 The Configuration Layer Model
- Privacy and security terms: Appendix B Privacy and Security Checklist and 03-3 Security, Privacy, and Supply Chain Risk
- Cross-tool configuration terms: 02-6 Other Tools Comparison
- Customization terms: 04-1 CLAUDE.md through 04-11 Agent Teams and Sub Agents
- Case study terms: 05-1 OpenClaw through 05-4 Three Approaches Compared
References
- [1] Chroma, “Context Rot in Long-Context Language Models,” 2025. [Online]. Available: https://research.trychroma.com (verified 2026-06; includes empirical testing across 18 models)
- [2] Liu, et al., “Lost in the Middle: How Language Models Use Long Contexts,” 2023. [Online]. Available: https://arxiv.org/abs/2307.03172
- [3] Anthropic, “Effective context engineering for AI agents,” 2025. [Online]. Available: https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
- [4] Yang, et al., “Large Language Models as Optimizers,” 2023. [Online]. Available: https://arxiv.org/abs/2309.03409
- [5] Khattab, et al., “DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines,” 2023. [Online]. Available: https://arxiv.org/abs/2310.03714
- [6] Snyk, “ToxicSkills: 2026 Report on Malicious Skills in the Wild,” 2026. [Online]. Available: https://snyk.io (original citable URL for Snyk ToxicSkills 2026-02 report needs verification)