What this unit solves

Three projects ask the same question: “Can I see and version-control what this agent has learned?” But they approach it differently. OpenClaw is a TypeScript personal AI assistant framework that writes every Skill as skills/<name>/SKILL.md, making the agent harness readable, version-controlled, and auditable. Hermes-Agent is a self-improving agent built in Python by Nous Research that stores cumulative procedural knowledge in a skills/ directory and uses prompt_builder’s three-layer system prompt assembly (stable, context, volatile) to inject personality, instructions, and memory in a disciplined way. NemoClaw is NVIDIA’s reference stack open-sourced in 2026-03 that wraps OpenClaw or Hermes inside an OpenShell sandbox, providing YAML blueprints, network policies, SSRF validation, managed inference, and other hardening layers, turning a personal agent into a service acceptable to enterprise environments. This unit compares the differences, capabilities, and characteristics of all three, and covers what you should learn from them, what questions to ask yourself, and where things are heading. For a deep dive into NemoClaw, see 05-3.

Learning objectives

  • Use a single comparison table to articulate the key differences among the three in positioning, core language, Skill/harness design, memory mechanisms, and enterprise readiness
  • Explain the shared design orientation of all three: making an agent’s “planning and personality” explicit as human-readable, version-controlled .md files
  • Concretely state what can be learned from each framework and which self-directed questions to ask when putting them into practice
  • Assess the trend toward agent shell standardization and locate where you currently stand
  • Explain the role difference between NemoClaw and OpenClaw / Hermes: agent body vs. hardened shell

1. Comparison tables: positioning, technology, Skill design, memory, and community

1.1 One-sentence positioning for each project

| Project | One-sentence positioning | Role |
| --- | --- | --- |
| OpenClaw | “A personal daemon that acts for you across 20+ communication channels” | Agent body (personal) |
| Hermes-Agent | “A self-improving agent that reassembles its own personality every morning” | Agent body (personal) |
| NemoClaw | “A hardened shell that puts either of the above agents inside an OpenShell sandbox” | Hardened shell (enterprise) |

OpenClaw and Hermes-Agent solve “how the agent does things”; NemoClaw solves “how enterprises can accept an always-on agent.” These three are not competing; they are additive: NemoClaw wraps OpenClaw or Hermes and provides the hardening layer.

1.2 Full-dimension comparison

| Dimension | OpenClaw (05-1) | Hermes-Agent (05-2) | NemoClaw (05-3) |
| --- | --- | --- | --- |
| Role | Agent body (personal) | Agent body (personal) | Hardened shell (enterprise) |
| Core language | TypeScript (Node.js) | Python (uv managed) | TypeScript + YAML + Bash |
| License | MIT | MIT | Apache 2.0 |
| Inference backend | Multiple providers (Anthropic, OpenAI, etc.) | Any OpenAI-compatible API | Managed inference + model-specific-setup registry |
| Sandbox | Optional (agents.defaults.sandbox) | 6 terminal backend options | Mandatory (OpenShell) |
| Network policy | Tool level | Tool level | Blueprint + policy engine |
| Skill storage | skills/<name>/SKILL.md | skills/<name>/SKILL.md | .agents/skills/ (three-tier audience) |
| Skill loading | Full load | Progressive disclosure (Level 0 / Level 1) | Decision tree via nemoclaw-skills-guide skill |
| Memory refinement | MEMORY.md | MEMORY.md (frozen snapshot) | Inherited from inner agent |
| Detailed log | memory/YYYY-MM-DD.md | SQLite + FTS5 | Inherited from inner agent |
| Heartbeat | HEARTBEAT.md + Gateway scheduler | (Needs source verification) | Inner agent + blueprint control |
| Personality file | SOUL.md + IDENTITY.md | SOUL.md | Inherited from inner agent |
| User file | USER.md | USER.md (volatile) | Inherited from inner agent |
| Tool conventions | TOOLS.md | Merged into AGENTS.md | Blueprint declaration |
| Multi-channel routing | 20+ platforms | Telegram, Discord, etc. (adapter pattern) | Inherited via OpenClaw inner agent |
| Sub-agent delegation | Via cron / ACP | delegate_task (default 3 concurrent) | Inherited from inner agent |
| Enterprise ready | No (local-first personal) | No (local-first personal) | Yes (policy-as-code + vulnerability disclosure) |
| Policy expression | md files | md files | YAML blueprint + policies/ |
| Upstream upgrade | Self-hosted | Self-hosted | Plugin rather than fork; no rebase needed |

Why NemoClaw has so many “inherited from inner agent” entries
NemoClaw does not build from scratch: it puts OpenClaw or Hermes wholesale into an OpenShell sandbox. So OpenClaw’s SOUL.md mechanism, Hermes’s SQLite memory, heartbeat scheduling: all of these are “inherited” from the inner agent. NemoClaw’s own strength is in the shell: blueprints, policies, inference routing, and the vulnerability disclosure process.

2. Common ground: making planning and orchestration explicit as readable, version-controlled md files

The three projects look technically divergent (TypeScript vs Python vs TypeScript+YAML), but their underlying design philosophy is highly consistent: pulling an agent’s “personality, planning, and orchestration” out of implicit strings or closed-source binaries and putting it into human-readable, diffable, version-controlled .md files. Concrete manifestations:
  • All three use SKILL.md as the primary medium for expressing capabilities (YAML frontmatter prefix, Markdown body)
  • All three isolate personality (SOUL.md) into its own file rather than embedding it in a system prompt string
  • All three keep the user side (USER.md) and project side (AGENTS.md) in separate files
  • NemoClaw goes further and makes “enterprise policy” explicit as YAML too (blueprint, policies, model-specific-setup), so the hardening layer also goes through version control and review
The engineering significance of this common ground: once the configuration layer that would normally be buried in a system prompt or in code is pulled into a repo, agent behavior becomes PR-reviewable, diffable, and transferable to the next person. People come and go; config files stay. Version-controlled config files make agent behavior reproducible across sessions, machines, and team members.
What “how to use a new tool” looks like across the three frameworks
Suppose you want to give an agent a “read a PDF report and summarize it” skill.

SKILL.md (works for both OpenClaw and Hermes):

```markdown
---
name: pdf-summarize
description: Read a PDF report and produce a 5-point summary
---

## When to use
When the user provides a PDF file path and requests a summary.

## How to do it
1. Extract text from each page with pdfplumber
2. Count tokens with tiktoken
3. If over 8k tokens, chunk first then summarize
4. Output 5 bullet points

## What not to do
- Do not upload PDF content to any external API
- Do not assume the user's research domain
```

This file is readable by all three frameworks, version-controllable in all three, and discussable in a PR review in all three. The difference is that NemoClaw adds a blueprint declaration stating “which internal APIs this skill can call, and which external hosts it cannot reach.”
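
To make “readable” concrete, here is a minimal sketch of how a loader might split such a file into frontmatter fields and a Markdown body. It is not the loader of any of the three frameworks: the function name split_skill is hypothetical, and the parser assumes the flat key: value frontmatter shown in the example, not full YAML.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md string into (frontmatter fields, Markdown body).

    Assumption: the file opens with a '---' line, followed by flat
    'key: value' lines, a closing '---', and then the body.
    Nested YAML is deliberately not handled in this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter: treat the whole file as body
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # unterminated frontmatter: treat the whole file as body
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'key: value'
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SKILL_TEXT = """---
name: pdf-summarize
description: Read a PDF report and produce a 5-point summary
---

## When to use
When the user provides a PDF file path and requests a summary.
"""

meta, body = split_skill(SKILL_TEXT)
print(meta["name"])          # the skill's short identifier
print(body.splitlines()[0])  # first line of the Markdown body
```

Keeping metadata and body separate is what would let a loader index only name and description up front and read the body later; whether a given framework does this is a property of that framework, not of this sketch.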

3. What to learn: harness externalization, reproducibility, auditability

Five design principles worth internalizing from these three frameworks:
  • Skill as code: Writing an agent’s operational procedures as .md and committing them to version control, just as tests are written as .spec files, is the minimum viable practice for engineering agent behavior. In that sense, SKILL.md plays a role comparable to a unit test for the agent
  • Memory has a lifecycle: Hermes-Agent’s curator only archives, never deletes; pinned Skills are exempt from automatic transitions. This design prevents the agent from silently forgetting critical knowledge. OpenClaw’s MEMORY.md + memory/YYYY-MM-DD.md layering follows the same lifecycle thinking
  • Tool allowlists isolate background tasks: The evaluation agent forked by Hermes-Agent’s background_review.py can only call memory and Skill management tools (needs source verification: it is unconfirmed whether this remains the current implementation). NemoClaw’s SSRF validation + network policies embody the same isolation philosophy: keeping “what can be acted upon” and “what is being acted upon” separately auditable
  • .plans/ or blueprint as decision log: Keeping feature planning documents in the repository means future readers (including agents) can see “why this was designed this way” and not just “what was implemented.” NemoClaw’s nemoclaw-blueprint/blueprint.yaml is the hardened version of this idea
  • Shell as policy: NemoClaw moves security decisions such as “can this reach the internet?” from agent code into the shell (YAML blueprint) for enforcement, completely decoupling agent logic from security policy. This is the cleanest solution for scenarios where “agent logic upgrades frequently but enterprise policy cannot move”
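
The “shell as policy” principle can be illustrated with a small sketch in which the reachability decision lives in a policy object outside the agent code. Everything here is an assumption for illustration: EgressPolicy and is_allowed are hypothetical names, the host api.internal.example is made up, and this is not NemoClaw’s blueprint schema or its SSRF validator. The sketch also checks only literal IP addresses and does not resolve DNS, which a real validator would need to do.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet
from urllib.parse import urlparse


@dataclass(frozen=True)
class EgressPolicy:
    """Hypothetical policy object; in practice it would be loaded from a reviewed file."""
    allowed_hosts: FrozenSet[str] = frozenset()


def is_allowed(url: str, policy: EgressPolicy) -> bool:
    """Return True only if the URL targets an allowlisted host over http(s)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    # Defense in depth: reject literal private, loopback, and link-local
    # addresses even if someone adds them to the allowlist by mistake.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # not a literal IP address, so it is a hostname
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False
    return host in policy.allowed_hosts


policy = EgressPolicy(allowed_hosts=frozenset({"api.internal.example"}))
print(is_allowed("https://api.internal.example/v1/reports", policy))
print(is_allowed("https://unknown.example/upload", policy))
```

The agent code never decides what is reachable; it only asks. Changing the allowlist is then a reviewed change to the policy, independent of any agent upgrade.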
Do not copy blindly
Someone else’s .md reflects someone else’s workflow. OpenClaw’s SOUL.md belongs to the OpenClaw team; Hermes-Agent’s MEMORY.md frozen-snapshot assumptions belong to Hermes-Agent. Copying directly often imports constraints you do not need, and may even introduce security risks (see 03-3 Security, Privacy, and Supply Chain Risks). Understand “why it was designed this way” before deciding whether to adopt the design.

4. Questions to ask yourself: is your agent configuration scattered implicitly or made explicit in files

Translate the shared design of these three frameworks into self-directed audit questions:
  • Is your current agent behavior captured in versioned SKILL.md files, rather than scattered across system prompt strings?
  • Can you see “what new behavior the agent learned” in a Pull Request diff?
  • Does background memory updating have an auditable boundary, rather than silently consuming the main flow’s context?
  • If your agent configuration were handed off to another person, could they understand the entire harness in five minutes?
  • When your agent upgrades its upstream, will the hardening layer or customizations be preserved as-is, or must they be rebased?
  • Are your “can it reach the internet?” and “can it read this directory?” decisions at the code level or the policy level (policy-as-code)?
If you answered “no” to two or more of the first three questions, you are in a state where “agent behavior is neither reproducible nor auditable”, which is exactly the problem OpenClaw, Hermes, and NemoClaw are built to solve.
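
The second question has a concrete shape: if a Skill lives in a file, a newly learned step shows up as an added line in a diff. The sketch below uses Python’s standard difflib to show this; the two Skill versions and the path skills/pdf-summarize/SKILL.md are made-up illustration data, and skill_diff and added_lines are hypothetical helper names.

```python
import difflib
from typing import List

# Two hypothetical versions of the same Skill section
OLD = """## How to do it
- Extract text from each page with pdfplumber
- Output 5 bullet points
"""

NEW = """## How to do it
- Extract text from each page with pdfplumber
- If over 8k tokens, chunk first then summarize
- Output 5 bullet points
"""


def skill_diff(old: str, new: str, path: str = "skills/pdf-summarize/SKILL.md") -> str:
    """Return a unified diff between two versions of a Skill file."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def added_lines(diff_text: str) -> List[str]:
    """Extract added lines from a unified diff, skipping the '+++' file header."""
    return [
        line[1:]
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


print(skill_diff(OLD, NEW))
print(added_lines(skill_diff(OLD, NEW)))
```

The same behavior change made inside a system prompt string would surface as one long modified line, which is much harder to review.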

5. Where things are heading: agent shell standardization, md files as personality and orchestration carriers

Several trends worth watching (as of 2026-05 / 2026-06):
  • SKILL.md + YAML frontmatter is becoming the de facto format. OpenClaw, Hermes-Agent, Claude Code skills, and the agentskills.io specification are all converging on this direction (agentskills.io specification as of 2026-05; needs source verification: it is unconfirmed whether a formal cross-framework standardization body or RFC exists). In the near term, a “minimum compatible SKILL.md subset across frameworks” is likely to emerge
  • Planning documents are becoming RAG inputs that agents can query. Files in .plans/ directories let an agent be asked “why was this designed this way” — the simplest source of self-explanation capability for an agent
  • NemoClaw pushes the trend one step further. It makes “enterprise policy” explicit as version-controllable YAML too (blueprint, policies, model-specific-setup), pushing the hardening layer from “every agent doing its own thing” toward “reference stack + policy-as-code.” NemoClaw’s plugin-not-fork approach to upgrading upstream OpenClaw — where the hardening layer doesn’t need to be rebased on upgrade — is likely to become the template for other personal agent frameworks entering enterprise environments
  • Future infrastructure directions: CI testing of agent behavior (OpenClaw already has .github/codeql/ for boundary quality scanning), automated Skill lifecycle management (the curator pattern), policy testing (NemoClaw’s preflight), and security audit units (the OpenShell sandbox boundary)
These are observations, not predictions
“Format convergence” and “no rebase needed for the hardening layer on upgrade” are two things that can be verified today; the other trends are still evolving. Use them as a sense of direction when designing your own agent configuration, not as inviolable mandates.

6. One everyday analogy

The three frameworks are like three styles of having a working assistant:
  • OpenClaw is like a pocket procedures manual: you write every SOP into it, and it follows the manual. The manual is yours (version-controlled); it is just the executor
  • Hermes-Agent is like an intern with a learning journal: every morning it reconstitutes itself from three documents (SOUL.md + MEMORY.md + USER.md), writes in its journal after work, and reads it tomorrow. New journal entries do not take effect immediately; that discipline comes at the cost of some overhead
  • NemoClaw is like putting either type of assistant inside a reinforced glass display case: the assistant itself has not changed (OpenClaw is still OpenClaw, Hermes is still Hermes), but the case determines what it can touch, what it cannot, who can see it, and who it can see. The case’s policy (YAML blueprint) is version-controlled, and the assistant’s personality (SOUL.md) is version-controlled too; the two layers are managed independently
Which one you choose depends on your scenario: SOP discipline, growth potential, or enterprise acceptability. Most people start with OpenClaw or Hermes, then add NemoClaw when they hit an enterprise or always-on requirement.
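
The “reconstitutes itself every morning” image corresponds to assembling the system prompt from layers in a fixed order. The following is a minimal sketch of that idea only, not Hermes-Agent’s prompt_builder API. The function name build_system_prompt and the bracketed layer labels are hypothetical. The layer names (stable, context, volatile) come from this unit, but which file feeds which layer is an assumption here, apart from USER.md, which the comparison table marks as volatile.

```python
from typing import Mapping

# Fixed assembly order; the layer names are taken from this unit
LAYER_ORDER = ("stable", "context", "volatile")


def build_system_prompt(layers: Mapping[str, str]) -> str:
    """Concatenate the layers in a fixed order, skipping empty or missing ones."""
    parts = []
    for name in LAYER_ORDER:
        text = layers.get(name, "").strip()
        if text:
            parts.append(f"[{name}]\n{text}")
    return "\n\n".join(parts)


# Assumed mapping for illustration: SOUL.md -> stable,
# MEMORY.md snapshot -> context, USER.md -> volatile
prompt = build_system_prompt(
    {
        "volatile": "User prefers short answers.",
        "stable": "You are a careful research assistant.",
        "context": "Snapshot: the user is preparing a quarterly report.",
    }
)
print(prompt)
```

Because the order is fixed regardless of how the layers are supplied, two assembled prompts differ only in the layers whose source files changed.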

Hands-on exercises

  • Compare Skills across frameworks: Clone both OpenClaw and Hermes-Agent, locate each one’s Skill directory, and compare the frontmatter structure and body style of an existing SKILL.md. Answer: “Can Skills from these two frameworks be ported to each other?” Experiment: put the simplest possible SKILL.md into OpenClaw’s skills/ and Hermes’s skills/ and check whether it gets loaded at startup.
  • Externalize your own agent configuration: Pick three operational procedures you currently express as system prompt strings and try rewriting them as a SKILL.md draft (YAML frontmatter + Markdown body). Add it to version control, open a PR to review it yourself, and check whether the diff is the kind of “small, focused change” you expected.
  • Read a NemoClaw blueprint: Clone the NemoClaw repository, read nemoclaw-blueprint/blueprint.yaml and one example from nemoclaw-blueprint/policies/, and absorb the shape of policy-as-code. Ask yourself: does your company (or lab) have an equivalent practice of “making policy explicit in files”?

Common pitfalls

  • Thinking “I have a .md file” equals “I have version control”: Writing a SKILL.md but not committing it to Git, or overwriting it every time without leaving a diff, is the same as not externalizing it. An md file not in version control is no different from a system prompt string
  • Conflating “memory (personal preferences and facts)” with “Skill (reproducible operational procedures)”: Hermes-Agent’s memory management explicitly evaluates these two categories separately and they should not be merged. Personal preferences go into MEMORY.md / USER.md; operational procedures go into SKILL.md
  • Misreading Hermes-Agent’s self-improvement capability as “the agent will autonomously change its own behavior in unpredictable ways”: In practice, MEMORY.md frozen snapshots, curator-only-archives behavior, and tool allowlist isolation are all guardrails that actively constrain unexpected mutations. Self-improvement does not equal unpredictability
  • Thinking NemoClaw replaces OpenClaw or Hermes: NemoClaw is a hardened shell with no AIAgent main loop of its own. Without OpenClaw or Hermes running inside it, NemoClaw has no purpose
  • Copying someone else’s SOUL.md or SKILL.md: See the warning in Section 3 — someone else’s configuration reflects their own workflow and constraints
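
The “archives only, never deletes; pinned Skills are exempt” guardrail mentioned in this unit can be sketched as a small state transition. This is an illustration under stated assumptions, not Hermes-Agent’s curator: SkillRecord, curate, the days_unused staleness criterion, and the 30-day threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SkillRecord:
    name: str
    days_unused: int          # hypothetical staleness signal
    pinned: bool = False
    state: str = "active"     # "active" or "archived"; there is no "deleted" state


def curate(skills: List[SkillRecord], archive_after_days: int = 30) -> List[str]:
    """Archive stale, unpinned skills in place and return the names that moved."""
    moved = []
    for skill in skills:
        if skill.pinned or skill.state != "active":
            continue  # pinned skills are exempt from automatic transitions
        if skill.days_unused >= archive_after_days:
            skill.state = "archived"  # archived, never removed from the list
            moved.append(skill.name)
    return moved


skills = [
    SkillRecord("pdf-summarize", days_unused=90),
    SkillRecord("incident-runbook", days_unused=90, pinned=True),
    SkillRecord("daily-digest", days_unused=1),
]
moved = curate(skills)
print(moved)
print([(s.name, s.state) for s in skills])
```

The list never shrinks, so every transition remains visible and reversible, which is what makes this kind of self-improvement auditable.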

Self-check

The bar for passing this unit
  1. Without looking at the documentation, can you fill in a three-column table (OpenClaw, Hermes-Agent, your current agent) covering: Skill storage location, memory lifecycle design, and background evaluation mechanism?
  2. Can you articulate the role difference among OpenClaw, Hermes-Agent, and NemoClaw? (Hint: which are agent bodies, and which is the hardened shell?)
  3. Can you see “what new behavior the agent learned” in a Pull Request diff? If not, is the most opaque part technical debt or an intentional decision?
  4. Is your workflow “personal experimentation,” “always-on daemon,” or “enterprise production”? Which combination does that point to (bare OpenClaw / bare Hermes / OpenClaw + NemoClaw / Hermes + NemoClaw)?

Sources and further reading

Factual claims are grounded in official documentation; fast-changing items are annotated as of 2026-05.