05-4 OpenClaw, Hermes-Agent, and NemoClaw: Three Routes Compared

What this unit solves

Three projects ask the same question: “Can I see and version-control what this agent has learned?” But they approach it differently. OpenClaw is a TypeScript personal AI assistant framework that writes every Skill as skills/<name>/SKILL.md, making the agent harness readable, version-controlled, and auditable. Hermes-Agent is a self-improving agent built in Python by Nous Research; it stores cumulative procedural knowledge in a skills/ directory and uses prompt_builder’s three-layer system prompt assembly (stable, context, volatile) to inject personality, instructions, and memory in a disciplined way. NemoClaw is NVIDIA’s reference stack, open-sourced in 2026-03, that wraps OpenClaw or Hermes inside an OpenShell sandbox and adds hardening layers such as YAML blueprints, network policies, SSRF validation, and managed inference, turning a personal agent into a service acceptable to enterprise environments. This unit compares the three and covers what to learn from them, which questions to ask yourself, and where things are heading. For a deep dive into NemoClaw, see 05-3.

Learning objectives

Use a single comparison table to articulate the key differences among the three in positioning, core language, Skill/harness design, memory mechanisms, and enterprise readiness
Explain the shared design orientation of all three: making an agent’s “planning and personality” explicit as human-readable, version-controlled .md files
Concretely state what can be learned from each framework and which self-directed questions to ask when putting them into practice
Assess the trend toward agent shell standardization and locate where you currently stand
Explain the role difference between NemoClaw and OpenClaw / Hermes: agent body vs. hardened shell

1. Comparison tables: positioning, technology, Skill design, memory, and community

1.1 One-sentence positioning for each route

Project	One-sentence positioning	Role
OpenClaw	“A personal daemon that acts for you across 20+ communication channels”	Agent body (personal)
Hermes-Agent	“A self-improving agent that reassembles its own personality every morning”	Agent body (personal)
NemoClaw	“A hardened shell that puts either of the above agents inside an OpenShell sandbox”	Hardened shell (enterprise)

OpenClaw and Hermes-Agent solve “how the agent does things”; NemoClaw solves “how enterprises can accept an always-on agent.” These three are not competing — they are additive: NemoClaw wraps OpenClaw or Hermes and provides the hardening layer.

1.2 Full-dimension comparison

Dimension	OpenClaw (05-1)	Hermes-Agent (05-2)	NemoClaw (05-3)
Role	Agent body (personal)	Agent body (personal)	Hardened shell (enterprise)
Core language	TypeScript (Node.js)	Python (`uv` managed)	TypeScript + YAML + Bash
License	MIT	MIT	Apache 2.0
Inference backend	Multiple providers (Anthropic, OpenAI, etc.)	Any OpenAI-compatible API	Managed inference + `model-specific-setup` registry
Sandbox	Optional (`agents.defaults.sandbox`)	6 terminal backend options	Mandatory (OpenShell)
Network policy	Tool level	Tool level	Blueprint + policy engine
Skill storage	`skills/<name>/SKILL.md`	`skills/<name>/SKILL.md`	`.agents/skills/` (three-tier audience)
Skill loading	Full load	Progressive disclosure (Level 0 / Level 1)	Decision tree via `nemoclaw-skills-guide` skill
Memory refinement	`MEMORY.md`	`MEMORY.md` (frozen snapshot)	Inherited from inner agent
Detailed log	`memory/YYYY-MM-DD.md`	SQLite + FTS5	Inherited from inner agent
Heartbeat	`HEARTBEAT.md` + Gateway scheduler	Not explicitly stated in official documentation	Inner agent + blueprint control
Personality file	`SOUL.md` + `IDENTITY.md`	`SOUL.md`	Inherited from inner agent
User file	`USER.md`	`USER.md` (volatile)	Inherited from inner agent
Tool conventions	`TOOLS.md`	Merged into `AGENTS.md`	Blueprint declaration
Multi-channel routing	20+ platforms	Telegram, Discord, etc. (adapter pattern)	Inherited via OpenClaw inner agent
Sub-agent delegation	Via cron / ACP	`delegate_task` (default 3 concurrent)	Inherited from inner agent
Enterprise ready	No (local-first personal)	No (local-first personal)	Yes (policy-as-code + vulnerability disclosure)
Policy expression	md files	md files	YAML blueprint + policies/
Upstream upgrade	Self-hosted	Self-hosted	Plugin rather than fork; no rebase needed

Why NemoClaw has so many “inherited from inner agent” entries

NemoClaw does not build from scratch; it puts OpenClaw or Hermes wholesale into an OpenShell sandbox. So OpenClaw’s SOUL.md mechanism, Hermes’s SQLite memory, and heartbeat scheduling are all “inherited” from the inner agent. NemoClaw’s own strength is in the shell: blueprints, policies, inference routing, and the vulnerability disclosure process.
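The “Skill loading” row in the table contrasts full load with progressive disclosure (Level 0 / Level 1). The sketch below is one possible reading of that contrast, not Hermes-Agent’s actual loader: it assumes Level 0 means “only a one-line index entry per skill” and Level 1 means “the full body of one skill, fetched when needed.” The `Skill` class and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    body: str  # full Markdown instructions from SKILL.md


def full_load(skills: list[Skill]) -> str:
    """Full load: every skill body goes into the prompt up front."""
    return "\n\n".join(f"# {s.name}\n{s.body}" for s in skills)


def level0_index(skills: list[Skill]) -> str:
    """Assumed Level 0: one index line per skill, no bodies."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def level1_expand(skills: list[Skill], name: str) -> str:
    """Assumed Level 1: return one skill's full body when it is needed."""
    for s in skills:
        if s.name == name:
            return s.body
    raise KeyError(f"unknown skill: {name}")


# Hypothetical skills with deliberately long bodies.
SKILLS = [
    Skill("pdf-summarize", "Summarize a PDF report", "Step-by-step instructions. " * 50),
    Skill("send-digest", "Send a daily digest", "More instructions. " * 50),
]

if __name__ == "__main__":
    print(len(full_load(SKILLS)), "characters with full load")
    print(len(level0_index(SKILLS)), "characters with the Level 0 index only")
```

Under these assumptions, the index stays small as the skill count grows, and the cost of a full body is paid only for the skill that is actually used.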

2. Common ground: making planning and orchestration explicit as readable, version-controlled md files

The three projects look technically divergent (TypeScript vs Python vs TypeScript+YAML), but their underlying design philosophy is highly consistent: pulling an agent’s “personality, planning, and orchestration” out of implicit strings or closed-source binaries and putting it into human-readable, diffable, version-controlled .md files. Concrete manifestations:

All three use SKILL.md as the primary medium for expressing capabilities (YAML frontmatter prefix, Markdown body)
All three isolate personality (SOUL.md) into its own file rather than embedding it in a system prompt string
All three keep the user side (USER.md) and project side (AGENTS.md) in separate files
NemoClaw goes further and makes “enterprise policy” explicit as YAML too (blueprint, policies, model-specific-setup), so the hardening layer also goes through version control and review

The engineering significance of this common ground: once the configuration layer that would normally be buried in a system prompt or in code is pulled into a repo, agent behavior becomes PR-reviewable, diffable, and transferable to the next person. People are not fixed; config files are. Config files that are version-controlled make agent behavior reproducible across different sessions, machines, and team members.
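To make “diffable” concrete, here is a minimal sketch that uses Python’s standard `difflib` to show what a reviewer would see when one rule is added to a skill file. The skill name `send-digest` and its contents are hypothetical; in practice Git produces this diff for you.

```python
import difflib

# Two versions of a hypothetical skills/send-digest/SKILL.md section.
OLD_SKILL = (
    "## What not to do\n"
    "- Do not send email without confirmation\n"
)
NEW_SKILL = OLD_SKILL + "- Do not email external domains\n"


def skill_diff(old: str, new: str, path: str = "skills/send-digest/SKILL.md") -> str:
    """Return a unified diff between two versions of a skill file."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


if __name__ == "__main__":
    print(skill_diff(OLD_SKILL, NEW_SKILL))
```

The single added line is the whole behavioral change, which is the property that makes a skill edit reviewable in a PR.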

What “how to use a new tool” looks like across the three frameworks

Suppose you want to give an agent a “read a PDF report and summarize it” skill:

# SKILL.md (works for both OpenClaw and Hermes)
---
name: pdf-summarize
description: Read a PDF report and produce a 5-point summary
---

## When to use
When the user provides a PDF file path and requests a summary.

## How to do it
1. Extract text from each page with pdfplumber
2. Count tokens with tiktoken
3. If over 8k tokens, chunk first then summarize
4. Output 5 bullet points

## What not to do
- Do not upload PDF content to any external API
- Do not assume the user's research domain

This file is readable by all three frameworks, version-controllable in all three, and discussable in a PR review in all three. The difference is that NemoClaw adds a blueprint declaration stating “which internal APIs this skill can call, and which external hosts it cannot reach.”
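As an illustration of why this format is easy to tool around, the following sketch splits a SKILL.md string into frontmatter and body using only the standard library. It is not the loader of any of the three frameworks. It handles only flat `key: value` pairs (a real loader would use a YAML parser), and treating `name` and `description` as required is an assumption based on the example above.

```python
REQUIRED_FIELDS = ("name", "description")  # assumption, taken from the example

SAMPLE = """\
---
name: pdf-summarize
description: Read a PDF report and produce a 5-point summary
---

## When to use
When the user provides a PDF file path and requests a summary.
"""


def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter dict, Markdown body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    # Find the line that closes the frontmatter block.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("frontmatter block is never closed")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def missing_fields(meta: dict[str, str]) -> list[str]:
    """List required frontmatter fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not meta.get(field)]


if __name__ == "__main__":
    meta, body = parse_skill(SAMPLE)
    print(meta["name"], "| missing:", missing_fields(meta))
```

A check like `missing_fields` could run in CI so that a malformed skill file fails review before an agent ever loads it.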

3. What to learn: harness externalization, reproducibility, auditability

Five design principles worth internalizing from these three frameworks:

Skill as code: Writing an agent’s operational procedures as .md and committing them to version control, just as you write tests as .spec files, is the minimum viable practice for engineering agent behavior. SKILL.md is in the same category as “an agent’s unit test”
Memory has a lifecycle: Hermes-Agent’s curator only archives, never deletes; pinned Skills are exempt from automatic transitions. This design prevents the agent from silently forgetting critical knowledge. OpenClaw’s MEMORY.md + memory/YYYY-MM-DD.md layering follows the same lifecycle thinking
Tool allowlists isolate background tasks: Hermes-Agent’s background_review.py forks an evaluation agent that can only call memory and Skill management tools; the official documentation describes this as its background-task isolation mechanism. NemoClaw’s SSRF validation and network policies embody the same isolation philosophy: keeping “what can be acted upon” and “what is being acted upon” separately auditable
.plans/ or blueprint as decision log: Keeping feature planning documents in the repository means future readers (including agents) can see “why this was designed this way” and not just “what was implemented.” NemoClaw’s nemoclaw-blueprint/blueprint.yaml is the hardened version of this idea
Shell as policy: NemoClaw moves security decisions such as “can this reach the internet?” from agent code into the shell (YAML blueprint) for enforcement, completely decoupling agent logic from security policy. This is the cleanest solution for scenarios where “agent logic upgrades frequently but enterprise policy cannot move”
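The tool-allowlist principle in the list above can be sketched in a few lines. This is an illustration of the idea only, not the code in background_review.py: the tool names, the stub implementations, and the `call_tool` dispatcher are all hypothetical.

```python
from typing import Callable, Optional


# Hypothetical stub tools; none of them touches the real system.
def memory_write(arg: str) -> str:
    return f"memory updated: {arg}"


def skill_archive(arg: str) -> str:
    return f"skill archived: {arg}"


def shell_exec(arg: str) -> str:
    return f"ran: {arg}"


ALL_TOOLS: dict[str, Callable[[str], str]] = {
    "memory_write": memory_write,
    "skill_archive": skill_archive,
    "shell_exec": shell_exec,
}

# The background reviewer may manage memory and skills, nothing else.
BACKGROUND_ALLOWLIST = frozenset({"memory_write", "skill_archive"})


class ToolNotAllowed(PermissionError):
    """Raised when a caller requests a tool outside its allowlist."""


def call_tool(name: str, arg: str, allowlist: Optional[frozenset] = None) -> str:
    """Dispatch a tool call; enforce the allowlist when one is given."""
    if allowlist is not None and name not in allowlist:
        raise ToolNotAllowed(f"{name} is not permitted for this agent")
    return ALL_TOOLS[name](arg)


if __name__ == "__main__":
    print(call_tool("memory_write", "user prefers short answers", BACKGROUND_ALLOWLIST))
    try:
        call_tool("shell_exec", "ls", BACKGROUND_ALLOWLIST)
    except ToolNotAllowed as exc:
        print("blocked:", exc)
```

Because the allowlist is data rather than scattered `if` statements, the boundary of the background task can be read and reviewed in one place.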

Do not copy blindly

Someone else’s .md reflects someone else’s workflow. OpenClaw’s SOUL.md belongs to the OpenClaw team; Hermes-Agent’s MEMORY.md frozen-snapshot assumptions belong to Hermes-Agent. Copying directly often imports constraints you do not need, and may even introduce security risks (see 03-3 Security, Privacy, and Supply Chain Risks). Understand “why it was designed this way” before deciding whether to adopt the design.

4. Questions to ask yourself: is your agent configuration scattered implicitly or made explicit in files

Translate the shared design of these three frameworks into self-directed audit questions:

Is your current agent behavior scattered across system prompt strings, or do you have versioned SKILL.md files?
Can you see “what new behavior the agent learned” in a Pull Request diff?
Does background memory updating have an auditable boundary, or does it silently consume the main flow’s context?
If your agent configuration were handed off to another person, could they understand the entire harness in five minutes?
When your agent upgrades its upstream, will the hardening layer or customizations be preserved as-is, or must they be rebased?
Are your “can it reach the internet?” and “can it read this directory?” decisions at the code level or the policy level (policy-as-code)?

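If two or more of the first three questions land on the unfavorable side (behavior scattered across prompt strings, learned behavior invisible in diffs, background memory updates with no auditable boundary), you are in a state where “agent behavior is neither reproducible nor auditable,” which is exactly the problem OpenClaw, Hermes, and NemoClaw are built to solve.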

5. Where things are heading: agent shell standardization, md files as personality and orchestration carriers

Several trends worth watching (as of 2026-05 / 2026-06):

SKILL.md + YAML frontmatter is becoming the de facto format. OpenClaw, Hermes-Agent, Claude Code skills, and the agentskills.io specification are all converging on this direction (as of 2026-05; to find out whether a formal cross-framework standardization body or RFC exists, check the agentskills.io official page). In the near term, a “minimum compatible SKILL.md subset across frameworks” is likely to emerge
Planning documents are becoming RAG inputs that agents can query. Files in .plans/ directories let an agent be asked “why was this designed this way” — the simplest source of self-explanation capability for an agent
NemoClaw pushes the trend one step further. It makes “enterprise policy” explicit as version-controllable YAML too (blueprint, policies, model-specific-setup), pushing the hardening layer from “every agent doing its own thing” toward “reference stack + policy-as-code.” NemoClaw’s plugin-not-fork approach to upgrading upstream OpenClaw — where the hardening layer doesn’t need to be rebased on upgrade — is likely to become the template for other personal agent frameworks entering enterprise environments
Future infrastructure directions: CI testing of agent behavior (OpenClaw already has .github/codeql/ for boundary quality scanning), automated Skill lifecycle management (the curator pattern), policy testing (NemoClaw’s preflight), and security audit units (the OpenShell sandbox boundary)
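“Policy testing” in the list above can be pictured as ordinary unit tests run against a policy file. The sketch below assumes a hypothetical policy shape (a set of allowed hostnames) and a hypothetical `is_request_allowed` check; it is not NemoClaw’s blueprint schema or its preflight implementation. It also rejects private, loopback, and link-local IP literals, which is one common ingredient of SSRF validation.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical policy; a real one would be loaded from a version-controlled file.
POLICY = {"allowed_hosts": {"inference.internal.example", "api.github.com"}}


def is_request_allowed(url: str, policy: dict) -> bool:
    """Return True only if the URL's host passes the policy."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a name, not an IP literal
    # Refuse internal addresses even if someone adds them to the allowlist.
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False
    return host in policy["allowed_hosts"]


if __name__ == "__main__":
    for url in (
        "https://api.github.com/repos",
        "https://evil.example/",
        "http://169.254.169.254/latest/meta-data",
    ):
        print(url, "->", is_request_allowed(url, POLICY))
```

This sketch does not resolve DNS, so a hostname that points at an internal address would slip through; a production validator would also check the resolved addresses.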

These are observations, not predictions

“Format convergence” and “no rebase needed for the hardening layer on upgrade” are two things that can be verified today; the other trends are still evolving. Use them as a sense of direction when designing your own agent configuration, not as inviolable mandates.

6. One everyday analogy

The three frameworks are like three styles of having a working assistant:

OpenClaw is like a pocket procedures manual: you write every SOP into it, and it follows the manual. The manual is yours (version-controlled); it is just the executor
Hermes-Agent is like an intern with a learning journal: every morning it reconstitutes itself from three documents (SOUL.md + MEMORY.md + USER.md), writes in its journal after work, and reads it tomorrow. What it writes today does not take effect immediately; that discipline comes at the cost of some overhead
NemoClaw is like putting either assistant inside a reinforced glass display case: the assistant itself has not changed (OpenClaw is still OpenClaw, Hermes is still Hermes), but the case determines what it can touch, what it cannot, who can see it, and who it can see. The case’s policy (YAML blueprint) is version-controlled, and the assistant’s personality (SOUL.md) is version-controlled too; the two layers are managed independently

Which one you choose depends on your scenario: SOP discipline, growth potential, or enterprise acceptability. Most people start with OpenClaw or Hermes, then add NemoClaw when they hit an enterprise or always-on requirement.

Selection guide

Hands-on exercises

Compare Skills across frameworks: Clone both OpenClaw and Hermes-Agent, locate each one’s Skill directory, and compare the frontmatter structure and body style of an existing SKILL.md. Answer: “Can Skills from these two frameworks be ported to each other?” Experiment: put the simplest possible SKILL.md into OpenClaw’s skills/ and Hermes’s skills/ and check whether it gets loaded at startup.
Externalize your own agent configuration: Pick three operational procedures you currently express as system prompt strings and try rewriting them as a SKILL.md draft (YAML frontmatter + Markdown body). Add it to version control, open a PR to review it yourself, and check whether the diff is the kind of “small, focused change” you expected.
Read a NemoClaw blueprint: Clone the NemoClaw repository, read nemoclaw-blueprint/blueprint.yaml and one example from nemoclaw-blueprint/policies/, and absorb the shape of policy-as-code. Ask yourself: does your company (or lab) have an equivalent practice of “making policy explicit in files”?

Common pitfalls

Thinking “I have a .md file” equals “I have version control”: Writing a SKILL.md but not committing it to Git, or overwriting it every time without leaving a diff, is the same as not externalizing it. An md file not in version control is no different from a system prompt string
Conflating “memory (personal preferences and facts)” with “Skill (reproducible operational procedures)”: Hermes-Agent’s memory management explicitly evaluates these two categories separately and they should not be merged. Personal preferences go into MEMORY.md / USER.md; operational procedures go into SKILL.md
Misreading Hermes-Agent’s self-improvement capability as “the agent will autonomously change its own behavior in unpredictable ways”: In practice, MEMORY.md frozen snapshots, curator-only-archives behavior, and tool allowlist isolation are all guardrails that actively constrain unexpected mutations. Self-improvement does not equal unpredictability
Thinking NemoClaw replaces OpenClaw or Hermes: NemoClaw is a hardened shell with no AIAgent main loop of its own. Without OpenClaw or Hermes running inside it, NemoClaw has no purpose
Copying someone else’s SOUL.md or SKILL.md: See the warning in Section 3 — someone else’s configuration reflects their own workflow and constraints
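One of the guardrails named in the Hermes-Agent pitfall above, a curator that only archives and never deletes, with pinned Skills exempt, can be sketched as follows. This is an illustration under assumptions, not Hermes-Agent’s curator: the `SkillRecord` fields and the “idle for more than N days” staleness rule are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SkillRecord:
    name: str
    last_used: date
    pinned: bool = False
    state: str = "active"  # "active" or "archived"; there is no "deleted"


def curate(skills: list[SkillRecord], today: date, max_idle_days: int = 30) -> list[str]:
    """Archive stale, unpinned skills in place and return their names."""
    archived = []
    for skill in skills:
        idle_days = (today - skill.last_used).days
        if skill.state == "active" and not skill.pinned and idle_days > max_idle_days:
            skill.state = "archived"  # a state change, never a removal
            archived.append(skill.name)
    return archived


if __name__ == "__main__":
    today = date(2026, 5, 1)
    skills = [
        SkillRecord("old", date(2026, 1, 1)),
        SkillRecord("old-but-pinned", date(2026, 1, 1), pinned=True),
        SkillRecord("recent", date(2026, 4, 28)),
    ]
    print("archived:", curate(skills, today))
    print("still tracked:", [s.name for s in skills])
```

Every record survives the pass, so an archived skill can be restored and the transition itself shows up as a reviewable state change.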

Self-check

The bar for passing this unit

Without looking at the documentation, can you fill in a three-column table (OpenClaw, Hermes-Agent, your current agent) covering: Skill storage location, memory lifecycle design, and background evaluation mechanism?
Can you articulate the role difference among OpenClaw, Hermes-Agent, and NemoClaw? (Hint: which are agent bodies, and which is the hardened shell?)
Can you see “what new behavior the agent learned” in a Pull Request diff? If not, is the most opaque part technical debt or an intentional decision?
Is your workflow “personal experimentation,” “always-on daemon,” or “enterprise production”? Which combination does that point to (bare OpenClaw / bare Hermes / OpenClaw + NemoClaw / Hermes + NemoClaw)?

Sources and further reading

Factual claims are grounded in official documentation; fast-changing items are annotated as of 2026-05.

[1] OpenClaw, “openclaw/openclaw,” GitHub. https://github.com/openclaw/openclaw (as of 2026-05)
[2] OpenClaw, “openclaw.ai,” Official website. https://openclaw.ai (as of 2026-05)
[3] Nous Research, “NousResearch/hermes-agent,” GitHub. https://github.com/NousResearch/hermes-agent (as of 2026-05)
[4] Nous Research, “Hermes-Agent Documentation,” hermes-agent.nousresearch.com. https://hermes-agent.nousresearch.com/docs/ (as of 2026-05)
[5] NVIDIA, “NVIDIA/NemoClaw,” GitHub. https://github.com/NVIDIA/NemoClaw (as of 2026-06)
[6] NVIDIA, “NemoClaw Documentation,” docs.nvidia.com. https://docs.nvidia.com/nemoclaw/latest/ (as of 2026-06)
[7] agentskills.io, “Open Specification,” agentskills.io. https://agentskills.io/specification (as of 2026-05)

Deep dives: 05-1 OpenClaw dissection, 05-2 Hermes-Agent dissection, 05-3 NemoClaw dissection
Threat model behind the security boundaries: 03-3 Security, Privacy, and Supply Chain Risks
