04-4 Skills: Purpose, Format, Authoring, References and Scripts, Cross-Tool Differences

What this unit solves: Skills package recurring workflows into procedures that the model can invoke automatically, based on the description field. A skill is more than a single SKILL.md: depending on the required rigor and scope, it may include a references/ directory for few-shot templates and specs, and a scripts/ directory for executable tool scripts. This unit walks through building a practical skill from scratch, explains how to write the description, uses progressive disclosure to manage token costs, and compares equivalent mechanisms across tools.

Learning objectives

Write a correctly structured SKILL.md with a description and the necessary frontmatter fields.
Decide when to create a references/ subdirectory (few-shot templates, spec documents) and a scripts/ subdirectory (executable scripts).
Use progressive disclosure to control token consumption so a complex skill does not dump all its content into context on every invocation.
Decide whether a given requirement belongs in a Skill, Rules, or Subagent, and articulate the criteria.
Compare equivalent skill-packaging mechanisms across tools and identify the key differences.

1. What a skill is

A skill is a directory placed at ~/.claude/skills/<name>/ or .claude/skills/<name>/, containing a required SKILL.md and optional references/ and scripts/. Claude invokes it automatically based on description semantics during conversation; users can also trigger it explicitly with /<name> [1]. The fundamental difference from CLAUDE.md:

Dimension	`CLAUDE.md`	Skill
Load time	Injected in full at every session start	Fully loaded only when the model judges it relevant, or the user types `/<name>`
Content cost	Paid every session, across the entire conversation	Body is injected only once the skill is triggered and then stays in the conversation; if never triggered, only the short description is in context
Appropriate content	General conventions, mandatory rules	Procedures, workflows, domain knowledge invoked by context

Skills are pulled in on demand. CLAUDE.md is like a desk calendar: always in front of you. A skill is like the manual in a drawer: you pull it out when you need to fix the plumbing, then put it back. Moving expensive content from “paying every day” to “paying only when used” is the core value of the skill mechanism.

Skills and Commands now share the same underlying mechanism. As of 2026-06, .claude/commands/<name>.md and .claude/skills/<name>/SKILL.md in Claude Code are backed by the same mechanism: both produce a /<name> command. The difference is that the latter has a directory where you can place references/ and scripts/, while the former is a single file [1]. This unit covers the latter (Skills); the design side of explicit triggering is in 04-3.

2. Directory structure

my-skill/
├── SKILL.md        required: main instructions and frontmatter
├── reference.md    optional: detailed API docs, loaded only when needed
├── examples/
└── scripts/

SKILL.md is the entry point; everything else is supporting material. When SKILL.md references reference.md or similar files via relative paths, Claude reads them on demand (see Section 6, progressive disclosure). Scripts in scripts/ are invoked by Claude through the Bash tool, not read into context.
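The layout above can be created by hand, but a small scaffold helper makes the structure concrete. The following is a minimal sketch, not part of any official tooling: the function name `scaffold_skill`, its parameters, and the placeholder template are all hypothetical, and only the directory and file names come from this section.

```python
from pathlib import Path

# Hypothetical starter content; only the frontmatter delimiters and the
# `description` field name come from this unit.
SKILL_TEMPLATE = """\
---
description: {description}
---

## Task
Describe the steps here.
"""


def scaffold_skill(root: Path, name: str, description: str,
                   with_references: bool = False,
                   with_scripts: bool = False) -> Path:
    """Create <root>/<name>/SKILL.md plus the optional subdirectories."""
    skill_dir = root / name
    skill_dir.mkdir(parents=True, exist_ok=True)

    skill_file = skill_dir / "SKILL.md"
    if not skill_file.exists():  # never overwrite an existing skill
        skill_file.write_text(
            SKILL_TEMPLATE.format(description=description), encoding="utf-8"
        )

    if with_references:
        (skill_dir / "references").mkdir(exist_ok=True)
    if with_scripts:
        (skill_dir / "scripts").mkdir(exist_ok=True)
    return skill_dir
```

For a project-level skill, `root` would be `Path(".claude") / "skills"`; for a personal skill, `Path.home() / ".claude" / "skills"`, matching the two locations given in Section 1.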

3. Complete SKILL.md frontmatter reference

---
name: my-skill
description: What this skill does and when to use it
when_to_use: Additional trigger phrases
argument-hint: [issue-number]
arguments:
  - issue
disable-model-invocation: false
user-invocable: true
allowed-tools: Read Grep
disallowed-tools: AskUserQuestion
model: sonnet
effort: high
context: fork
agent: Explore
hooks: {}
paths:
  - "src/api/**/*.{ts,tsx}"
shell: bash
---

Field by field (as of 2026-06, per the official Skills section [1]):

Field	Required	Purpose
`name`	Optional	Display name; defaults to the directory name. Only the root `SKILL.md` of a plugin uses this to determine the command name
`description`	Recommended	The basis on which the model decides whether to invoke the skill. `description` + `when_to_use` combined are truncated to 1,536 characters to save context
`when_to_use`	Optional	Additional trigger context and example prompts; can fill in edge cases `description` does not cover
`argument-hint`	Optional	Parameter hint shown during autocomplete, e.g. `[issue-number]` or `[filename] [format]`
`arguments`	Optional	Named positional parameters; can be a string or a list. Names map to positions in order
`disable-model-invocation`	Optional	Set `true` to prevent Claude from invoking automatically. Default `false`
`user-invocable`	Optional	Set `false` to hide from the `/` menu; Claude-only invocation. Default `true`
`allowed-tools`	Optional	Tools that do not require confirmation when this skill is active (does not restrict other tools from being used)
`disallowed-tools`	Optional	Tools removed from the available pool while this skill is active
`model`	Optional	Model to use when this skill is active; overrides the session model and reverts on the next turn
`effort`	Optional	Effort level (`low` / `medium` / `high` / `xhigh` / `max`); overrides session effort
`context`	Optional	Set to `fork` to execute in an isolated subagent context
`agent`	Optional	Subagent type to use when `context: fork` is set (`Explore`, `Plan`, custom)
`hooks`	Optional	Hooks scoped to this skill’s lifecycle
`paths`	Optional	Glob patterns restricting when this skill is considered for automatic invocation
`shell`	Optional	Shell for dynamic injection (`bash` by default; `powershell` requires `CLAUDE_CODE_USE_POWERSHELL_TOOL=1`)
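The two invocation flags in the table, disable-model-invocation and user-invocable, combine into a small truth table. The sketch below only restates the defaults and effects listed above; the class and function names are illustrative and do not correspond to anything in Claude Code itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationFlags:
    """Mirrors two frontmatter fields, with the defaults from the table."""
    disable_model_invocation: bool = False
    user_invocable: bool = True


def who_can_invoke(flags: InvocationFlags) -> dict[str, bool]:
    """Return who may trigger the skill: the user via /<name>, the model, or both."""
    return {
        "user": flags.user_invocable,
        "model": not flags.disable_model_invocation,
    }


# The three configurations discussed in this unit.
for label, flags in [
    ("defaults", InvocationFlags()),
    ("disable-model-invocation: true", InvocationFlags(disable_model_invocation=True)),
    ("user-invocable: false", InvocationFlags(user_invocable=False)),
]:
    print(label, who_can_invoke(flags))
```

Setting both flags to their restrictive values at once would leave nobody able to invoke the skill, which is worth checking for when reviewing frontmatter.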

allowed-tools does not block unlisted tools. It is only a “no confirmation required” list: tools not listed can still be used, governed by session permission settings. To block a specific tool, use a deny rule in settings.json [1].

4. Trigger control: description, `when_to_use`, and paths

How you write the description determines whether the model will invoke your skill. A poor description leads to false triggers or missed triggers.

Description writing compared

Too broad (will trigger on unrelated requests):

description: Help me with all kinds of writing tasks

Too narrow (will miss legitimate triggers):

description: Convert my weekend research journal into markdown

About right:

description: Organize research observations, insights, and action items from the conversation into a structured markdown log. Use when the user says "log this", "記下來", "做個筆記", or wants to capture ideas from a long conversation into a single doc.
when_to_use: Applicable when the user wants to save ideas they don't want to act on right now but need to preserve.

Key observation: including trigger keywords (colloquial phrases, English synonyms) in description makes it easier for the model to match.
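Because description and when_to_use share the 1,536-character cap mentioned in Section 3, it can help to measure them before relying on long trigger lists. This is a rough sketch under two stated assumptions: the cap is treated as the plain sum of both field lengths, and the frontmatter parser is a deliberately minimal stand-in (not a YAML parser) that reads only single-line top-level values. All function names are hypothetical.

```python
LIMIT = 1536  # combined character cap stated in Section 3


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse top-level `key: value` scalars between the first two '---' lines.

    Deliberately minimal: list items and indented (nested) lines are skipped,
    so a key such as `paths:` ends up with an empty string value.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the body is ignored
        if line.startswith((" ", "\t", "-")) or ":" not in line:
            continue  # list item, nested value, or blank line
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def trigger_text_length(fields: dict[str, str]) -> int:
    # Assumption: the cap applies to the plain sum of both field lengths.
    return len(fields.get("description", "")) + len(fields.get("when_to_use", ""))


def check_budget(text: str, limit: int = LIMIT) -> tuple[int, bool]:
    """Return (characters used, whether that is within the limit)."""
    used = trigger_text_length(read_frontmatter(text))
    return used, used <= limit
```

Calling `check_budget(Path("SKILL.md").read_text(encoding="utf-8"))` on a real skill file gives a quick signal of whether trailing trigger phrases are at risk of being cut off.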

paths further narrows the trigger condition so the skill is only considered when working on a certain class of files:

---
description: API design convention checker
paths:
  - "src/api/**/*.{ts,tsx}"
---

The model lists this skill as a candidate only when you are working with files that match the pattern, here .ts and .tsx files under src/api/.
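To see which files a pattern such as `src/api/**/*.{ts,tsx}` would cover, the glob can be approximated in a few lines. This is an illustrative matcher only: it assumes common glob semantics (`**/` spans zero or more directories, `*` stays within one path segment, `{a,b}` lists alternatives) and is not the matcher Claude Code actually uses, whose exact rules may differ.

```python
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand {a,b} groups into separate patterns, one group at a time."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded: list[str] = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a brace-free glob into a compiled regular expression."""
    parts: list[str] = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif glob.startswith("**", i):
            parts.append(r".*")           # anything, including slashes
            i += 2
        elif glob[i] == "*":
            parts.append(r"[^/]*")        # anything within one segment
            i += 1
        elif glob[i] == "?":
            parts.append(r"[^/]")         # exactly one non-slash character
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts))


def path_matches(path: str, patterns: list[str]) -> bool:
    """True if the path fully matches any pattern after brace expansion."""
    return any(
        glob_to_regex(expanded).fullmatch(path) is not None
        for pattern in patterns
        for expanded in expand_braces(pattern)
    )
```

With the pattern from the example, `src/api/users/list.ts` and `src/api/index.tsx` match, while `src/web/app.ts` and `src/api/readme.md` do not.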

5. Rigor levels: Level 1 to Level 3

Level 1: Single-file skill

Only a SKILL.md, 50 lines or fewer, no external dependencies. Good for quickly packaging a personal recurring workflow.

---
description: Format currently staged changes as a Conventional Commits message
disable-model-invocation: true
allowed-tools: Bash(git *)
---

## Staged diff
!`git diff --cached`

## Recent style
!`git log --oneline -5`

## Task
Write a Conventional Commits message for the staged diff above.

Level 2: Skill with few-shot examples

SKILL.md plus format examples or spec documents in references/. Good for scenarios requiring strict output format (academic abstracts, standard reports, contract clauses).

abstract-writer/
├── SKILL.md
└── references/
    ├── example-1.md     complete abstract example
    ├── example-2.md     edge case
    └── style-guide.md   style rules

SKILL.md references them:

## Instructions

1. Read the paper's abstract section.
2. Match the structure in `references/example-1.md`.
3. Follow section ordering in `references/style-guide.md`.
4. Write a 150-200 word summary.

Level 3: Skill with executable scripts

SKILL.md plus scripts in scripts/ (Python, Bash, Node.js, etc.). Good for tasks requiring real computation, external APIs, or batch processing; requires careful allowed-tools security configuration.

codebase-visualizer/
├── SKILL.md
└── scripts/
    └── visualize.py

Reference scripts in the same directory using ${CLAUDE_SKILL_DIR} so path resolution is independent of where the skill is installed [1]:

---
description: Generate an interactive collapsible code-tree visualization. Use when exploring a new repo.
allowed-tools: Bash(python3 *)
---

Run:
```bash
python3 ${CLAUDE_SKILL_DIR}/scripts/visualize.py .
```
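The source does not show the contents of scripts/visualize.py. As a purely hypothetical illustration of what such a script might contain, the sketch below renders a directory as nested HTML `<details>` elements, which browsers display as a collapsible tree. The function names, the skip list, and the output format are all assumptions.

```python
import html
from pathlib import Path

# Directories that would only add noise to the tree (an arbitrary choice).
SKIP = {".git", "node_modules", "__pycache__"}


def render_tree(directory: Path) -> str:
    """Return a <ul> with collapsible <details> for subdirectories."""
    items: list[str] = []
    # Directories first, then files; each group sorted case-insensitively.
    entries = sorted(directory.iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
    for entry in entries:
        if entry.name in SKIP or entry.is_symlink():
            continue  # skipping symlinks avoids cycles
        name = html.escape(entry.name)
        if entry.is_dir():
            items.append(
                f"<li><details><summary>{name}/</summary>"
                f"{render_tree(entry)}</details></li>"
            )
        else:
            items.append(f"<li>{name}</li>")
    return "<ul>" + "".join(items) + "</ul>"


def build_page(root: Path) -> str:
    """Wrap the tree in a minimal standalone HTML document."""
    title = html.escape(root.resolve().name)
    return f"<!doctype html><meta charset='utf-8'><title>{title}</title>{render_tree(root)}"
```

A real script would add a command-line entry point that writes `build_page(Path(sys.argv[1]))` to a file, so that the `python3 ... visualize.py .` call in SKILL.md produces an artifact Claude can point the user to.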

Start at Level 1 and upgrade when you have a reason. Do not skip levels. After a Level 1 skill is working, if you notice Claude’s output format drifting, add few-shot examples (upgrade to Level 2). If you find you need real computation or external APIs, add scripts (upgrade to Level 3). Level 3 is the heaviest to maintain and is not the right fit for every workflow.

6. Progressive disclosure

The core token management strategy for skills. When SKILL.md loads it enters context and persists for the remainder of the session [1], but subsidiary files (references/, scripts/) are loaded on demand. This means:

Simple skill: all of SKILL.md goes into context.
Complex skill: detailed specs, API docs, and few-shot examples live in references/; SKILL.md only describes “when to read which references/ file,” using Markdown links.

The benefit: the token cost per invocation is the size of SKILL.md plus only the references/ content actually read that turn, not the overall size of the skill.

Token savings from progressive disclosure. Consider an “academic abstract” skill whose SKILL.md is 30 lines with 5 Markdown links pointing to 5 format templates in references/, each 80 lines.

Without progressive disclosure: all 5 templates are stuffed into SKILL.md, so every invocation costs 30 + 5 x 80 = 430 lines.

With progressive disclosure: SKILL.md stays at 30 lines plus 5 links, and Claude reads a references/ file only when it actually needs that example. If a typical invocation reads one template, the cost is 30 + 80 = 110 lines, a saving of roughly 74%.

Under a 200k context window, 110 and 430 both look small. Multiplied across invocations and team size, however, the gap can add up to hundreds of dollars per month.
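The arithmetic in this example can be reproduced directly. The sketch below uses the line counts from the example and assumes, as the example does, that a typical invocation reads exactly one template; the function name is illustrative.

```python
def lines_per_invocation(skill_md: int, template: int, templates_read: int) -> int:
    """Lines entering context: SKILL.md plus the templates actually read."""
    return skill_md + template * templates_read


SKILL_MD_LINES = 30
TEMPLATE_LINES = 80
TEMPLATE_COUNT = 5

# Everything inlined into SKILL.md: all five templates load every time.
inlined = lines_per_invocation(SKILL_MD_LINES, TEMPLATE_LINES, TEMPLATE_COUNT)

# Progressive disclosure, assuming one template is read per invocation.
progressive = lines_per_invocation(SKILL_MD_LINES, TEMPLATE_LINES, 1)

savings = 1 - progressive / inlined
print(f"inlined={inlined} progressive={progressive} savings={savings:.1%}")
# prints: inlined=430 progressive=110 savings=74.4%
```

The saving shrinks as more templates are read in a single invocation and disappears entirely if all five are needed, so the technique pays off most when each task touches only a small part of the reference material.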

7. Skill content lifecycle

When a skill is triggered, the rendered content of SKILL.md (with $ARGUMENTS substituted and !`cmd` executed) enters the session conversation history as a single message and persists across subsequent turns [1]. When /compact summarizes the conversation to reclaim context, Claude Code reattaches the content of recently invoked skills, retaining up to 5,000 tokens per skill within a shared budget of 25,000 tokens that is filled starting from the most recently invoked skill. As a result, if too many skills are invoked in one session, the older ones are dropped entirely after compaction [1].
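The reattachment rule can be modeled as a simple budget allocation. The sketch below is a simulation under stated assumptions, not Claude Code's implementation: it assumes each skill counts once (its most recent invocation), and that a skill which only partly fits the remaining budget is truncated rather than skipped. Only the 5,000 and 25,000 token figures come from the text.

```python
PER_SKILL_CAP = 5_000    # maximum tokens retained per skill
SHARED_BUDGET = 25_000   # total tokens shared by all reattached skills


def reattach_after_compact(invocations: list[tuple[str, int]]) -> dict[str, int]:
    """Simulate which skills survive compaction.

    `invocations` lists (skill name, token count) pairs, oldest first.
    Returns tokens retained per skill, most recently invoked first.
    """
    retained: dict[str, int] = {}
    remaining = SHARED_BUDGET
    for name, tokens in reversed(invocations):  # newest first
        if name in retained:
            continue  # assumption: only the latest invocation counts
        if remaining <= 0:
            break     # budget exhausted: older skills are dropped entirely
        kept = min(tokens, PER_SKILL_CAP, remaining)
        retained[name] = kept
        remaining -= kept
    return retained
```

With seven skills of 6,000 tokens each, the five most recent are each cut to 5,000 tokens and the two oldest vanish, which is the situation the paragraph above warns about.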

A skill that seems to stop working is usually not lost. If a skill appears to have no effect after invocation, its content is typically still in context and the model simply chose a different tool for that turn. Strengthen the description and instruction wording, or use a Hook to enforce critical behaviors [1].

8. Dynamic context injection: !`command`

Inside SKILL.md, !`<command>` executes a shell command before the content is sent to Claude, embedding the output inline [1]:

## PR context
- Diff: !`gh pr diff`
- Comments: !`gh pr view --comments`
- Files: !`gh pr diff --name-only`

## Your task
Summarize this pull request in 3-5 bullets.

Rules:

Executes in the current shell (bash by default on macOS/Linux; set shell: powershell on Windows).
Output is inserted as plain text and will not be re-parsed as !` placeholders.
The inline form is recognized only when ! appears at the start of a line or immediately after whitespace; KEY=!`cmd` is not expanded.
Multi-line forms use a ```! fence.
Organizations can set "disableSkillShellExecution": true in settings to disable this globally (bundled and managed skills are not affected) [1].

Dynamic injection runs your actual shell. !`cmd` executes using your local shell, not a sandbox. If skill content comes from someone else (a shared repo, a plugin marketplace), the injected commands are effectively handing them shell access. Always review SKILL.md yourself before trusting it.
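Reviewing a third-party SKILL.md is easier with a list of every command it would run. The helper below is a rough sketch of such a review aid: it applies the recognition rules listed above (inline placeholders only at line start or after whitespace, plus fenced blocks opened with three backticks and `!`) and never executes anything. The function name is hypothetical, and the scan is intentionally conservative, so it may also report placeholders that appear inside ordinary code fences.

```python
import re

FENCE = "`" * 3  # built programmatically to keep this listing fence-safe

# Inline form: "!" at line start or after whitespace, then a backticked command.
INLINE = re.compile(r"(?<!\S)!`([^`\n]+)`")


def injected_commands(skill_md: str) -> list[str]:
    """List the shell commands a SKILL.md would inject, without running them."""
    commands: list[str] = []
    in_fence = False
    for line in skill_md.splitlines():
        stripped = line.strip()
        if in_fence:
            if stripped == FENCE:
                in_fence = False           # closing fence of the multi-line form
            elif stripped:
                commands.append(stripped)  # each non-empty line is a command
            continue
        if stripped == FENCE + "!":
            in_fence = True                # opening fence of the multi-line form
            continue
        commands.extend(INLINE.findall(line))
    return commands
```

Running this over the PR-context example in this section would report the three `gh` commands, while a line such as KEY=!`cmd` yields nothing because the `!` follows a non-whitespace character.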

9. Tool comparison

Concept	Anthropic Claude (primary)	OpenAI Codex	Google Antigravity	GitHub Copilot	Cursor
Skill packaging unit	`~/.claude/skills/<name>/SKILL.md` or `.claude/skills/<name>/SKILL.md`	`[[skills.config]]` enable control + searches `$HOME/.agents/skills/` up to repo root (path is `.agents/skills/`, not `.codex/skills/`)	IDE global `~/.gemini/config/skills/`; workspace always `.agents/skills/`	Personal `~/.copilot/skills/<name>/SKILL.md`, project `.github/skills/<name>/SKILL.md` (same name overrides personal)	Rules for AI (`.cursorrules` and `.cursor/rules/`; closer in nature to rules than skills) [3]
Automatic invocation	`description` semantic match + `paths` glob	Config-driven, integrated with commands / hooks	`description` semantic match; agent loads automatically based on task	Config-driven	Rule-based (via frontmatter `globs`)
Few-shot attachment directory	`references/`; other subdirectories named as needed	`references/` subdirectory	`references/`, `assets/` subdirectories	Configuration-based	Via sections within `.mdc` files themselves
Executable script directory	`scripts/`	`scripts/`	`scripts/`	Not part of the main flow	Not applicable
Tool permission scoping	`allowed-tools` / `disallowed-tools`	Configuration-based	Not applicable	Not applicable	Not applicable
Per-invocation model	`model` frontmatter field	Configuration-based	Not applicable	Not applicable	Not applicable
Explicit trigger	`/<name>` command	Integrated with slash commands	Agent auto-detection + `/` menu	Via `/` menu	Via `/` menu
Cross-tool standard	Follows the Agent Skills open standard	Partial	Partial (`SKILL.md` / `skills.md`)	Partial	Does not follow

Naming clarifications and important facts

Claude Code Skills follow the Agent Skills open standard (agentskills.io) [1]. Because this standard is shared across multiple AI tools, skills you write should, in principle, be installable in any compliant editor or CLI; the table above shows where support is only partial.
OpenAI Codex skill mechanism: configured in the [skills] section of config.toml; personal skills at ~/.agents/skills/<name>/SKILL.md, project skills at <repo>/.agents/skills/<name>/SKILL.md (trusted repos only; enabled per [[skills.config]] in config.toml). The path is .agents/skills/, not .codex/skills/ (per developers.openai.com/codex/skills, as of 2026-06).
GitHub Copilot “Skills” and “Copilot Extensions” are two separate things: the former is the local CLI’s SKILL.md mechanism; the latter is a third-party extension marketplace on GitHub. This table covers only the former.
Cursor has no native “Skill” mechanism; .cursor/rules/*.mdc is functionally closer to Claude’s rules, and .cursorrules is a single-file rules variant [3].

10. Pairing with Subagents

Skills provide procedure (steps, criteria, format); Subagents provide isolated context (a clean execution environment). Combining them (see 04-5):

---
name: deep-research
description: Research a topic thoroughly and return an annotated reference list
context: fork
agent: Explore
---

Research $ARGUMENTS thoroughly:

1. Use Glob and Grep to find relevant files
2. Read and analyze the source code
3. Summarize findings with specific file references

With context: fork, the skill content becomes the task prompt for a subagent. The subagent executes in a clean context and returns a summary to the main conversation [1]. When to use context: fork:

The skill needs to perform extensive reads or computation without polluting the main conversation context.
The task is parallelizable and each sub-task is independent.
The main conversation context is near its limit and isolation is needed.

context: fork requires the skill to contain an actual task. If skill content is purely normative (“follow these API conventions,” no explicit task), context: fork delivers “conventions” to a subagent that has nothing to do, and returns an empty result. context: fork is appropriate for procedural skills with concrete steps [1].

Hands-on exercises

Start at Level 1

Pick a workflow you repeat at least three times a week and write a SKILL.md of 30 lines or fewer, with a description and 2-3 Markdown steps. Place it at ~/.claude/skills/<name>/SKILL.md and start a session to test both automatic triggering and explicit /<name> invocation.

Upgrade to Level 2

Extend the skill from the previous exercise by adding references/example.md with a complete few-shot example, and reference it via a Markdown link in SKILL.md. Compare Claude’s output quality with and without the example.

Observe token cost

Invoke the skill multiple times within the same session, then run /context to see how much context the skill content occupies. Cross-check against the SKILL.md line count to verify that skill content enters context in its entirety, as designed.

Test trigger control

Write two skills: one with disable-model-invocation: true and one with user-invocable: false. Verify each invocation permission separately for both user-triggered and model-triggered paths.

Common pitfalls

Anti-pattern list

description too broad (“help me with all kinds of tasks”): the model triggers on unrelated contexts and pollutes context. Narrow to specific, identifiable trigger conditions and keywords.
Stuffing all spec documents into SKILL.md: every invocation dumps the full token payload. Use references/ with progressive disclosure so content is read only when needed.
Skipping Level 1 and writing Level 3 directly: over-engineering a requirement that 30 lines would solve. Verify the workflow is worth packaging before deciding whether to upgrade.
Running shell scripts in Level 3 without setting allowed-tools: both !`cmd` in SKILL.md and execution calls in scripts/ should be explicitly permitted by allowed-tools. Without this, Claude operates under current session permissions, which may trigger large numbers of confirmations or be blocked entirely.
Committing sensitive information in a public repo’s skill: API keys, absolute paths, and internal URLs all get read into context and committed to version control. Values that need to be injected dynamically should come from environment variables or dynamic shell commands, not be hardcoded in SKILL.md.
Ignoring description truncation: a skill’s description + when_to_use combined must stay within 1,536 characters to appear in full in the skill list [1]. Content beyond that is cut off and invisible to the model.

Self-check

The bar for passing this unit

Can you write a SKILL.md with complete frontmatter and a description with precise trigger conditions?
Can you decide whether a requirement belongs in a Skill, Rules, a Command, or a Subagent, and articulate the decision criteria?
Can you use progressive disclosure to break a 400-line procedure into a 30-line SKILL.md plus a references/ subdirectory?
Do you understand the lifecycle of skill content once it enters context (the /compact behavior, the 25k token budget)?

Sources and further reading

Factual claims are grounded in official documentation; fast-changing items are annotated as of 2026-06.

[1] Anthropic, “Extend Claude with skills,” code.claude.com, 2026. [Online]. Available: https://code.claude.com/docs/en/skills (as of 2026-06; includes complete frontmatter reference, references/ and scripts/, dynamic injection, context: fork, lifecycle)
[2] Anthropic, “Commands,” code.claude.com, 2026. [Online]. Available: https://code.claude.com/docs/en/commands (as of 2026-06; command and skill merged mechanism)
[3] Cursor, “Rules,” cursor.com, 2026. [Online]. Available: https://cursor.com/docs/context/rules (as of 2026-06; .mdc frontmatter and globs mechanism)
[4] Agent Skills Open Standard, “Agent Skills,” 2026. [Online]. Available: https://agentskills.io (as of 2026-06; open standard for skill packaging shared by Claude Code and other AI tools)
[5] Agentic AI Foundation (Linux Foundation), “AGENTS.md,” 2026. [Online]. Available: https://agents.md/ (as of 2026-06; cross-tool standard for project rules files)

CLAUDE.md and auto-memory: 04-1 CLAUDE.md and memory files.
Path-scoped rules mechanism: 04-2 Rules.
Explicit-trigger command design: 04-3 Commands.
Combining skills with subagents: 04-5 Subagents.
Context engineering and token budgets: 01-4 Context engineering.
