AI2026-06-0513 min read

The Complete Guide to Hermes Agent & Hermes Desktop Skills and Tools

19,932 Skill Catalog, 40+ Built-in Tools, and the Use-Case Patterns That Matter (June 2026)

A comprehensive guide to the Skills & Tools system in Nous Research's open-source agent Hermes Agent v0.15.2 (and the Hermes Desktop GUI), grounded in official docs and GitHub releases. Covers Skills (on-demand procedural docs) with the three-level Progressive Disclosure loading scheme starting at ~3k tokens, the SKILL.md format, the skills.sh catalog that exploded from 858 to 19,932 entries in v0.15.1, the standout new skills (openhands, code-wiki, web-pentest), the self-improving loop where the agent creates / patches / edits / deletes its own Skills, and Tools — 40+ built-ins like web_search, x_search, terminal, patch, browser_navigate, vision_analyze, cronjob, memory, delegate_task. Also covers MCP client + server support, the macOS Computer Use background execution that doesn't move the cursor or switch Spaces (5–20ms/event), and the 25+ messenger gateway (Slack / Discord / Telegram / Teams / WhatsApp / LINE / Feishu / WeCom and more). Ends with eight category-specific combination patterns — research, writing, data analysis, coding, customer support, social listening, internal automation, personal work — sized for Japanese enterprise practice.

Hermes Agent Hermes Desktop Nous Research Skills Tools MCP AI Agent Open Source

TL;DR — What Skills and Tools Are in Hermes Agent

Nous Research's open-source Hermes Agent (latest v0.15.2, May 29, 2026) has two extension mechanisms:

- Skills — on-demand procedural docs. Markdown + YAML frontmatter SKILL.md files that the agent loads when relevant, using a three-level Progressive Disclosure scheme (metadata → body → references / templates) to stay light on context
- Tools — executable functions. 40+ built-ins like web_search, x_search, terminal, patch, cronjob, memory, plus any MCP server

The clean one-liner: Skills = what to do, Tools = how to execute it. A Skill calls Tools; Tools touch the outside world.

v0.15.1 (May 29, 2026) expanded the official Skills catalog (skills.sh) from 858 to 19,932 entries — not just a catalog refresh, but a signal that the agent's self-improving procedural-memory loop is now operating at scale.

Prereqs: this column builds on Hermes Desktop and Hermes Agent × X Premium / Grok integration. The basic Skills & Tools concepts are pre-loaded there.

The Skills System

Skills vs Tools

Official framing: Skills are "on-demand procedural documents" (features/skills).

Dimension	Skills	Tools
Role	Playbook / procedure doc	Executable function
Format	Markdown + YAML	function-calling spec
Invocation	Slash command or auto-load	LLM function call
Content	What to do, how, pitfalls	What to execute
Authoring	User can write Markdown	Code or MCP

Progressive Disclosure (Three-Level Loading)

The defining trait: token-conserving three-level load, per the agentskills.io open standard.

Level	Function	Content	Approx. tokens
0	`skills_list()`	metadata index of all skills	~3k
1	`skill_view(name)`	the `SKILL.md` body itself	depends
2	`skill_view(name, path)`	files under `references/` / `templates/` / `scripts/` / `assets/`	on demand

Even with 10,000+ skills, initial context stays at ~3k tokens. Body content is pulled only when needed.

The SKILL.md File Format

Storage: ~/.hermes/skills/. Primary file: SKILL.md — YAML frontmatter plus Markdown body. Recommended sections:

- When to Use
- Procedure
- Pitfalls
- Verification

platforms: [macos, linux, windows] in frontmatter scopes a skill to specific OSes. Subdirectories carry references/, templates/, scripts/, and assets/.

The Self-Improvement Loop

Distinctive feature: the agent can create, patch, edit, and delete its own Skills via the skill_manage tool. Triggers:

- A complex task completes with 5+ tool calls succeeding
- A working solution emerges after error correction
- A non-trivial workflow is discovered

This creates a "smarter the more you use it" procedural-memory loop, separate from conversational (episodic) memory. The practical implication: business know-how gets codified inside the agent.

The Official Catalog — 858 → 19,932

v0.15.1's headline change: skills.sh (Vercel-hosted public catalog) jumped from 858 to 19,932 entries (GitHub Release v0.15.1).

Three-tier trust model:

- built-in — shipped with Hermes, highest trust
- official — PR-reviewed by Nous Research
- community — community-submitted (use at your own risk)

Notable Skills added in v0.15 (RELEASE_v0.15.0.md):

- openhands — OpenHands-integrated dev workflow
- code-wiki — codebase analysis → internal wiki generation
- web-pentest — web pen-test procedures

Third-party observation: "a typical install ends up with about 28 tools / 89 skills" (blakecrosley.com). The 19,932 figure is the opt-in catalog total, not what a fresh install ships with.

Installing and Running Skills

hermes skills browse                     # list catalog
hermes skills search kubernetes          # keyword search
hermes skills install openai/skills/k8s  # individual install
/k8s deploy the staging manifest         # invoke via slash command

v0.15 also added Skill Bundles — install a related group at once (e.g., "DevOps Bundle" brings k8s / terraform / aws / ci-cd in one shot).

Distribution Channels

- Official repo — Nous-reviewed (official tier)
- skills.sh — Vercel-hosted public catalog including community submissions
- /.well-known/skills/index.json — any domain can self-host
- GitHub owner/repo/path direct install
- Custom taps — your own organization's private registry

Install-time security scans (data-leak, prompt-injection, destructive-command detection) run automatically. Part of v0.15's Promptware Defense.

The Tools System

Built-in Tools — Complete List

Per the official categorization (features/tools). The README says "40+", third parties report ~47 tools across ~20 toolsets.

Category	Tools	What they do
Web	`web_search`, `web_extract`	Search and page extraction
X / Twitter	`x_search`	xAI-routed X post search (opt-in)
Terminal & Files	`terminal`, `process`, `read_file`, `patch`	Shell, processes, file IO, diff edits
Browser	`browser_navigate`, `browser_snapshot`, `browser_vision`	Browser automation, text + vision
Media	`vision_analyze`, `image_generate`, `text_to_speech`	Multimodal
Orchestration	`todo`, `clarify`, `execute_code`, `delegate_task`	Planning, sandboxed code, sub-agents
Memory	`memory`, `session_search`	Persistent memory and past-session search (4,500× faster in v0.15)
Automation	`cronjob`, `send_message`	Natural-language scheduling, outbound messages
Integrations	`ha_` (Home Assistant), MCP tools, `rl_`	External systems

Toolsets

Tools are enabled in toolsets (categorical bundles). Common ones: web, search, terminal, file, browser, vision, image_gen, skills, tts, todo, memory, cronjob, code_execution, delegation, homeassistant, messaging, discord, debugging, rl, plus platform presets like hermes-cli / hermes-telegram.

MCP — Client and Server Both

Hermes Agent supports the Model Context Protocol (MCP) in both directions.

As a client: stdio and HTTP transports. Register servers in ~/.hermes/config.yaml under mcp_servers. v0.15 added a Nous-curated MCP catalog — hermes mcp catalog browses, hermes mcp install <name> installs (PR-reviewed only).

As a server: hermes mcp serve exposes Hermes itself as an MCP server, so Claude Code and Cursor can call Hermes's Tools / Skills / Memory. Becoming an MCP supplier isn't something competitors offer — it's a real differentiator.

`x_search` (xAI Grok Integration)

First-class in v0.14. With SuperGrok OAuth, you can run grok-4.3 (1M context) directly. Search X via natural language, with results flowing into the agent's reasoning. Details in Hermes Agent × X Premium / Grok integration.

macOS Computer Use — The Background-Execution Difference

Hermes's Computer Use (features/computer-use) has three distinct edges over Anthropic's and OpenAI's:

1. macOS-only, via Apple's SkyLight private SPI — driver is cua-driver (MCP over stdio)
2. Model-agnostic — Claude / GPT / Gemini / local vLLM. Anthropic's Computer Use is Claude-only; OpenAI's Codex Computer Use is GPT-only
3. Background execution — the cursor doesn't move, Spaces don't switch. You can keep working in the foreground while the agent operates GUI in the background

Supports click / type / scroll / drag / key / app-focus, at 5–20ms/event (a bit slower than foreground). Requires Accessibility + Screen Recording permissions. curl | bash and sudo rm -rf / are hard-blocked, password auto-fill is disallowed. Install with hermes computer-use install.

Compared to OpenAI Codex Computer Use on Windows which is foreground-only, Hermes is "runs in the background while you keep working" — a real value on Macs.

Messenger / Platform Gateway (25+)

Hermes's Gateway abstraction connects 25+ messaging platforms to a single agent (messaging).

General: Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Microsoft Teams, Matrix, Mattermost, Google Chat, LINE, ntfy, BlueBubbles (iMessage), Home Assistant
CN / Greater China: DingTalk, Feishu / Lark, WeCom, WeChat, QQ Bot, Yuanbao
Universal: Webhooks, API Server, Browser

Discord / Slack / Feishu / Matrix are richest (voice / images / files / threads / reactions / typing / streaming). SMS and ntfy are minimal (text only). Setup: hermes gateway setup wizard.

From a Japan-business angle, full LINE and Microsoft Teams integration is the practical hook.

Webhook / API Server

Webhook and API Server both support the full tool surface including terminal (permission tiers configurable). hermes proxy exposes an OpenAI-compatible HTTP endpoint, so existing OpenAI SDK / LangChain / LlamaIndex clients can call Hermes (v0.14).

Coding Tools

For coding workflows, combine:

- terminal — shell execution, six sandboxes (local / Docker / SSH / Singularity / Modal / Daytona)
- read_file, patch — file IO; patch does diff-format edits
- execute_code — Python sandbox
- LSP integration — automatic semantic diagnostics post-write (v0.14)
- ACP (Agent Client Protocol) — call Hermes from VS Code / Zed / JetBrains

Version Timeline for Skills/Tools

Version	Date	Skills/Tools changes
v0.13.0	May 7, 2026	Kanban multi-agent, `video_analyze`, xAI Custom Voices TTS, Google Chat added (20-platform mark)
v0.14.0	May 16, 2026	`x_search` first-class, SuperGrok OAuth (grok-4.3 1M), Teams integration, `hermes proxy`, 22 platforms (LINE / SimpleX added), Native Windows beta, computer_use opened to non-Anthropic providers
v0.15.0	May 28, 2026	"Velocity Release". `run_agent.py` shrunk 76%, `session_search` 4,500× faster, `openhands` / `code-wiki` / `web-pentest` added, Skill Bundles, Kanban swarm, Promptware Defense, Krea 2 images
v0.15.1	May 29, 2026	skills.sh: 858 → 19,932 entries, dashboard fixes
v0.15.2	May 29, 2026	Hermes Desktop packaging fix

Competitive Comparison

Dimension	Hermes Agent	Claude Code	Cursor	OpenAI Codex CLI	Anthropic Computer Use
Skills concept	agentskills.io-compliant, auto-generated	Native Skills (similar)	MCP-centric	Plugins	None
Tools	40+ built-in + MCP	Limited + MCP	MCP	Plugins	computer_use only
MCP	client + server	client only	client only	client only	–
Computer Use	Any model + background	Claude-only	–	–	Claude-only (foreground)
Messenger integrations	25+	None	None	None	None
License	MIT OSS	Commercial	Commercial	Commercial	Commercial

Five differentiators: multi-model Computer Use, 25+ messenger gateway, MCP host, persistent self-learning Skills, MIT OSS.

Eight Use-Case Patterns

1. Research and Info Gathering

Tools: web_search + web_extract + x_search + vision_analyze (PDFs / charts) + session_search
Skills: code-wiki to publish research to your internal wiki
Gateway: Slack to receive research requests from team

2. Writing and Editing

Tools: patch (Markdown edits) + text_to_speech (reading drafts back) + web_search (fact check)
Skills: Custom house style-guide skill in ~/.hermes/skills/in-house-style/SKILL.md
Gateway: Slack / Teams for draft sharing

3. Data Analysis

Tools: execute_code (Python sandbox) + read_file + MCP database server + vision_analyze (chart understanding)
Skills: pandas / matplotlib pattern skills
Automation: cronjob for recurring reports → send_message for delivery

4. Coding and Development

Tools: terminal + patch + execute_code + ACP (VS Code / Zed / JetBrains) + MCP (GitHub)
Skills: openhands, code-wiki, web-pentest
Parallelism: delegate_task for sub-agents, Kanban swarm for parallel PRs
Gateway: GitHub webhook for PR-time auto-review

5. Customer Support

Gateway: Slack / Teams / LINE / WhatsApp / Email channels consolidated onto one agent
Tools: memory (customer profile) + MCP (CRM) + web_search (FAQ)
Skills: Product FAQ skill, escalation procedure skill

For Japan, registering the LINE official account and Microsoft Teams to the Gateway is the highest-leverage combo.

6. Social Listening

Tools: x_search (grok-4.3 1M context for cluster analysis) + web_search + vision_analyze
Automation: cronjob for daily summaries → send_message to Slack
Skills: Custom brand-monitoring skill

7. Internal Automation

Tools: cronjob (natural-language scheduling) + ha_* (Home Assistant) + webhook triggers + terminal (batch)
Skills: web-pentest for internal vuln scans + custom procedure skills

8. Personal Work

Tools: macOS Computer Use (background GUI ops) + MCP (Google Calendar / Gmail / Notion) + voice mode + text_to_speech
Skills: Personal routine skills (morning Slack triage, daily task organization, etc.)

Example: voice command to organize today's meeting notes into Notion and share via Teams. Hermes drives Notion through Computer Use in the background while you keep working in the foreground; Teams posting goes through MCP.

What Isn't Officially Documented

- Full Skills catalog of 19,932 entries — Skills Hub is dynamically loaded; not statically retrievable
- Per-skill / per-tool benchmarks — only aggregated wins are published (e.g., session_search 4,500× in v0.15.0)
- Enterprise SLA / commercial support contracts — Nous Research is OSS-focused; commercial support specifics are limited
- Japanese UI / official Japanese docs — English-first at the moment

FAQ

Q1. How do Skills and Tools differ?
A. Tools execute (e.g., terminal runs ls). Skills are the playbook around them (e.g., "for production deploys, stage first and verify before promoting..."). Skill calls Tools; Tools touch the world.

Q2. Are there really 19,932 usable skills?
A. There are 19,932 entries in the catalog — i.e., available to install. A fresh install ships with dozens. You opt in via hermes skills install (individually or in bundles).

Q3. Can anyone author a Skill?
A. Yes — Markdown + YAML frontmatter. No coding required. Codifying internal procedures as Skills is the recommended path.

Q4. Is Computer Use safe?
A. curl | bash and sudo rm -rf / are hard-blocked; password auto-fill is disallowed. But the required permissions (Accessibility + Screen Recording) are strong, so scope which devices can run it via internal policy.

Q5. What's the point of running Hermes as an MCP server?
A. Other agents like Claude Code and Cursor can call Hermes's Skills / Tools / Memory. If your team accumulates organizational knowledge in Hermes Skills, developers can reach it from whichever editor or agent they prefer.

Q6. Commercial use?
A. Fully permitted under MIT. Internal tools, SaaS embedding, redistribution — all OK.

Q7. Is it strong on Japanese workflows?
A. Skills/Tools are language-agnostic; the backend model (Claude / GPT / Gemini / Grok / local) decides language quality. The LINE / Teams / Feishu gateways are already first-class, which makes it fit Japanese business surfaces well.

Q8. How do I leverage the self-improvement loop?
A. As an engineer uses Hermes, their procedural knowledge accumulates as Skills. Publishing those to a team-shared tap (internal Skills registry) lets new hires inherit veterans' workflows. It's a concrete answer to know-how silos.

How Oflight Approaches This

Through our AI consulting practice, we organize Hermes Agent Skills & Tools into three layers:

1. Base layer — built-in Tools (web_search / terminal / patch / cronjob etc.) plus official-tier Skills
2. Extension layer — connect internal systems (CRM / ERP / DWH) through MCP; expose Hermes as an MCP server so other agents can share the knowledge layer
3. Learning layer — codify procedures as Skills; share the self-improving procedural memory via an internal tap, replacing implicit institutional knowledge

Often paired with FDE-style on-site enablement.

Bottom Line

Hermes Agent's Skills & Tools combination — clean separation of procedure (Skills) from execution (Tools), Progressive Disclosure loading, a 19,932-entry catalog, agent-driven self-improvement of Skills, MCP both as client and server, 25+ messenger gateway, and Mac-only background Computer Use — adds up to the open-source personal agent that gets smarter the more you use it.

For Japanese enterprises, the practical hooks are: (1) MIT license keeps legal risk low, (2) LINE / Teams / Feishu / WeCom gateways already cover Asian messaging, (3) the self-improvement loop turns implicit veteran know-how into shareable Skills, (4) MCP server mode interoperates cleanly with Claude Code, Cursor, and other agents already in the stack. Start from the eight use-case patterns above, run a 1–2 month PoC, and accumulate internal Skills as you go.

References

Primary:
- Hermes Agent site
- Hermes Desktop
- features/overview
- features/skills
- features/tools
- features/mcp
- features/computer-use
- messaging gateway docs
- getting-started/quickstart
- agentskills.io (open standard)

GitHub:
- NousResearch/hermes-agent
- Releases
- RELEASE_v0.15.0.md

Third-party:
- MarkTechPost — Hermes Desktop v0.15.2
- techsy.io — Hermes Agent v0.15
- blakecrosley — Hermes guide
- DataCamp — Hermes Agent tutorial
- @NousResearch X announcement

Related:
- Hermes Desktop (Nous Research)
- Hermes Agent × X Premium / Grok integration
- Claude Code Agent View
- Cursor Automations
- OpenAI Codex Computer Use on Windows
- Forward Deployed Engineer (FDE)

Note: the full Skills catalog (19,932 entries), per-skill/per-tool benchmarks, enterprise SLA / commercial support details, and an official Japanese UI are not all explicitly documented as of June 5, 2026. Re-verify on the live Skills Hub and current docs before production decisions.

Feel free to contact us