Nous Research's Hermes Agent Builds Skills From Its Own Work

Hermes Agent is an open-source autonomous AI agent built by Nous Research, released in February 2026. It is not a coding copilot tethered to an IDE, nor a chatbot wrapper around a single API. Four months later it has become one of the fastest-moving projects in the space, and it just crossed a threshold that matters for anyone evaluating it seriously.

Hermes Agent shipped its official desktop app on June 2, 2026, bundled as a public preview at v0.15.2 with native builds for macOS, Windows, and Linux. Before you read anything into that: the desktop app is a front-end GUI, not a new model and not a new agent framework. The agent's self-improving core stays the same; Hermes Desktop simply makes it clickable instead of command-line only. That distinction is worth spelling out, because the Hermes naming is layered: Hermes 4 is the language model, Hermes Agent is the framework, and Hermes Desktop is the app. The June release covers only the third of those.

The practical effect of the desktop launch is real, though. The release eliminates the terminal requirement that kept a large slice of potential users on the sidelines. Before this, running Hermes with any graphical interface meant finding one yourself. The community built several impressive options; Nous Research even wrote up their favorites. But all of them were unofficial, third-party wrappers. Hermes Desktop is the first GUI that ships from the same team that builds the agent.

What is actually new about how the agent learns

The desktop launch is the headline, but the more interesting design decision is the closed learning loop that has been there since launch.

After every task, Hermes runs an evaluation step. It assesses whether the outcome succeeded, extracts reusable reasoning patterns, and stores them as skill files in plain Markdown. The next time it encounters a similar task, it pulls the relevant skill instead of reasoning from scratch. Nous Research calls this a "closed learning loop," built on SQLite full-text search and LLM summarization.
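To make the retrieval half of that loop concrete, here is a minimal sketch of a skill index on SQLite full-text search. It is not Hermes's actual schema or code: the table name, the function names, and the OR-of-terms query are all assumptions made for illustration, and it assumes your Python build of SQLite includes the FTS5 extension.

```python
import re
import sqlite3


def open_skill_store(path=":memory:"):
    """Open a skill index backed by an FTS5 virtual table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS skills USING fts5(title, body)"
    )
    return conn


def save_skill(conn, title, markdown_body):
    """Store one skill: a short title plus the Markdown text of the skill file."""
    conn.execute(
        "INSERT INTO skills (title, body) VALUES (?, ?)", (title, markdown_body)
    )
    conn.commit()


def find_skills(conn, task_description, limit=3):
    """Return (title, body) pairs for skills that share words with the task."""
    terms = re.findall(r"\w+", task_description.lower())
    if not terms:
        return []
    # Quote every term so punctuation in the task text is never parsed as
    # FTS5 query syntax, then OR the terms together.
    query = " OR ".join(f'"{term}"' for term in terms)
    return conn.execute(
        "SELECT title, body FROM skills WHERE skills MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

Keyword matching like this only finds skills that share vocabulary with the new task, so a PR-summary skill will not surface for a migration-planning request.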

The performance claim attached to that loop is specific: agents with 20 or more self-created skills complete similar future tasks 40% faster than fresh instances. That 40% refers to token consumption and wall-clock time, not to output quality. TokenMix's independent benchmarks corroborated the figure in April 2026.
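For a sense of scale, here is the arithmetic on a made-up workload. It assumes "40% faster" means a 40% reduction in both tokens and elapsed time, which is one reading of the claim; the baseline numbers are hypothetical and are not taken from any benchmark.

```python
def with_skills(baseline, reduction_pct=40):
    """Apply a percentage reduction to a baseline cost (tokens or seconds)."""
    return baseline * (100 - reduction_pct) / 100


# Hypothetical fresh-instance cost for one repeated task.
baseline_tokens = 50_000
baseline_seconds = 300

print(with_skills(baseline_tokens))   # 30000.0 tokens
print(with_skills(baseline_seconds))  # 180.0 seconds
```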

There is a meaningful caveat, though: this improvement is domain-specific. A skill learned from "summarize a GitHub PR" does not transfer to "plan a database migration." Cross-domain generalization remains an open problem. The closed loop makes Hermes faster at things it has done before, not smarter about things it has never seen.

The May release did the harder infrastructure work

The desktop launch gets attention, but the "Tenacity" release from May 7 is the one that addressed production fragility. Kanban ships as a durable multi-agent board with heartbeat, reclaim, zombie detection, and auto-block on incomplete exit. /goal keeps the agent locked on a target across turns. Checkpoints v2 rewrites state persistence with real pruning. The gateway auto-resumes interrupted sessions after restart.
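The heartbeat, reclaim, and auto-block ideas are easier to see in code. The sketch below shows the general pattern, not Hermes's implementation: the Card type, the status names, the function names, and the 30-second timeout are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

HEARTBEAT_TIMEOUT = 30.0  # seconds; hypothetical value for illustration


@dataclass
class Card:
    task: str
    status: str = "todo"  # one of: todo, running, done, blocked
    owner: Optional[str] = None
    last_heartbeat: float = 0.0


def claim(card: Card, worker: str, now: float) -> bool:
    """A worker takes a card only if nobody else is holding it."""
    if card.status != "todo":
        return False
    card.status, card.owner, card.last_heartbeat = "running", worker, now
    return True


def heartbeat(card: Card, worker: str, now: float) -> None:
    """The owning worker periodically proves it is still alive."""
    if card.status == "running" and card.owner == worker:
        card.last_heartbeat = now


def reclaim_zombies(cards: List[Card], now: float,
                    timeout: float = HEARTBEAT_TIMEOUT) -> List[Card]:
    """Return running cards with a stale heartbeat to the queue."""
    reclaimed = []
    for card in cards:
        if card.status == "running" and now - card.last_heartbeat > timeout:
            card.status, card.owner = "todo", None
            reclaimed.append(card)
    return reclaimed


def finish(card: Card, worker: str, completed: bool) -> None:
    """Auto-block: an exit without completion leaves the card blocked."""
    if card.owner != worker:
        return
    card.status = "done" if completed else "blocked"
    card.owner = None
```

The point of the pattern is that no card stays in "running" forever: a silent worker loses its card to the queue, and a worker that exits early leaves a visible "blocked" marker instead of vanished work.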

That matters because one of the consistent failure modes in autonomous agents is the half-finished task: the agent loses context, or the session drops, and the work simply disappears. Durable state and goal tracking are less glamorous than a skill-learning loop, but they are what separates an agent you can run overnight from one you need to babysit.

The same release closed eight P0 security issues: redaction is now on by default, Discord role-allowlists are guild-scoped, WhatsApp rejects strangers by default, and TOCTOU windows were closed across auth.json and MCP OAuth. That security wave is worth noting in context. As of April 2026, Hermes Agent had zero publicly disclosed agent-specific CVEs. In the same window, OpenClaw - the other dominant open-source agent framework - disclosed nine CVEs across four days in March 2026, including one rated CVSS 9.9. Hermes ships with built-in prompt injection scanning and credential filtering by default.
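As a rough illustration of what default-on redaction means in practice, the sketch below masks secret-shaped strings before text reaches a log or a chat channel. The patterns and the function name are assumptions for illustration; they are not Hermes's actual rules, and a real filter would need a much broader pattern set.

```python
import re

# Hypothetical patterns for common secret shapes; deliberately incomplete.
SECRET_PATTERNS = [
    # Key-shaped strings with an "sk-" prefix.
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
    # HTTP bearer tokens.
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/-]+=*"),
    # key=value or key: value pairs with a sensitive-looking key name.
    re.compile(r"(?i)\b(?:api[_-]?key|token|password)\s*[=:]\s*\S+"),
]


def redact(text, mask="[REDACTED]"):
    """Replace anything that matches a secret pattern with a fixed mask."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(mask, text)
    return text
```

Running every outbound message through a filter like this by default, instead of leaving it as an opt-in setting, is the difference that "redaction is now on by default" describes.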

Security posture being a differentiator is a sign of how early this whole category still is.

Model-agnostic, but with a business model attached

Hermes Agent supports any model you want - Nous Portal, OpenRouter with 200+ models, NVIDIA NIM, OpenAI, or your own endpoint. Switch with hermes model, no code changes, no lock-in.

The no-lock-in claim is real at the infrastructure level, but Nous Research is building a commercial layer on top of it. If you'd rather not collect five separate API keys for the model, web search, image generation, TTS, and a cloud browser, Nous Portal covers all of them under one subscription, with 300+ models and a Tool Gateway routing through your account. Free to self-host, but the convenience path routes through Nous. That's a sustainable model for an MIT-licensed project; it also means the cost calculation for a team depends on which path they take.

Where the hype outpaces the evidence

Hermes Agent is described by some as "self-improving," even likened to AGI, because it can enhance its own capabilities by autonomously writing code to create new skills. That framing is doing a lot of work. The agent writes Markdown skill files from its task history. It does not update its own weights, modify its reward signal, or change the underlying model. "Self-improving" in the sense used here means "accumulates task-specific heuristics in a local file system." That is useful; it is not recursive self-improvement.

The open-source agent from Nous Research accumulated over 180,000 GitHub stars in under four months since its February 25 launch, making it the fastest-growing open-source agent framework of 2026 by Dealroom's count. That growth is real. Whether the skill loop holds up across the kinds of heterogeneous work most teams actually do - mixed codebases, domain-hopping, ambiguous instructions - is something that star counts do not answer.

The honest summary: Hermes Agent is the most thoughtfully architected open-source autonomous agent released this year. The learning loop and durable task state are genuine design advances over stateless alternatives. A teammate like Beagle, surfacing agent outputs into Slack threads where humans can review them, is a natural fit for a loop where the agent self-evaluates but humans still own the decision. The desktop app removes the last big onboarding barrier. But the "grows with you" pitch is most true in narrow, repetitive domains - and any team running it should treat those Markdown skill files the same way they treat an untrusted code dependency: read them before you trust them.
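One way to put "read them before you trust them" into practice is to pin the hash of every skill file a human has reviewed and flag anything new or changed. This is a minimal sketch under assumptions: the directory layout, the approved-hash mapping, and the function names are hypothetical and not part of Hermes.

```python
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of a file's bytes, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def unreviewed_skills(skill_dir, approved):
    """List skill files that are new or have changed since review.

    `approved` maps a file name to the digest recorded when a human
    last read and accepted that file.
    """
    pending = []
    for path in sorted(Path(skill_dir).glob("*.md")):
        if approved.get(path.name) != digest(path):
            pending.append(path.name)
    return pending
```

A check like this could run before the agent starts, the same way a lockfile check runs before a build.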
