Hermes AI Agent: Complete Setup Guide 2026 (Install, Configure & Deploy)
AI Infrastructure Lead

Key Takeaways
- Completely free — MIT-licensed, open-source. You only pay for LLM API calls (or nothing with Ollama).
- Self-improving — Hermes creates skills from experience and gets better at repeated tasks automatically.
- Works with any model — OpenAI, Anthropic, Google, Ollama, OpenRouter, Nous Portal, or any custom endpoint.
- Multi-platform — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Email, plus voice mode.
- 19K GitHub stars — one of the fastest-growing open-source AI agent projects of 2026.
Table of Contents
- What is Hermes AI Agent?
- Why Hermes Agent Matters in 2026
- Step 1: Installation
- Step 2: Configure Your LLM Provider
- Step 3: Enable Tools & MCP Servers
- Step 4: Connect Messaging Platforms
- Step 5: Understanding Skills & Memory
- Advanced: Deployment Options
- Advanced: Voice Mode & Scheduling
- Hermes vs Claude Code vs OpenClaw
- Real-World Use Cases
- Submit Your AI Tool
- FAQ
What is Hermes AI Agent?
Hermes Agent is an open-source AI agent built by Nous Research that does something most AI tools don't — it learns from its own experience and gets better over time. Released on February 25, 2026, it has already racked up over 19,000 GitHub stars, making it one of the fastest-growing agent frameworks this year.
We've been running Hermes across multiple projects for the past month, and the pitch is real. Unlike conventional AI assistants that start from scratch every session, Hermes builds a library of "skills" as it works. Complete a complex deployment task once, and the next time you ask for something similar, it loads that skill and executes it faster with fewer mistakes. That's not marketing — it's how the system actually functions under the hood.
The project sits at an interesting intersection: it's free (MIT license), works with any LLM provider you already pay for, supports 40+ built-in tools, connects to 7+ messaging platforms, and deploys anywhere from your laptop to a $5 VPS to serverless infrastructure. Nous Research — the team behind some of the most capable open-source language models — built this to be the agent layer that ties everything together.
This guide covers everything you need to go from zero to a fully configured Hermes Agent: installation, LLM setup, tool configuration, MCP integration, messaging platform connections, deployment options, and the advanced features that make this framework stand out. Whether you want a personal AI assistant on Telegram, an autonomous DevOps agent, or a research tool that improves with every query — this is how you build it.
Why Hermes Agent Matters in 2026
The AI agent space has exploded this year. We've reviewed Cursor, Devin, Claude Code, OpenClaw, and dozens of others. Most of them are excellent at what they do. But they all share a common limitation: they don't retain procedural knowledge between sessions. Every conversation starts cold.
Hermes approaches this differently with its built-in learning loop. Here's how it works in practice:
1. You give it a task, for example: "Set up an Nginx reverse proxy with SSL for my Next.js app on this VPS."
2. It uses its 40+ tools to SSH in, install packages, configure files, test the setup, and fix errors — usually 5-15 tool calls.
3. After completing the task, Hermes writes a structured markdown skill file: the exact procedure, common pitfalls it encountered, and the verification steps it ran.
4. Ask for a similar deployment next week and Hermes retrieves the relevant skill, executes the procedure directly, and skips the trial-and-error. It gets measurably faster and more reliable over time.
Beyond the learning loop, Hermes packs a serious feature set. Persistent memory across sessions using FTS5 full-text search and LLM-powered summarization. Subagent delegation so it can parallelize complex tasks. Native MCP server support for connecting to virtually any external tool or API. And it runs on any model — from a local Ollama instance on your laptop to Claude or GPT via API.
The real draw here is flexibility. You're not locked into one provider, one deployment method, or one interface. Hermes is the scaffolding — you choose what goes inside it.
Step 1: Installation
Getting Hermes installed takes under five minutes. The only prerequisite is git — the install script handles everything else. It works on Linux, macOS, and Windows via WSL2.
Run the one-liner install:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
After installation completes, reload your shell:
# For bash users
source ~/.bashrc
# For zsh users
source ~/.zshrc
Verify the installation:
hermes --version
Now run the setup wizard. This walks you through model selection, tool configuration, and basic preferences in an interactive terminal UI:
hermes setup
Windows users: You'll need WSL2. Install it from the Microsoft Store if you haven't already, then run the install script inside your WSL terminal. Everything works identically once you're in the Linux environment.
Once setup completes, start the interactive CLI by typing:
hermes
You're now in a conversational session with your agent. But before you start prompting, let's configure the LLM provider — the brain behind every response.
Step 2: Configure Your LLM Provider
Hermes doesn't ship with its own model — it's a framework that works with whatever LLM you prefer. This is one of its strongest selling points. You're not locked into a single provider or paying a second subscription on top of API costs you already have.
Run the model configuration command:
hermes model
You'll see the supported providers: OpenAI, Anthropic, Google, Ollama, OpenRouter, Nous Portal, and custom endpoints.
For most users, we recommend starting with OpenRouter — it gives you access to nearly every major model through a single API key, so you can experiment without committing to one provider. If privacy is your priority, Ollama keeps everything local.
Example: configuring OpenRouter with a specific model:
# Select OpenRouter as your provider
hermes model --provider openrouter
# Set your API key
hermes model --api-key sk-or-v1-your-key-here
# Choose a model
hermes model --model anthropic/claude-sonnet-4
Or for a fully local setup with Ollama:
# Make sure Ollama is running
ollama serve
# Pull a capable model
ollama pull llama3.3:70b
# Point Hermes at your local instance
hermes model --provider ollama --model llama3.3:70b
You can switch models mid-session using the /model slash command inside the Hermes CLI. This is useful when you want a cheaper model for simple queries and a frontier model for complex reasoning tasks.
Step 3: Enable Tools & MCP Servers
Hermes ships with 40+ built-in tools — file operations, shell commands, web browsing, code execution, image processing, and more. You control which tools the agent can access:
# View all available tools
hermes tools
# Enable specific tools
hermes tools --enable web_browse,shell_exec,file_write
# Disable tools you don't want the agent using
hermes tools --disable dangerous_tool_name
Inside an active session, use /tools to see what's currently enabled and toggle them on the fly.
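Conceptually, the enable/disable model is just a registry of named tools with on/off flags. Here is a minimal sketch in Python; the `ToolRegistry` class and its methods are illustrative assumptions, not Hermes' actual internals:

```python
# Hypothetical sketch of an enable/disable tool registry, not Hermes' actual code.

class ToolRegistry:
    def __init__(self, tools):
        # All known tools start enabled, mirroring a default-on configuration.
        self._enabled = {name: True for name in tools}

    def set_enabled(self, names, enabled):
        for name in names:
            if name not in self._enabled:
                raise KeyError(f"unknown tool: {name}")
            self._enabled[name] = enabled

    def enabled_tools(self):
        # Only enabled tools get exposed to the agent's planner.
        return sorted(n for n, on in self._enabled.items() if on)

registry = ToolRegistry(["web_browse", "shell_exec", "file_write", "image_edit"])
registry.set_enabled(["image_edit"], False)
print(registry.enabled_tools())  # ['file_write', 'shell_exec', 'web_browse']
```

Whatever the real implementation looks like, the point is the same: the agent only ever sees the tools you have left switched on.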
But the built-in tools are just the beginning. The real power comes from MCP (Model Context Protocol) server support. MCP is rapidly becoming the standard way to connect AI agents to external services, and Hermes supports it natively.
To connect an MCP server, add the configuration to your Hermes config file. Here's an example connecting a GitHub MCP server so your agent can manage repositories, issues, and pull requests:
# In your Hermes MCP config (~/.hermes/mcp.json)
{
"servers": [
{
"name": "github",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
}
},
{
"name": "filesystem",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
}
]
}
Once configured, Hermes can interact with these servers transparently. Ask it to "create a GitHub issue for the login bug" and it uses the GitHub MCP tools automatically. Ask it to "organize the files in my projects folder" and it uses the filesystem server.
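The `mcp.json` shape above maps directly to subprocess launch commands: each entry becomes `<command> <args...>` with its `env` block merged into the environment. A small illustrative loader (field names taken from the example config; the loader itself is an assumption, not Hermes' implementation):

```python
import json

# Illustrative loader for the mcp.json shape shown above -- not Hermes' actual code.
def launch_commands(config_text):
    config = json.loads(config_text)
    commands = {}
    for server in config["servers"]:
        # Each MCP server is launched as <command> <args...> with its env merged in.
        commands[server["name"]] = {
            "argv": [server["command"], *server.get("args", [])],
            "env": server.get("env", {}),
        }
    return commands

example = """{"servers": [{"name": "github", "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-github"],
  "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_example"}}]}"""
cmds = launch_commands(example)
print(cmds["github"]["argv"])  # ['npx', '-y', '@modelcontextprotocol/server-github']
```

This also explains why `npx -y` shows up so often in MCP configs: the client just executes whatever command line you declare, and `npx` fetches the server package on first launch.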
Some particularly useful MCP servers to consider:
- GitHub — repo management, issues, PRs, code search
- PostgreSQL / SQLite — direct database queries and management
- Brave Search — web search without a browser
- Filesystem — sandboxed file access to specific directories
- Slack — read channels, send messages, manage threads
The community at agentskills.io hosts hundreds of pre-built MCP configs and Hermes skills that you can import directly. If you use Perplexity Pro for research, there's even an MCP server that lets Hermes query it programmatically — giving your agent access to real-time web intelligence without leaving the terminal.
Step 4: Connect Messaging Platforms
This is where Hermes diverges sharply from coding-focused tools like Claude Code or Cursor. Hermes isn't just a terminal agent — it can be your AI on Telegram, Discord, Slack, WhatsApp, Signal, and Email. Same agent, same skills, same memory — accessible from wherever you already communicate.
Set up a messaging gateway:
# Launch the gateway setup wizard
hermes gateway setup
The wizard walks you through connecting each platform. Here's what each requires:
Telegram: Create a bot via @BotFather, get your bot token, paste it into Hermes. Your agent is now reachable as a Telegram bot that anyone (or just you) can message.
# Connect Telegram
hermes gateway setup --platform telegram --token YOUR_BOT_TOKEN
# Restrict to specific user IDs (recommended)
hermes gateway setup --platform telegram --token YOUR_BOT_TOKEN --allowed-users 123456789
Discord: Create a Discord application at discord.com/developers, generate a bot token, invite it to your server. Hermes responds to direct messages and channel mentions.
# Connect Discord
hermes gateway setup --platform discord --token YOUR_DISCORD_BOT_TOKEN
Slack: Create a Slack app at api.slack.com/apps, configure OAuth scopes, install to your workspace. The agent can respond in channels and DMs.
# Connect Slack
hermes gateway setup --platform slack --bot-token xoxb-your-token --app-token xapp-your-token
The key insight here: it's the same agent instance behind all these platforms. Skills learned via CLI are available when you message through Telegram. Memory from a Discord conversation persists when you switch to Slack. This is what makes Hermes practical for teams — your AI assistant lives wherever your team already works.
For teams managing projects across tools, pairing Hermes with something like Taskade for project management creates a workflow where your agent handles execution while your project board tracks the big picture.
Step 5: Understanding Skills & Memory
We touched on the learning loop earlier, but it deserves a deeper look because this is genuinely what sets Hermes apart from every other agent framework we've tested.
Skills are structured markdown files that Hermes creates automatically after complex tasks. A skill contains:
- Procedure: Step-by-step instructions the agent followed to complete the task
- Pitfalls: Errors encountered and how they were resolved
- Verification: How the agent confirmed the task was successful
- Context: When this skill should be loaded (pattern matching on user requests)
Here's a concrete example. After we asked Hermes to set up a Convex backend with authentication, it created a skill that looked roughly like this:
# Skill: Convex Backend Setup with Auth
## Context: convex setup, convex auth, convex backend
## Procedure
1. Initialize Convex project: npx convex dev --once
2. Define schema in convex/schema.ts
3. Install auth provider: npm install @auth/core @auth/convex-adapter
4. Configure auth in convex/auth.config.ts
5. Add HTTP routes for OAuth callbacks
6. Test with: npx convex dashboard
## Pitfalls
- Must run `npx convex dev` before any other convex command
- Auth config requires SITE_URL env var in production
- OAuth redirect URIs must match exactly (trailing slash matters)
## Verification
- Run `npx convex dashboard` and confirm tables exist
- Test login flow end-to-end in browser
- Verify session persistence after page refresh
The next time we asked Hermes anything involving Convex + auth, it loaded this skill and executed the procedure directly. No fumbling, no re-discovering the same pitfalls. The improvement is noticeable after even a few days of use.
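Retrieval of this kind can be approximated by matching a request against each skill's Context keywords. A toy sketch under that assumption (the scoring function is hypothetical; Hermes' actual matching logic is not documented here):

```python
# Toy skill retrieval: match a user request against each skill's Context phrases.
# The skill names and phrases mirror the markdown example above; the matcher is hypothetical.

skills = {
    "Convex Backend Setup with Auth": ["convex setup", "convex auth", "convex backend"],
    "Nginx Reverse Proxy with SSL": ["nginx proxy", "reverse proxy", "ssl setup"],
}

def match_skills(request, skills):
    request = request.lower()
    scored = []
    for name, phrases in skills.items():
        # Score by how many context phrases have all their words in the request.
        hits = sum(all(word in request for word in phrase.split()) for phrase in phrases)
        if hits:
            scored.append((hits, name))
    return [name for _, name in sorted(scored, reverse=True)]

print(match_skills("help me add auth to my convex backend", skills))
# ['Convex Backend Setup with Auth']
```

Even this crude version shows the payoff: the moment a request overlaps a skill's context, the agent can load a proven procedure instead of rediscovering it.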
Memory works alongside skills but serves a different purpose. While skills capture procedural knowledge ("how to do X"), memory captures facts and context ("user prefers TypeScript", "production server is at 192.168.1.50", "project uses pnpm not npm").
Hermes uses a dual-layer memory system:
FTS5 Full-Text Search
Every conversation turn is indexed in a SQLite FTS5 database. When you ask a question, Hermes searches past conversations for relevant context and injects it into the prompt. Fast, accurate, and works offline.
LLM Summarization Layer
Periodically, the agent summarizes conversation history into compressed memory entries. This preserves important facts while keeping context windows manageable. Think of it as the agent's long-term memory being distilled from short-term conversations.
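The FTS5 layer is easy to picture with Python's built-in `sqlite3` module (assuming your SQLite build ships the FTS5 extension, as most do). The schema below is illustrative, not Hermes' actual one:

```python
import sqlite3

# Minimal FTS5 conversation index -- illustrative schema, not Hermes' actual one.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE turns USING fts5(role, content)")
db.executemany("INSERT INTO turns VALUES (?, ?)", [
    ("user", "production server is at 192.168.1.50"),
    ("user", "the project uses pnpm not npm"),
    ("assistant", "deployed the Next.js app behind nginx"),
])
# MATCH runs a full-text query; bm25() ranks the most relevant turns first.
rows = db.execute(
    "SELECT content FROM turns WHERE turns MATCH ? ORDER BY bm25(turns)",
    ("pnpm",),
).fetchall()
print(rows[0][0])  # the turn mentioning pnpm
```

The top-ranked rows are what gets injected into the prompt as context, which is why this layer stays fast and fully offline.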
You can save important context manually with the /save command, and resume previous sessions with hermes --continue. The community skill repository at agentskills.io also lets you share skills — if someone has already built a skill for AWS Lambda deployment, you can import it and skip the learning phase entirely.
Advanced: Deployment Options
Hermes supports six deployment backends, and choosing the right one depends on your use case.
Our recommendation for most users: Start local, then move to a $5/month VPS when you want the agent running 24/7. A basic DigitalOcean or Hetzner droplet is more than enough to run Hermes with messaging gateways.
Here's a practical Docker deployment:
# Pull and run the Hermes Docker image
docker run -d \
--name hermes-agent \
-v ~/.hermes:/root/.hermes \
-e OPENROUTER_API_KEY=sk-or-v1-your-key \
--restart unless-stopped \
nousresearch/hermes-agent:latest
And an SSH/VPS deployment with systemd for automatic restarts:
# On your VPS: install Hermes
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc
# Configure your model and gateways
hermes setup
# Create a systemd service for always-on operation
sudo tee /etc/systemd/system/hermes.service > /dev/null << 'EOF'
[Unit]
Description=Hermes AI Agent
After=network.target
[Service]
Type=simple
User=hermes
ExecStart=/usr/local/bin/hermes --daemon
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Enable and start
sudo systemctl enable hermes
sudo systemctl start hermes
With this setup, your Hermes agent runs 24/7, responds to messages on all connected platforms, retains memory across restarts (stored in ~/.hermes/), and automatically restarts if it crashes.
Advanced: Voice Mode & Scheduling
Two features that deserve separate mention because they push Hermes beyond what most agent frameworks offer.
Voice Mode enables real-time spoken interaction with your agent. Think of it as a voice-first AI assistant that has all of Hermes' tools, skills, and memory at its disposal.
# Start Hermes in voice mode
hermes --voice
# Or enable voice within an existing session
/voice on
Voice mode uses your system microphone for input and speakers for output. It works well for hands-free coding sessions, brainstorming, or when you're away from your keyboard but want to give your agent instructions. The latency is reasonable — not instant, but usable for real conversations.
Cron Scheduling lets you set up recurring tasks that the agent executes automatically. This is built into Hermes directly — no external cron daemon needed.
# Schedule a daily backup reminder at 9 AM
hermes schedule --cron "0 9 * * *" --task "Check disk usage on production server and alert me if above 80%"
# Schedule weekly research
hermes schedule --cron "0 10 * * 1" --task "Search for new AI tools released this week and summarize the top 5"
# List active schedules
hermes schedule --list
Combined with messaging gateways, this means you can have Hermes run a scheduled task at 6 AM and send you the results on Telegram before you even wake up. We use this pattern for automated monitoring, content research, and daily briefings.
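Under the hood, firing a schedule is just matching the five cron fields (minute, hour, day of month, month, day of week) against the current time. A minimal matcher that supports only `*` and plain numbers (real cron syntax also allows ranges, lists, and steps; this sketch is not Hermes' scheduler):

```python
from datetime import datetime

# Minimal 5-field cron matcher: minute hour day-of-month month day-of-week.
# Supports only "*" and plain numbers; real cron adds ranges, lists, and steps.
def cron_matches(expr, when):
    fields = expr.split()
    # cron day-of-week uses Sunday = 0; Python's weekday() uses Monday = 0.
    actual = [when.minute, when.hour, when.day, when.month, (when.weekday() + 1) % 7]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))

# "0 9 * * *" fires daily at 09:00; March 2, 2026 is a Monday.
print(cron_matches("0 9 * * *", datetime(2026, 3, 2, 9, 0)))    # True
print(cron_matches("0 10 * * 1", datetime(2026, 3, 2, 10, 0)))  # True (Monday)
print(cron_matches("0 10 * * 1", datetime(2026, 3, 3, 10, 0)))  # False (Tuesday)
```

A scheduler loop just evaluates this check once per minute and hands any matching task to the agent, which is why no external cron daemon is needed.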
Hermes Agent vs Claude Code vs OpenClaw
We get asked this constantly: "Should I use Hermes, Claude Code, or OpenClaw?" The honest answer is they serve different purposes, and many developers (ourselves included) use more than one. Here's how they compare:
Choose Hermes if you want a general-purpose agent that works across messaging platforms, learns from experience, and runs on any model. It's the Swiss Army knife approach.
Choose Claude Code if you want the most polished coding experience with deep integration into development workflows. It's laser-focused on writing, testing, and deploying code. Read our best AI coding assistant comparison for the full picture.
Choose OpenClaw if you want an open-source coding agent with ClawHub marketplace and strong community support, particularly if you prefer Apache licensing over MIT.
Real-World Use Cases
Theory is nice, but here are four concrete ways we've seen Hermes used effectively:
1. Personal DevOps Agent on Telegram
Deploy Hermes on a $5 VPS, connect it to Telegram, and give it SSH access to your servers via MCP. Now you can message your bot "check if the API server is healthy" or "restart the Next.js service" from your phone. Hermes runs the commands, sends you the output, and remembers your server configuration between sessions. Over time, it builds skills for your specific infrastructure and handles routine ops tasks faster.
2. Automated Research & Briefing System
Use Hermes' cron scheduling to run daily research tasks. Schedule it to search for news in your industry every morning at 7 AM, summarize the findings, and send a briefing to your Slack channel. Connect a web search MCP server and the agent does the heavy lifting. The skill system means the briefings get better formatted and more relevant over time as Hermes learns what you actually care about.
3. Multi-Platform Customer Support Bot
Connect Hermes to Discord, Slack, and Email simultaneously. Load it with your product documentation via MCP filesystem access. Now you have a support agent that answers questions across platforms with consistent knowledge. The SOUL.md personality file lets you define its tone — professional for email, casual for Discord. Skills accumulate as it handles repeated questions, improving response quality automatically.
4. Local-First AI Assistant with Ollama
For privacy-conscious users or teams with data restrictions: run Hermes with Ollama and a local model. No data leaves your machine. Pair it with the filesystem MCP server for document analysis, the SQLite server for local databases, and use it as a completely private AI assistant that still benefits from the skill learning system. Perfect for legal, healthcare, or government environments where cloud AI is restricted.
The common thread across all these use cases: Hermes is infrastructure you own. It doesn't phone home, it doesn't lock you in, and it gets better the more you use it. That's a compelling proposition in a market where most AI tools are subscription services that reset every session.
Discover More AI Tools
Know an AI tool the world should see?
We review and list the best AI tools every week. Submit yours for a free listing on PopularAiTools.ai.
Submit an AI Tool
Frequently Asked Questions
Hermes Agent is one of the most ambitious open-source AI projects of 2026. The self-improving skill system, multi-platform messaging, and model flexibility make it genuinely different from the dozens of agent frameworks we've tested this year. If you've been looking for an AI agent you actually own — one that runs on your terms, gets better with use, and doesn't lock you into a subscription — this is it.
Get started at github.com/NousResearch/hermes-agent, and check back as we continue covering the best AI tools at PopularAiTools.ai.
Recommended AI Tools
Cockpit AI
Cockpit AI deploys autonomous AI revenue agents that research prospects, personalize outreach, follow up across channels, and book qualified meetings without human intervention. The most ambitious fully autonomous outbound tool we have tested in 2026.
View Review →
Google Gemini 3.1 Flash Live
We tested Google Gemini 3.1 Flash Live across coding, conversation, video analysis, and document processing. At 10-100x cheaper than GPT-5, it is the best value multimodal model in 2026 — with a real-time streaming experience that makes every other model feel sluggish.
View Review →
Venn.ai
Venn.ai is the missing permissions layer between your AI tools and business apps. It lets Claude, ChatGPT, Cursor, and VS Code access Salesforce, HubSpot, Gmail, Slack, and 20+ other apps with granular safety controls and audit logging.
View Review →
Parallel Code
Parallel Code dispatches 10+ AI coding agents simultaneously, each in isolated git worktrees. Free, open-source, supports Claude Code, Codex CLI, and Gemini CLI. A genuine force multiplier for experienced developers who want to parallelize batch coding work.
View Review →