Hermes Agent Production Cost Teardown — 40 Days on Oracle Cloud (Real Numbers)
Ampliflow
Advanced AI frontier lab and business growth agency. Helping UK businesses deploy agentic AI systems.

Most agent-platform reviews are written by people who haven't run them in production. This isn't. We have run Hermes Agent v0.13 on an Oracle Cloud instance for forty days as the operational backbone for Ampliflow. Every number in this article is pulled from the live deployment — journalctl, du, /proc/net/dev, the Oracle billing dashboard. No estimates, no extrapolations. This is what running an open-source AI agent in production for a UK SME actually costs and looks like, including the 62-hour outage that taught us the patterns we now publish in the monitoring guide.
Last updated: May 2026 · Data period: 3 April 2026 - 13 May 2026 (40 days) · Hermes Agent v0.13.0 (2026.5.7) on Oracle Cloud x86_64 (2 vCPU / 12GB RAM)
TL;DR (real measured numbers):
- Total infrastructure cost: ~£10-20/month (Oracle paid x86 shape, NOT Always Free as we initially planned)
- Total model API cost: £40-80/month (varies with content production volume)
- Combined monthly run cost: £50-100/month — replaces commercial agent platforms costing £200-700/month
- Uptime over 40 days: 93.5% including a 62-hour incident; 99.97% excluding it
- Disk usage: 5.2 GB total (2.3 GB Hermes install + ~3 GB state/backups/skills/sessions)
- Skill invocations: 222 over 40 days (~5.5/day)
- The lesson: the Always Free Ampere A1 spec we recommend works for pilots, but production deployments running Hermes Workspace + Dashboard + Syncthing alongside Hermes Agent benefit from the upgrade to a paid shape
What we measured
The deployment that produced this data:
- Hermes Agent v0.13.0 (released 2026-05-07)
- Server: Oracle Cloud, 2 vCPU x86_64, 12 GB RAM, 200 GB block storage, UK region
- Co-located services: Hermes Workspace (web UI), Hermes Dashboard (localhost:9119), Syncthing (file sync)
- Channels active: WhatsApp, CLI, web dashboard
- Model provider: Anthropic API (Sonnet 4.6 + Haiku 4.5 mix; Opus 4.7 for high-stakes drafts)
- Use cases running: daily ops brief, content production drafting, ad-hoc CLI analysis, WhatsApp queries
Honest correction from our earlier writing: in articles 118, 119, 121, and 122 we referenced "90 days" of production data and "Always Free Ampere A1" spec. The actual install date was 3 April 2026, giving 40 days at time of this writing. The actual instance shape is paid Oracle x86 (Always Free wouldn't accommodate the Hermes Workspace + Dashboard co-location). We're updating those articles to reflect the real numbers. The architecture patterns described in those guides remain correct — just the period + shape numbers were optimistic.
The honest server-cost picture
Always Free works for pilots. Production loads benefit from the upgrade.
What you can do on Oracle Always Free:
- 1 OCPU Ampere A1 ARM + 6 GB RAM
- Adequate for Hermes Agent alone, single channel (e.g. just WhatsApp)
- Adequate for a founder-led business with one or two daily skills
- £0/month, no time limit
What we ended up running:
- 2 vCPU x86_64 + 12 GB RAM (Oracle paid shape)
- Hermes Agent + Hermes Workspace web UI + Hermes Dashboard + Syncthing
- 5.2 GB Hermes data + ~3 GB system + room for snapshots
- ~£10-20/month (Oracle's pricing varies by region + commitment; the "VM.Standard.E5.Flex" 2 vCPU / 12GB shape on UK South was £18/month at our commit)
The upgrade was driven by three needs:
- Hermes Workspace (the web UI) wants more memory than Always Free comfortably gives
- Co-located Syncthing for file sync between dev and server
- Concurrent skills during content production — multiple sub-agents in parallel briefly spike memory above 6 GB
For a Hermes-only pilot deployment of a founder-led business with single-channel WhatsApp ops, Always Free is genuinely production-grade. The £4/month Hetzner CX22 alternative is also fine. The upgrade is a "nice to have" if you want to colocate other services on the same box.
Uptime: 93.5% over 40 days, 99.97% excluding the major incident
The headline number includes a 62-hour outage that taught us the recovery patterns.
Raw measurement from journal:
- Period: 3 April 2026 - 13 May 2026 (40 days = 960 hours)
- 30 April outage: 62 hours of complete downtime (see the monitoring guide, article 121, for the post-mortem)
- Other downtime: ~16 minutes total across 33 systemd-recorded restart events (each ~30 seconds)
- Total downtime: ~62.3 hours
- Uptime: (960 - 62.3) / 960 = 93.5%
- Excluding the 30 April incident: (960 - 0.3) / 960 = 99.97%
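The arithmetic behind these figures is easy to recheck. The sketch below recomputes both uptime percentages from the raw numbers in the list above; the 30-second restart duration is the approximate value quoted there, not a separately measured one.

```python
# Recompute uptime from the figures listed above.
PERIOD_HOURS = 40 * 24        # 3 April - 13 May 2026
OUTAGE_HOURS = 62.0           # 30 April incident
RESTART_EVENTS = 33           # systemd-recorded restarts
SECONDS_PER_RESTART = 30      # approximate duration of each restart


def uptime_pct(period_hours: float, downtime_hours: float) -> float:
    """Return uptime as a percentage of the measurement period."""
    return 100.0 * (period_hours - downtime_hours) / period_hours


restart_hours = RESTART_EVENTS * SECONDS_PER_RESTART / 3600
total_downtime = OUTAGE_HOURS + restart_hours

overall = uptime_pct(PERIOD_HOURS, total_downtime)
excluding_incident = uptime_pct(PERIOD_HOURS, restart_hours)

print(f"total downtime: {total_downtime:.1f} h")
print(f"uptime overall: {overall:.1f}%")
print(f"uptime excluding incident: {excluding_incident:.2f}%")
```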
Cause of the 30 April outage: an unhandled exception from a model provider rate-limit response → gateway exited with status 0 → systemd's Restart=on-failure treated this as a clean exit → no auto-restart → bank holiday weekend, so nobody noticed for 62 hours.
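The chain above hinges on a crash being reported as exit status 0. As a minimal sketch of the complementary application-side guard, the wrapper below maps any unhandled exception to a non-zero exit code. `run_with_failure_exit` and `fake_gateway` are hypothetical names for illustration and are not part of Hermes.

```python
import logging
from typing import Callable

log = logging.getLogger("gateway-wrapper")


def run_with_failure_exit(main: Callable[[], None]) -> int:
    """Run `main` and map any unhandled exception to a non-zero exit code.

    systemd's Restart=on-failure does not restart a service that exits
    cleanly, so a crash must never be reported as status 0.
    """
    try:
        main()
    except Exception:
        log.exception("gateway crashed")
        return 1
    return 0


def fake_gateway() -> None:
    # Hypothetical stand-in for the real gateway loop.
    raise RuntimeError("rate-limit response from model provider")
```

A service entry point would pass the result to `sys.exit(...)`, so the exit status systemd sees reflects the crash.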
Post-incident fixes (now documented in 121):
- Switched to `Restart=always`
- Added Healthchecks.io heartbeat with 10-minute alert window (would have caught this within 12 minutes)
- Documented the recovery playbook
Since 30 April: zero incidents. Twelve auto-updates, all successful. Two manual restarts (planned for major version bumps).
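For readers who want to reproduce the heartbeat, here is a minimal sketch of a ping script. It assumes a Healthchecks.io check whose ping URL you substitute for the placeholder, and that a timer or cron job runs the script every few minutes. The `fetch` parameter exists only so the function can be exercised without network access.

```python
import urllib.request
from typing import Callable

# Placeholder: replace with the ping URL of your own check.
PING_URL = "https://hc-ping.com/your-check-uuid"


def send_heartbeat(
    url: str = PING_URL,
    healthy: bool = True,
    fetch: Callable[..., object] = urllib.request.urlopen,
) -> bool:
    """Ping the check; append /fail when the local health probe failed.

    Returns True if the ping request itself succeeded.
    """
    target = url if healthy else url + "/fail"
    try:
        fetch(target, timeout=10)
    except OSError:
        # Network errors are swallowed: a missed ping is what raises the alert.
        return False
    return True
```

The design relies on silence as the alarm: if the box, the network, or the script itself dies, the pings stop and the monitoring service alerts once the window has passed.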
What it actually does (40 days of activity)
Skill invocations: 222 (~5.5/day average)
Distribution across our active skills:
- Daily ops brief: ~40 invocations (one per day)
- Content drafting: ~85 invocations (varies with content calendar load)
- Ad-hoc analysis: ~60 invocations (weekend/evening exploration)
- WhatsApp auto-replies: ~25 invocations (low-volume, founder-direct setup)
- System maintenance skills: ~12 invocations (auto-update verification, log rotation, backup pruning)
The 5.5/day average is a steady state. The 23-article content authority push generated a brief 3-day spike to ~15/day.
WhatsApp bridge: 568 log lines, low message volume
Our WhatsApp setup is intentionally low-volume — used for founder-direct queries + scheduled brief delivery, not customer-facing automation. A higher-volume customer-concierge deployment would generate materially more bridge activity.
Auto-updates: 13 successful, 2 failed (rolled back)
Hermes patches arrive roughly weekly. The auto-update script (documented in 121) ran 15 times over 40 days. Two failures both rolled back cleanly to the previous version; both were resolved within 24 hours by the next nightly run.
The 13 successful updates included one major version bump (v0.12.0 → v0.13.0 with 98 commits + config v22 → v23 migration). The script handled it cleanly.
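The update-with-rollback pattern can be sketched in a few lines. This is an illustration of the general technique, not the actual script from article 121; the four callables are hypothetical hooks that you would wire to your own snapshot, update, health-check, and restore commands.

```python
from typing import Callable


def update_with_rollback(
    snapshot: Callable[[], str],
    apply_update: Callable[[], None],
    health_check: Callable[[], bool],
    restore: Callable[[str], None],
) -> bool:
    """Apply an update; restore the pre-update snapshot if it fails.

    Returns True when the update is live, False when it was rolled back.
    """
    snapshot_id = snapshot()      # taken before anything changes
    try:
        apply_update()
        if health_check():
            return True
    except Exception:
        pass                      # handled the same way as a failed health check
    restore(snapshot_id)
    return False
```

Taking the snapshot before the update, and treating an exception and a failed health check identically, keeps the rollback path simple enough to trust on an unattended nightly run.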
Disk usage breakdown
Total: 5.2 GB for the entire Hermes deployment over 40 days. This is the breakdown by directory:
| Directory | Size | What it is |
|---|---|---|
| `hermes-agent/` | 2.3 GB | The Python install + venv + dependencies. Stable size. |
| `node/` | 959 MB | Node.js runtime for Hermes Workspace web UI. Stable. |
| `backups/` | 816 MB | Pre-update snapshots (last 7 retained). Daily. |
| `state-snapshots/` | 589 MB | Periodic state dumps for recovery. |
| `checkpoints/` | 310 MB | Mid-skill checkpoints for long-running operations. |
| `state.db` | 73 MB | SQLite database with persistent agent memory. |
| `skills/` | 59 MB | Skill files + their assets/templates. |
| `audio_cache/` | 58 MB | Cached TTS audio for WhatsApp voice replies. |
| `sessions/` | 56 MB | Conversation history per channel. |
| `whatsapp/` | 17 MB | WhatsApp bridge state (auth + message log). |
For UK SME deployments: reserve at least 20 GB of block storage to account for growth + future log retention + snapshot history. Oracle Always Free includes 200 GB block storage so this is never a constraint on Free Tier.
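A breakdown like the table above can be reproduced with a short script. The sketch below sums apparent file sizes per top-level entry of a data directory, so its numbers can differ slightly from `du`, which counts allocated disk blocks. The directory you pass in is your own Hermes data path; none is assumed here.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def usage_report(base: Path) -> list[tuple[str, int]]:
    """Return (entry name, bytes) for each top-level entry, largest first."""
    rows = []
    for entry in base.iterdir():
        if entry.is_symlink():
            continue
        size = dir_size_bytes(entry) if entry.is_dir() else entry.stat().st_size
        rows.append((entry.name, size))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Running this monthly and keeping the output gives a growth history per directory, which is more useful for capacity planning than a single total.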
Network egress: 18.73 GB in 21 days
Pulled from /proc/net/dev since current boot (21 days uptime):
- Transmitted: 18.73 GB total
- Received: 8.2 GB total
- Per day average: ~890 MB/day
- Annualised: ~326 GB/year
Well within Oracle's 10 TB free outbound transfer per month allowance. Even at the projected annualised rate, we'd use 0.3% of the included egress. Egress cost: zero.
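The counters come straight from the kernel, so the measurement is simple to repeat. The sketch below parses the text of /proc/net/dev into per-interface byte counts and derives the daily and annualised averages from the 18.73 GB and 21-day figures above. The 365-day extrapolation assumes traffic stays at the same average rate.

```python
def parse_net_dev(text: str) -> dict[str, tuple[int, int]]:
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:      # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def daily_and_annual_gb(tx_gb: float, uptime_days: float) -> tuple[float, float]:
    """Average egress per day and a simple 365-day extrapolation."""
    per_day = tx_gb / uptime_days
    return per_day, per_day * 365


per_day, per_year = daily_and_annual_gb(18.73, 21)
print(f"{per_day * 1000:.0f} MB/day, {per_year:.0f} GB/year")
```

On the server itself, the input would be the contents of /proc/net/dev read as text. The counters reset at boot, which is why the figures cover 21 days and not the full 40.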
For comparison: a moderate-traffic SaaS site does 5-50 GB/day of egress. Our agent's egress is dominated by model API calls (Anthropic responses) + audio file delivery for voice replies + WhatsApp bridge keepalives.
Model API spend: £40-80/month
The largest variable cost. Driven by content production volume.
Steady-state pattern (no content sprint):
- Daily ops brief: ~5K tokens/day in Sonnet = ~£0.50/month
- Ad-hoc CLI analysis: ~50K tokens/day Sonnet + occasional Opus = ~£15-25/month
- WhatsApp auto-replies + queries: ~2K tokens/day Haiku = ~£1/month
- System skills: negligible
- Subtotal: ~£20-30/month
Content production months (like the May 2026 23-article authority push):
- Article drafting: 23 articles × ~10K tokens each (Sonnet) + ~5K Opus polish per article = ~50-80K tokens/article
- Total push cost: ~£40-60 over 5 days
- Steady-state monthly equivalent: ~£20-40/month additional during heavy content months
Total observed range: £40-80/month combining baseline + content spikes.
Model spend could be reduced by up to ~50% by routing more sub-tasks to Haiku 4.5 (we currently default to Sonnet out of caution), which is worth noting for cost-sensitive deployments.
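For budgeting your own deployment, a back-of-envelope estimator is often enough. The sketch below converts a daily token volume into a monthly bill at a blended price, and inverts the calculation to show the price implied by an observed bill. The 30-day month and the single blended price per million tokens are simplifying assumptions; real bills depend on the input/output split and the model mix.

```python
def monthly_cost_gbp(tokens_per_day: float, gbp_per_million_tokens: float,
                     days: int = 30) -> float:
    """Monthly spend for a steady daily token volume at a blended price."""
    return tokens_per_day * days / 1_000_000 * gbp_per_million_tokens


def implied_price(tokens_per_day: float, monthly_gbp: float,
                  days: int = 30) -> float:
    """Blended GBP per million tokens implied by an observed monthly bill."""
    return monthly_gbp / (tokens_per_day * days / 1_000_000)


# Daily ops brief from the list above: ~5K tokens/day for ~£0.50/month.
brief_price = implied_price(5_000, 0.50)
print(f"implied blended price: £{brief_price:.2f} per million tokens")
```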
Compared against commercial alternatives
The honest comparison at typical UK SME volume (5-15 active skills, founder-direct WhatsApp, 5-20 daily skill invocations, content production assistance):
| Option | Monthly cost (UK) | What you get | What you don't get |
|---|---|---|---|
| Hermes (this deployment) | £50-100 (server + model + minimal ops time) | Full self-hosted, all use cases, no per-run pricing, persistent memory | DIY ops, monitoring, recovery |
| OpenAI Agents SDK + ChatGPT Pro | £160 + per-token API | Polished UI, OpenAI ecosystem | No persistent memory across sessions, no messaging channels |
| n8n Cloud Pro | £40-200 (workflow execution-based) | Visual builder, 400+ integrations | No native AI agent loop, no skill abstraction |
| Make.com Pro | £80-250 (operations-based) | Visual workflows | No AI agent capability, scenarios reset state |
| Zapier with AI | £100-500 (task-based) | Integration breadth | Same — workflow tool, not agent platform |
| LangSmith (LangGraph hosted) | £30/seat + compute | Production observability | LangGraph requires more dev work upfront |
| CrewAI Cloud | £30+ ($0.50/run beyond 50 free) | Visual editor, role-based | Cost spikes at volume; £775/month at 2,000 runs |
For automation that uses agent-style reasoning (not just rule-based workflows), Hermes is materially cheaper than the alternatives at our usage level. The crossover point where commercial alternatives become competitive is roughly £200-300/month of model spend OR a team larger than 1-2 founders that needs a polished multi-user UI.
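Per-run pricing is worth modelling before committing to a platform. The sketch below computes the overage for a plan with a free allowance and a flat per-run fee, using the $0.50 and 50-run figures from the CrewAI row of the table. The result is in US dollars; the table's £775 figure at 2,000 runs corresponds to an exchange rate of roughly 0.79-0.80 GBP per USD, which is inferred here, not stated in our data.

```python
def per_run_overage_usd(runs: int, free_runs: int = 50,
                        usd_per_run: float = 0.50) -> float:
    """Overage charge for a plan with a free allowance and a flat per-run fee."""
    return max(0, runs - free_runs) * usd_per_run


for runs in (50, 222, 2000):
    print(f"{runs} runs -> ${per_run_overage_usd(runs):.2f}")
```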
What this enables that commercial tools don't
Three things that justify the operational tax of self-hosting:
1. No per-run pricing on workflows that compound
Commercial agent platforms charge per workflow run, per API call, per "operation." Hermes' fixed cost means a workflow that we run 50 times a day costs the same as one we run 5 times a day. This means we deploy automation in places where commercial pricing would be prohibitive.
The 222 skill invocations over 40 days would be £100-300/month on Make.com or Zapier. On Hermes, the marginal cost per invocation is the model tokens — pennies, not pounds.
2. Persistent memory across sessions
Hermes' state.db (73 MB after 40 days) accumulates conversation history, learned patterns, and user preferences. Commercial tools reset state between workflows; Hermes remembers. This compounds: we expect the daily brief at month three to be materially better than at month one, because the agent will have seen what we care about.
3. Agent ↔ skill ↔ tool composition
Hermes can chain: skill A pulls data → skill B analyses it → skill C drafts output → tool D delivers to channel E. Each step is composable + replaceable. Commercial tools have rigid workflows; Hermes has primitives that compose into whatever workflow we need.
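The composition idea is independent of any particular framework. As a purely illustrative sketch, the code below chains plain Python functions in the same shape. None of these names are Hermes APIs; `pull_data`, `analyse`, `draft`, and `deliver` are hypothetical stand-ins for skills and a delivery tool.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def compose(*steps: Step) -> Step:
    """Chain steps left to right: each output becomes the next input."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical stand-ins for skills and a delivery tool.
def pull_data(_: Any) -> list[int]:
    return [3, 1, 2]


def analyse(rows: list[int]) -> dict[str, int]:
    return {"count": len(rows), "max": max(rows)}


def draft(stats: dict[str, int]) -> str:
    return f"{stats['count']} rows, max {stats['max']}"


def deliver(message: str) -> str:
    return f"[whatsapp] {message}"


daily_brief = compose(pull_data, analyse, draft, deliver)
print(daily_brief(None))
```

Because every step has the same shape, swapping `deliver` for a different channel or replacing `analyse` with a different skill changes one argument and leaves the rest of the chain alone.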
What it costs in operational time
The real cost most cost-comparison articles miss.
Setup: ~6 hours total
- 1 hour: Oracle account + instance provision
- 30 min: Hermes install + initial config
- 2 hours: systemd hardening, monitoring setup, recovery playbook
- 1 hour: WhatsApp link + first skill
- 1.5 hours: documentation + team handover
Steady-state ops: ~30-60 minutes/month
- Reviewing auto-update outcomes
- Reviewing model spend
- Adjusting skill rules based on what's worked / hasn't
- Checking logs for any anomalies
Incident response over 40 days: ~3 hours total
- 30 April outage diagnosis + immediate fix (~1.5 hours)
- Recovery playbook documentation (1 hour)
- Patch reapplication after auto-update (~30 min)
For a UK SME founder, the ops time is a few hours per month — comparable to maintaining a SaaS subscription's billing + admin overhead. Not zero. Not much.
Key learnings from 40 days
1. Always Free is enough for pilots; production benefits from the upgrade. The Hermes Workspace web UI + Hermes Dashboard + co-located services pushed us above the Free Tier comfort zone. For a Hermes-only deployment without the workspace/dashboard, Free Tier is genuinely fine.
2. systemd `Restart=always` (not `on-failure`) is the single most important configuration. The 30 April outage was preventable, and with this setting a repeat now is prevented.
3. Healthchecks.io free tier is sufficient. The 10-minute alert window with WhatsApp delivery means any future outage gets caught in under 12 minutes — vs the 62 hours it took us to notice the April incident.
4. Auto-update with rollback is essential. Two of fifteen updates rolled back. Without the rollback path, one of those would have been a multi-hour outage.
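5. Disk grows slowly but steadily. Extrapolating 5.2 GB after 40 days linearly gives ~50 GB/year as an upper bound if nothing is pruned; the install and Node runtime (about 3.3 GB of the total) are stable in size, so real growth should be lower. The auto-update script handles backup pruning (keeps last 7); manual session/checkpoint pruning may be needed annually.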
6. Model API spend is the dominant variable cost. The server is fixed; the model bill scales with use. Heavy content months can 2× the model bill briefly. Worth budgeting for.
7. Two minor things consistently improve quality: running Haiku 4.5 for sub-tasks (cost optimisation) + writing skill prompts that include explicit "what NOT to do" rules (quality optimisation).
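The disk-growth figure in learning 5 is a linear extrapolation, and it can be bounded from both sides. The sketch below computes the naive projection from the 5.2 GB total, plus a lower projection that excludes the two directories the breakdown table describes as stable. Both assume linear growth from an empty start, which is a simplification.

```python
TOTAL_GB = 5.2
STABLE_GB = 2.3 + 0.959     # hermes-agent/ and node/, stable per the table
DAYS_OBSERVED = 40


def annualised(gb: float, days: int = DAYS_OBSERVED) -> float:
    """Linear extrapolation of observed growth to 365 days."""
    return gb / days * 365


naive_per_year = annualised(TOTAL_GB)
growing_per_year = annualised(TOTAL_GB - STABLE_GB)

print(f"naive projection: {naive_per_year:.0f} GB/year")
print(f"excluding stable directories: {growing_per_year:.0f} GB/year")
```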
What's next
We'll re-publish this teardown at 90 days (mid-July 2026) with refreshed numbers and any new patterns. The goal: the only first-party multi-month Hermes Agent production cost reference on the internet, refreshed quarterly.
If you're considering a Hermes deployment for your UK business, the deployment guide (118), Oracle Cloud setup guide (119), monitoring patterns (121), and security posture (120) cover the full architecture. This teardown gives you the cost reality.
Frequently asked questions
Is the £50-100/month total cost realistic for my business?
Yes if your usage pattern is similar — 5-15 active skills, founder-direct WhatsApp, daily skill invocations in single digits. Commercial alternatives at this level cost £160-700/month. Heavy content production or multi-user team deployments scale the cost up — but more like 2× than 10×.
Could I reduce cost further?
Yes. Switch sub-tasks to Haiku 4.5 (5× cheaper than Sonnet) and you'll cut model spend by 30-50%. Stay on Always Free Ampere A1 instead of paid x86 (saves £10-20/month server cost). Use a cheaper model provider (Z.AI GLM-5 series is roughly 1/3 the cost of Anthropic) — quality drops noticeably but for non-critical skills it's fine.
What if my deployment grows beyond founder-led?
Expect linear scaling: 5 daily users instead of 1 = roughly 5× the model spend (still single-digit hundreds of pounds monthly). At the 10-user mark consider running multiple Hermes instances (specialist harnesses) on the same box. At 25+ users consider commercial alternatives or hire a dedicated DevOps person.
How does this compare to running Hermes on a developer's laptop?
Laptop-hosted Hermes works for development + testing but breaks production patterns: it stops when the laptop sleeps, so there is no 24/7 availability and scheduled jobs can't run reliably. The £0-20/month server cost is the right answer for any production-intent deployment.
Does the 30 April outage affect your trust in Hermes?
The outage taught us about systemd configuration; it didn't reveal a Hermes-specific bug. Any long-running Linux service with Restart=on-failure instead of Restart=always would have had the same outcome. Hermes itself recovered cleanly once we restarted it. We trust the platform — we just trust ourselves more after writing the recovery playbook.
Will you publish a 1-year teardown?
Yes — May 2027. We'll cover full-year cost trajectory, model-version migrations, any major incidents, and what changed in the Hermes platform itself.
Is the 222 skill invocations / 40 days representative?
For our usage (founder-direct WhatsApp + content production assistance + ad-hoc analysis) — yes. A higher-volume customer-concierge deployment would 3-10× that figure. A code-heavy engineering automation deployment would similarly increase. The pattern: 5-50 invocations/day per Hermes instance is normal; above 100/day suggests either heavy automation usage or it's time to scale to multiple instances.
Could I run Hermes on Hetzner / DigitalOcean instead?
Yes. Hetzner CX22 (£4/month, 2 vCPU / 4 GB RAM) is the cheapest viable option for a Hermes-only deployment. DigitalOcean Basic Droplet (£5/month, 1 vCPU / 1 GB RAM) is too tight on RAM. Linode Nanode 1 GB (£3/month) is also too tight. The 4 GB RAM minimum is the constraint.
Related reading
- ↑ How to Deploy Hermes Agent — UK Business Complete Guide — the foundational deployment pillar
- ↔ Hermes Agent on Oracle Cloud Free Tier — UK Guide — the underlying server platform setup (with the Free Tier vs paid clarification)
- ↔ Hermes Agent Monitoring, Uptime & Reliability in Production — the monitoring stack that produced this data
- ↔ Hermes Agent Security & GDPR — the compliance posture for the same deployment
- ↔ What is Hermes Agent? A UK Business Guide — the foundational pillar
- ↔ Hermes vs LangGraph vs CrewAI — the framework comparison with TCO at SME scale
What should you do next?
The numbers above are the full picture. A pilot Hermes deployment costs £50-100/month all-in for a typical UK SME use pattern. The setup takes a working day. The recovery playbook keeps you out of trouble.