Turn your hardware into an agent cluster.
Define your machines in YAML. harombe orchestrates autonomous agents across all of them — with container-isolated security, persistent memory, and zero cloud dependencies.
Run autonomous agents across your own machines. No cloud required.
Capabilities
Built In, Not Bolted On
Distributed Clusters
Route work across any mix of hardware — M1 laptops, NVIDIA workstations, CPU-only boxes. Define the topology in YAML. harombe picks the right node for each query.
Container-Isolated Security
Every tool call runs in its own Docker container with its own network rules. Not a sandbox abstraction — actual isolation.
Autonomous Execution
Agents plan, act, and recover autonomously. You define the tools and safety gates — harombe handles the loop.
Memory & RAG
Persistent conversations with cross-session semantic search, fully local. Ask a question today, follow up next week — it still has the context.
MCP Protocol
Expose tools to external clients and connect to external MCP servers. Drop into Claude Desktop or any MCP-compatible client.
Multi-Agent Delegation
Named agent blueprints that delegate tasks to specialized sub-agents. Chain them together for complex workflows.
Voice Interface
Speech-to-text and text-to-speech with voice activity detection. Talk to your agents, hands-free.
Privacy Router
Configurable boundary between local-only, hybrid, and cloud inference. You control exactly what leaves your network.
Distributed by Default
Your Machines, One Framework
A MacBook for quick questions. A workstation for real work. A server for heavy reasoning. harombe routes to the right one automatically.
# ~/.harombe/harombe.yaml
cluster:
  routing:
    prefer_local: true
    fallback_strategy: graceful
    load_balance: true
  nodes:
    - name: laptop
      host: localhost
      port: 8000
      model: qwen2.5:3b
      tier: 0 # Fast / local
    - name: workstation
      host: 192.168.1.100
      port: 8000
      model: qwen2.5:14b
      tier: 1 # Balanced
    - name: server
      host: server.local
      port: 8000
      model: qwen2.5:72b
      tier: 2 # Powerful
Get Started
Four Commands, Under Five Minutes
Install
One command to get started
Initialize
Detects hardware, writes config
Pull a model
Download a model via Ollama
Start
Autonomous agent with tools
Example Interaction
$ harombe chat
Agent ready. Tools: shell, filesystem, web_search
How can I help?
> Find all Python files in src/ and count them
[tool:shell] find src/ -name "*.py" | wc -l
47
Found 47 Python files in src/.
> Which ones have the most imports?
[tool:shell] for f in $(find src/ -name "*.py"); do ...
[tool:read_file] src/harombe/security/gateway.py
The top 3 files by import count:
1. src/harombe/security/gateway.py — 23 imports
2. src/harombe/agent/loop.py — 19 imports
3. src/harombe/coordination/cluster.py — 17 imports
Use Cases
Built for Real-World Applications
Audit a codebase without sending it to the cloud
Point harombe at a repo. It spins up isolated containers, runs analysis tools, and gives you a report. Nothing leaves your machine.
Research agent that remembers everything
Voice-enabled, persistent memory, web search. Ask a question today, follow up next week — it still has the context.
Three machines, one agent
Your laptop handles quick lookups. Your workstation runs 14B models for real analysis. Your server tackles the 72B tasks. One YAML file ties them together.
Slack bot with guardrails
Drop an agent into your team's Slack. It can execute tools, but every dangerous action goes through human approval first.
Architecture
Six-Layer Stack
Clients
Voice, CLI, REST API, Slack, Discord
- Voice interface with push-to-talk and VAD
- Interactive CLI with Rich formatting
- REST API with SSE streaming
- Channel integrations (Slack, Discord, WebSocket)
Privacy Router
Hybrid local/cloud AI with PII detection
- Three modes: local-only, hybrid, cloud-assisted
- PII detection and redaction before cloud calls
- Context sanitization with sensitivity classification
- Configurable per-agent privacy boundary
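Redaction before a cloud call can be pictured as a small text filter. This is a minimal sketch, not harombe's actual detector: the `redact_pii` function, the `PII_PATTERNS` table, and both regexes are assumptions made for illustration, and a real detector would cover many more categories.

```python
import re

# Hypothetical patterns for illustration only; not harombe's actual rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact_pii(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a [LABEL] placeholder and count hits per label."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


if __name__ == "__main__":
    clean, found = redact_pii("Mail bob@example.com from 192.168.1.100")
    print(clean)
    print(found)
```

Returning the per-label counts alongside the cleaned text lets a caller decide whether a prompt is too sensitive to send at all, rather than only masking it.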
Agent & Memory
ReAct loop, tools, plugins, delegation
- ReAct loop with multi-step planning and tool calling
- Multi-agent delegation with named blueprints
- Container-isolated plugin system with MCP scaffolds
- SQLite conversations + ChromaDB vector search
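The ReAct loop boils down to: decide, call a tool, observe, repeat. The sketch below shows that shape under stated assumptions. `run_agent` and the `decide` callback are hypothetical names, and `decide` stands in for the model call. None of this is harombe's actual API.

```python
from typing import Callable

Step = tuple  # ("call", tool_name, arg) or ("final", answer)


def run_agent(
    decide: Callable[[list], Step],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate between deciding and acting until a final answer or the step limit."""
    history = [("user", question)]
    for _ in range(max_steps):
        step = decide(history)
        if step[0] == "final":
            return step[1]
        _, tool, arg = step
        try:
            observation = tools[tool](arg)
        except Exception as exc:
            # Feed the error back as an observation so the next decision can recover.
            observation = f"error: {exc}"
        history.append(("tool", f"{tool}({arg!r}) -> {observation}"))
    return "stopped: step limit reached"
```

The step limit is the simplest safety gate: an agent that never reaches a final answer stops instead of looping forever.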
Security
Defense-in-depth with container isolation
- MCP Gateway: every tool call goes through it
- Per-container Docker isolation with resource limits
- Network egress filtering (iptables, DNS allowlists)
- HITL gates with risk classification, audit logging, Vault secrets
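A human-in-the-loop gate with risk classification can be sketched in a few lines. Everything named here (`Risk`, `TOOL_RISK`, `ToolCall`, `gate`) is hypothetical and chosen for illustration. The risk levels assigned to each tool are assumptions, not harombe's real policy.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable


class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


# Hypothetical risk table; tools not listed are treated as HIGH.
TOOL_RISK = {"read_file": Risk.LOW, "web_search": Risk.MEDIUM, "shell": Risk.HIGH}


@dataclass
class ToolCall:
    tool: str
    args: dict = field(default_factory=dict)


def gate(
    call: ToolCall,
    approve: Callable[[ToolCall], bool],
    threshold: Risk = Risk.HIGH,
) -> bool:
    """Return True if the call may run; ask a human at or above the threshold."""
    risk = TOOL_RISK.get(call.tool, Risk.HIGH)
    if risk < threshold:
        return True  # below the threshold: no approval needed
    return approve(call)  # at or above: defer to the human's decision
```

Defaulting unknown tools to the highest risk means a newly added tool needs approval until someone classifies it.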
Orchestration
Cluster config, mDNS discovery, circuit breakers
- YAML-based cluster configuration
- Smart routing by query complexity
- mDNS auto-discovery on local network
- Circuit breakers, health monitoring, metrics
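Tier-based routing with a circuit breaker might look like the sketch below. It is an illustration under stated assumptions, not harombe's router: the `Node` class, `estimate_tier`, `pick_node`, the word-count heuristic, and the failure and cooldown defaults are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    tier: int
    failures: int = 0
    open_until: float = 0.0  # node is skipped until this timestamp

    def available(self, now: float) -> bool:
        return now >= self.open_until

    def record_failure(self, now: float, max_failures: int = 3, cooldown: float = 30.0) -> None:
        """Open the circuit for `cooldown` seconds after `max_failures` failures."""
        self.failures += 1
        if self.failures >= max_failures:
            self.open_until = now + cooldown
            self.failures = 0


def estimate_tier(query: str) -> int:
    """Crude complexity heuristic (an assumption): longer queries get a higher tier."""
    words = len(query.split())
    if words < 15:
        return 0
    if words < 60:
        return 1
    return 2


def pick_node(nodes: list[Node], query: str, now: Optional[float] = None) -> Optional[Node]:
    """Pick the available node whose tier is closest to the estimated tier."""
    now = time.monotonic() if now is None else now
    wanted = estimate_tier(query)
    candidates = [n for n in nodes if n.available(now)]
    if not candidates:
        return None
    # Closest tier wins; ties go to the lower tier.
    return min(candidates, key=lambda n: (abs(n.tier - wanted), n.tier))
```

With a laptop (tier 0), workstation (tier 1), and server (tier 2), a short query goes to the laptop. If the laptop's circuit is open, the same query falls through to the workstation until the cooldown expires.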
Runtimes
llama.cpp, Whisper, TTS, embeddings
- Ollama for model management (any hardware)
- Whisper STT (tiny to large-v3)
- Piper/Coqui TTS for speech output
- sentence-transformers for local embeddings
Open Source
harombe is free and open source software, licensed under Apache 2.0. Fork it, modify it, deploy it commercially — no restrictions. Contributions are welcome.
Four commands to your first agent.
Scale to a cluster when you're ready.
Open source · Apache 2.0 · Self-hosted · Privacy-first