v0.3.0 · Apache 2.0

Turn your hardware into an agent cluster.

Define your machines in YAML. harombe orchestrates autonomous agents across all of them — with container-isolated security, persistent memory, and zero cloud dependencies.


Run autonomous agents across your own machines. No cloud required.

Capabilities

Built In, Not Bolted On

01

Distributed Clusters

Route work across any mix of hardware — M1 laptops, NVIDIA workstations, CPU-only boxes. Define the topology in YAML. harombe picks the right node for each query.

mDNS discovery · Circuit breakers · Load balancing · Health monitoring
02

Container-Isolated Security

Every tool call runs in its own Docker container with its own network rules. Not a sandbox abstraction — actual isolation.

MCP Gateway · Per-tool egress filtering · Audit logging · HITL gates
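
As a rough illustration of per-call isolation, the sketch below assembles a `docker run` command line that gives one tool call a throwaway container with no network and capped resources. The helper name `build_tool_command`, the image, and the specific limits are assumptions made for this example, not harombe's actual implementation; only the Docker flags themselves are standard.

```python
import shlex

def build_tool_command(image, tool_argv, allow_network=False,
                       memory="256m", cpus="0.5"):
    """Build a `docker run` argv for one isolated tool call (hypothetical helper)."""
    cmd = [
        "docker", "run",
        "--rm",                # remove the container when the call finishes
        "--read-only",         # mount the root filesystem read-only
        "--cap-drop", "ALL",   # drop all Linux capabilities
        "--pids-limit", "64",  # cap the number of processes
        "--memory", memory,    # memory limit
        "--cpus", cpus,        # CPU limit
    ]
    if not allow_network:
        cmd += ["--network", "none"]  # no network interfaces besides loopback
    cmd.append(image)
    cmd += list(tool_argv)
    return cmd

# Example: a code-execution tool call with networking disabled.
cmd = build_tool_command("python:3.12-slim", ["python", "-c", "print(2 + 2)"])
print(shlex.join(cmd))
```

The function only builds the argument list; handing it to `subprocess.run` would start the container on a machine with Docker installed.
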
03

Autonomous Execution

Agents plan, act, and recover autonomously. You define the tools and safety gates — harombe handles the loop.

Shell · Filesystem · Web search · Browser automation · Code execution
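
The plan, act, recover loop can be sketched in a few lines. This is a minimal illustration under stated assumptions: `run_agent`, the decision dictionary shape, and the scripted stand-in for a model are all hypothetical and are not harombe's API.

```python
def run_agent(task, model, tools, max_steps=5):
    """Ask the model for a decision, run the chosen tool, feed the result back."""
    history = [("task", task)]
    for _ in range(max_steps):
        decision = model(history)
        if decision["type"] == "final":
            return decision["text"]
        name, args = decision["tool"], decision["args"]
        try:
            observation = tools[name](**args)
        except Exception as exc:
            # Recovery: report the failure to the model instead of crashing.
            observation = f"error: {exc}"
        history.append((name, observation))
    return "stopped: step limit reached"

def scripted_model(history):
    """Stand-in for an LLM: call one tool, then answer (hypothetical)."""
    if len(history) == 1:
        return {"type": "tool", "tool": "count_words",
                "args": {"text": history[0][1]}}
    return {"type": "final", "text": f"The task has {history[-1][1]} words."}

tools = {"count_words": lambda text: len(text.split())}
answer = run_agent("count the words here", scripted_model, tools)
print(answer)
```

The step limit is the simplest possible safety gate: a model that never produces a final answer is cut off after `max_steps` tool calls.
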
04

Memory & RAG

Persistent conversations with cross-session semantic search, fully local. Ask a question today, follow up next week — it still has the context.

SQLite · ChromaDB · sentence-transformers · Context windowing
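
To make the idea of cross-session recall concrete, here is a small sketch that stores messages in SQLite and ranks them against a query by cosine similarity. It deliberately swaps in a toy bag-of-words vector for a real embedding model and skips the vector database entirely; the table layout and the `remember` and `recall` helpers are assumptions for illustration.

```python
import math
import sqlite3
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

db = sqlite3.connect(":memory:")  # a file path here would persist across sessions
db.execute("CREATE TABLE messages (session TEXT, role TEXT, content TEXT)")

def remember(session, role, content):
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (session, role, content))
    db.commit()

def recall(query, k=1):
    """Return the k stored messages most similar to the query."""
    rows = db.execute("SELECT session, content FROM messages").fetchall()
    q = embed(query)
    return sorted(rows, key=lambda r: cosine(q, embed(r[1])), reverse=True)[:k]

remember("monday", "user", "the staging database runs postgres 15")
remember("monday", "user", "lunch is at noon")
hits = recall("which postgres version does staging run")
print(hits[0][1])
```
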
05

MCP Protocol

Expose tools to external clients and connect to external MCP servers. Drop into Claude Desktop or any MCP-compatible client.

JSON-RPC 2.0 · stdio/HTTP transport · Claude Desktop compatible
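
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request and validates a reply. The tool name `shell` and its `command` argument are assumptions for this example; a server's real tool names and schemas come from its `tools/list` response.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

def parse_response(raw, expected_id):
    """Return the result payload, or raise if the reply is an error or a mismatch."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("id") != expected_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in msg:
        raise RuntimeError(f"{msg['error']['code']}: {msg['error']['message']}")
    return msg["result"]

# Hypothetical tool name and arguments.
request = make_tool_call(1, "shell", {"command": "ls"})

# A reply shaped like an MCP tool result (hand-written here, not from a server).
reply = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "README.md"}],
               "isError": False},
})
result = parse_response(reply, expected_id=1)
print(result["content"][0]["text"])
```
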
06

Multi-Agent Delegation

Named agent blueprints that delegate tasks to specialized sub-agents. Chain them together for complex workflows.

Agent registry · Delegation chains · Cycle detection · Depth limits
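
Cycle detection and depth limits amount to a check on the delegation chain before each hop. The following sketch assumes a hypothetical registry that maps each agent name to the sub-agents it may call; the agent names and the `check_delegation` helper are illustrative only.

```python
class DelegationError(Exception):
    pass

# Hypothetical registry: agent name -> sub-agents it may delegate to.
REGISTRY = {
    "lead": ["researcher", "coder"],
    "researcher": ["summarizer"],
    "coder": [],
    "summarizer": ["researcher"],  # would loop back if followed
}

def check_delegation(chain, target, max_depth=3):
    """Validate a hop from the last agent in `chain` to `target`.

    Returns the extended chain, or raises DelegationError.
    """
    if target not in REGISTRY:
        raise DelegationError(f"unknown agent: {target}")
    if chain and target not in REGISTRY[chain[-1]]:
        raise DelegationError(f"{chain[-1]} may not delegate to {target}")
    if target in chain:
        raise DelegationError("cycle: " + " -> ".join(chain + [target]))
    if len(chain) >= max_depth:
        raise DelegationError(f"depth limit {max_depth} reached")
    return chain + [target]

chain = check_delegation([], "lead")
chain = check_delegation(chain, "researcher")
chain = check_delegation(chain, "summarizer")
print(" -> ".join(chain))

try:
    check_delegation(chain, "researcher")
except DelegationError as err:
    print(err)
```
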
07

Voice Interface

Speech-to-text and text-to-speech with voice activity detection. Talk to your agents, hands-free.

Whisper STT · Piper TTS · VAD · Push-to-talk · WebSocket streaming
08

Privacy Router

Configurable boundary between local-only, hybrid, and cloud inference. You control exactly what leaves your network.

PII detection · Data sanitization · Per-agent routing modes
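
One way to picture the privacy boundary is a redaction pass followed by a routing decision. The sketch below uses three simple regular expressions and an assumed meaning for each mode: local-only never leaves the machine, hybrid keeps any prompt containing PII local, and cloud sends a redacted copy. The patterns, mode semantics, and function names are assumptions, and real PII detection needs far more than regexes.

```python
import re

# Deliberately simple patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sanitize(text):
    """Replace each match with a placeholder and report which kinds were found."""
    found = []
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[{label}]", text)
    return text, found

def route(text, mode):
    """Return (destination, payload) for an assumed set of routing modes."""
    clean, found = sanitize(text)
    if mode == "local-only":
        return "local", text
    if mode == "hybrid" and found:
        return "local", text   # PII present: keep the original on-device
    return "cloud", clean      # only the redacted text leaves the network

print(sanitize("mail bob@example.com from 10.0.0.5"))
print(route("mail bob@example.com", "cloud"))
```
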

Distributed by Default

Your Machines, One Framework

A MacBook for quick questions. A workstation for real work. A server for heavy reasoning. harombe routes to the right one automatically.

Coordinator (query routing)

  • Tier 0: Laptop, qwen2.5:3b
  • Tier 1: Workstation, qwen2.5:14b
  • Tier 2: Server, qwen2.5:72b

# ~/.harombe/harombe.yaml
cluster:
  routing:
    prefer_local: true
    fallback_strategy: graceful
    load_balance: true

  nodes:
    - name: laptop
      host: localhost
      port: 8000
      model: qwen2.5:3b
      tier: 0    # Fast / local

    - name: workstation
      host: 192.168.1.100
      port: 8000
      model: qwen2.5:14b
      tier: 1    # Balanced

    - name: server
      host: server.local
      port: 8000
      model: qwen2.5:72b
      tier: 2    # Powerful
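
To show how a config like the one above could drive routing, here is a hedged sketch that mirrors the three nodes as Python dictionaries and picks one per query. The word-count heuristic in `estimate_tier` and the tie-breaking rule are assumptions made for this example; harombe's actual complexity scoring and its handling of `prefer_local` are not described here.

```python
# The nodes from the YAML above, written out as plain dictionaries.
NODES = [
    {"name": "laptop", "host": "localhost", "port": 8000,
     "model": "qwen2.5:3b", "tier": 0},
    {"name": "workstation", "host": "192.168.1.100", "port": 8000,
     "model": "qwen2.5:14b", "tier": 1},
    {"name": "server", "host": "server.local", "port": 8000,
     "model": "qwen2.5:72b", "tier": 2},
]

def estimate_tier(query):
    """Hypothetical heuristic: longer queries are sent to higher tiers."""
    words = len(query.split())
    if words <= 8:
        return 0
    if words <= 40:
        return 1
    return 2

def pick_node(query, nodes, healthy):
    """Choose the healthy node whose tier is closest to the estimated one."""
    wanted = estimate_tier(query)
    candidates = [n for n in nodes if n["name"] in healthy]
    if not candidates:
        raise RuntimeError("no healthy nodes available")
    # Nearest tier wins; on a tie, fall back to the larger model.
    return min(candidates, key=lambda n: (abs(n["tier"] - wanted), -n["tier"]))

all_up = {"laptop", "workstation", "server"}
print(pick_node("what time is it", NODES, all_up)["name"])

# With the workstation down, a mid-sized query falls back to the server.
medium_query = " ".join(["word"] * 20)
print(pick_node(medium_query, NODES, {"laptop", "server"})["name"])
```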

Get Started

Four Commands, Under Five Minutes

01

Install

pip install harombe

One command to get started

02

Initialize

harombe init

Detects hardware, writes config

03

Pull a model

ollama pull qwen2.5:7b

Download a model via Ollama

04

Start

harombe chat

Autonomous agent with tools

Example Interaction

$ harombe chat
Agent ready. Tools: shell, filesystem, web_search
How can I help?

> Find all Python files in src/ and count them

[tool:shell] find src/ -name "*.py" | wc -l
       47

Found 47 Python files in src/.

> Which ones have the most imports?

[tool:shell] for f in $(find src/ -name "*.py"); do ...
[tool:read_file] src/harombe/security/gateway.py

The top 3 files by import count:
1. src/harombe/security/gateway.py — 23 imports
2. src/harombe/agent/loop.py — 19 imports
3. src/harombe/coordination/cluster.py — 17 imports

Use Cases

Built for Real-World Applications

Audit a codebase without sending it to the cloud

Point harombe at a repo. It spins up isolated containers, runs analysis tools, and gives you a report. Nothing leaves your machine.

Research agent that remembers everything

Combine voice input, persistent memory, and web search. A follow-up asked next week picks up where today's question left off.

Three machines, one agent

Your laptop handles quick lookups. Your workstation runs 14B models for real analysis. Your server tackles the 72B tasks. One YAML file ties them together.

Slack bot with guardrails

Drop an agent into your team's Slack. It can execute tools, but every dangerous action goes through human approval first.

Architecture

Six-Layer Stack

Layer 6

Clients

Voice, CLI, REST API, Slack, Discord

  • Voice interface with push-to-talk and VAD
  • Interactive CLI with Rich formatting
  • REST API with SSE streaming
  • Channel integrations (Slack, Discord, WebSocket)
Layer 5

Privacy Router

Hybrid local/cloud AI with PII detection

  • Three modes: local-only, hybrid, cloud-assisted
  • PII detection and redaction before cloud calls
  • Context sanitization with sensitivity classification
  • Configurable per-agent privacy boundary
Layer 4

Agent & Memory

ReAct loop, tools, plugins, delegation

  • ReAct loop with multi-step planning and tool calling
  • Multi-agent delegation with named blueprints
  • Container-isolated plugin system with MCP scaffolds
  • SQLite conversations + ChromaDB vector search
Layer 3

Security

Defense-in-depth with container isolation

  • MCP Gateway — every tool call goes through it
  • Per-container Docker isolation with resource limits
  • Network egress filtering (iptables, DNS allowlists)
  • HITL gates with risk classification, audit logging, Vault secrets
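
A human-in-the-loop gate with risk classification and audit logging can be sketched as follows. The keyword sets, the three risk levels, and the `gated_run` helper are assumptions for illustration; a production classifier would need to be considerably more careful than token matching.

```python
import json
import shlex
import time

# Hypothetical keyword sets used to classify shell commands.
HIGH = {"rm", "sudo", "dd", "mkfs"}
MEDIUM = {"curl", "wget", "pip"}

def classify(command):
    try:
        tokens = set(shlex.split(command))
    except ValueError:
        return "high"  # unparseable input is treated as risky
    if tokens & HIGH:
        return "high"
    if tokens & MEDIUM:
        return "medium"
    return "low"

def gated_run(command, approve, execute, audit_log):
    """Run low-risk commands directly; ask a human before anything else."""
    risk = classify(command)
    approved = risk == "low" or approve(command, risk)
    audit_log.append(json.dumps({
        "ts": time.time(), "command": command,
        "risk": risk, "approved": approved,
    }))
    return execute(command) if approved else "denied"

log = []
deny_all = lambda command, risk: False   # stand-in for a human reviewer
fake_exec = lambda command: "ran"        # stand-in for real execution
print(gated_run("ls -la", deny_all, fake_exec, log))
print(gated_run("sudo rm -rf /tmp/x", deny_all, fake_exec, log))
```

Every decision is logged, including denials, so the audit trail records what was attempted as well as what ran.
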
Layer 2

Orchestration

Cluster config, mDNS discovery, circuit breakers

  • YAML-based cluster configuration
  • Smart routing by query complexity
  • mDNS auto-discovery on local network
  • Circuit breakers, health monitoring, metrics
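
For readers unfamiliar with the pattern, a circuit breaker stops sending requests to a node after repeated failures and tries again after a cool-down. This is a generic sketch of that pattern, not harombe's code; the class name, thresholds, and injectable clock are assumptions.

```python
import time

class CircuitBreaker:
    """Open after `max_failures` consecutive failures; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let requests through again once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

# A fake clock makes the behaviour easy to follow without sleeping.
now = [0.0]
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
print(breaker.allow())   # circuit is open
now[0] = 10.0
print(breaker.allow())   # cool-down elapsed, a trial is allowed
```
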
Layer 1

Runtimes

llama.cpp, Whisper, TTS, embeddings

  • Ollama for model management (any hardware)
  • Whisper STT (tiny to large-v3)
  • Piper/Coqui TTS for speech output
  • sentence-transformers for local embeddings

Open Source

harombe is free and open source software, licensed under Apache 2.0. Fork it, modify it, deploy it commercially, subject only to the license's notice and attribution terms. Contributions are welcome.

$ pip install harombe && harombe init && harombe chat

Three commands to your first agent, plus a model pull via Ollama as shown in Get Started.

Scale to a cluster when you're ready.

Open source · Apache 2.0 · Self-hosted · Privacy-first