OpenHarness delivers core lightweight agent infrastructure: tool-use, skills, memory, and multi-agent coordination.
Join the community: contribute Harness for open agent development.
One Command (oh) to Launch OpenHarness and Unlock All Agent Harnesses.
Supports CLI agent integration including OpenClaw, nanobot, Cursor, and more.
• Streaming Tool-Call Cycle • API Retry with Exponential Backoff • Parallel Tool Execution • Token Counting & Cost Tracking |
• 43 Tools (File, Shell, Search, Web, MCP) • On-Demand Skill Loading (.md) • Plugin Ecosystem (Skills + Hooks + Agents) • Compatible with anthropics/skills & plugins |
• CLAUDE.md Discovery & Injection • Context Compression (Auto-Compact) • MEMORY.md Persistent Memory • Session Resume & History |
• Multi-Level Permission Modes • Path-Level & Command Rules • PreToolUse / PostToolUse Hooks • Interactive Approval Dialogs |
• Subagent Spawning & Delegation • Team Registry & Task Management • Background Task Lifecycle • ClawTeam Integration (Roadmap) |
| Claude Code | OpenHarness | |
|---|---|---|
| Lines of Code | 512,664 | 11,733 (44x lighter) |
| Files | 1,884 | 163 |
| Language | TypeScript | Python |
| Tools | ~44 | 43 (98%) |
| Commands | ~88 | 54 (61%) |
| Skills Compatible | ✅ | ✅ anthropics/skills |
| Plugin Compatible | ✅ | ✅ claude-code/plugins |
| Tests | — | 114 unit + 6 E2E suites |
Leverages Python's power with pure focus on Harness architecture—stripped of enterprise overhead like telemetry, OAuth complexity, and hundreds of React components.
An Agent Harness is the complete infrastructure that wraps around an LLM to make it a functional agent. The model provides intelligence; the harness provides hands, eyes, memory, and safety boundaries.
OpenHarness is an open-source Python implementation designed for researchers, builders, and the community:
Start here: Quick Start · Provider Compatibility · Showcase · Contributing · Changelog
ANTHROPIC_API_KEY=your_key uv run oh -p "Inspect this repository and list the top 3 refactors"
# Clone and install
git clone https://github.com/HKUDS/OpenHarness.git
cd OpenHarness
uv sync --extra dev
# Example: use Kimi as the backend
export ANTHROPIC_BASE_URL=https://api.moonshot.cn/anthropic
export ANTHROPIC_API_KEY=your_kimi_api_key
export ANTHROPIC_MODEL=kimi-k2.5
# Launch
oh # if venv is activated
uv run oh # without activating venv
# Single prompt → stdout
oh -p "Explain this codebase"
# JSON output for programmatic use
oh -p "List all functions in main.py" --output-format json
# Stream JSON events in real-time
oh -p "Fix the bug" --output-format stream-json
OpenHarness currently detects and adapts to a small set of provider profiles in code. The table below is intentionally conservative and reflects the profiles implemented in src/openharness/api/provider.py.
| Provider profile | Detection signal | Auth kind | Voice mode | Notes |
|---|---|---|---|---|
| Anthropic | Default when no custom ANTHROPIC_BASE_URL is set | API key | Not wired in current build | Default Claude-oriented setup |
| Moonshot / Kimi | ANTHROPIC_BASE_URL contains moonshot or model starts with kimi | API key | Not wired in current build | Works through an Anthropic-compatible endpoint |
| Vertex-compatible | Base URL contains vertex or aiplatform | GCP | Not wired in current build | Good fit for Anthropic-style gateways on Vertex |
| Bedrock-compatible | Base URL contains bedrock | AWS | Not wired in current build | Intended for Bedrock-style deployments |
| Generic Anthropic-compatible | Any other explicit ANTHROPIC_BASE_URL | API key | Not wired in current build | Useful for proxies and internal gateways |
If you are evaluating cross-provider workflows or want a concrete demo path, start with Anthropic or the Kimi example above, then compare behavior against your own compatible endpoint.
OpenHarness implements the core Agent Harness pattern with 10 subsystems:
openharness/ engine/ # 🧠 Agent Loop — query → stream → tool-call → loop tools/ # 🔧 43 Tools — file I/O, shell, search, web, MCP skills/ # 📚 Knowledge — on-demand skill loading (.md files) plugins/ # 🔌 Extensions — commands, hooks, agents, MCP servers permissions/ # 🛡️ Safety — multi-level modes, path rules, command deny hooks/ # ⚡ Lifecycle — PreToolUse/PostToolUse event hooks commands/ # 💬 54 Commands — /help, /commit, /plan, /resume, ... mcp/ # 🌐 MCP — Model Context Protocol client memory/ # 🧠 Memory — persistent cross-session knowledge tasks/ # 📋 Tasks — background task management coordinator/ # 🤝 Multi-Agent — subagent spawning, team coordination prompts/ # 📝 Context — system prompt assembly, CLAUDE.md, skills config/ # ⚙️ Settings — multi-layer config, migrations ui/ # 🖥️ React TUI — backend protocol + frontend
The heart of the harness. One loop, endlessly composable:
while True:
response = await api.stream(messages, tools)
if response.stop_reason != "tool_use":
break # Model is done
for tool_call in response.tool_uses:
# Permission check → Hook → Execute → Hook → Result
result = await harness.execute_tool(tool_call)
messages.append(tool_results)
# Loop continues — model sees results, decides next action
The model decides what to do. The harness handles how — safely, efficiently, with full observability.
| Category | Tools | Description |
|---|---|---|
| File I/O | Bash, Read, Write, Edit, Glob, Grep | Core file operations with permission checks |
| Search | WebFetch, WebSearch, ToolSearch, LSP | Web and code search capabilities |
| Notebook | NotebookEdit | Jupyter notebook cell editing |
| Agent | Agent, SendMessage, TeamCreate/Delete | Subagent spawning and coordination |
| Task | TaskCreate/Get/List/Update/Stop/Output | Background task management |
| MCP | MCPTool, ListMcpResources, ReadMcpResource | Model Context Protocol integration |
| Mode | EnterPlanMode, ExitPlanMode, Worktree | Workflow mode switching |
| Schedule | CronCreate/List/Delete, RemoteTrigger | Scheduled and remote execution |
| Meta | Skill, Config, Brief, Sleep, AskUser | Knowledge loading, configuration, interaction |
Every tool has:
Skills are on-demand knowledge — loaded only when the model needs them:
Available Skills: - commit: Create clean, well-structured git commits - review: Review code for bugs, security issues, and quality - debug: Diagnose and fix bugs systematically - plan: Design an implementation plan before coding - test: Write and run tests for code - simplify: Refactor code to be simpler and more maintainable - pdf: PDF processing with pypdf (from anthropics/skills) - xlsx: Excel operations (from anthropics/skills) - ... 40+ more
Compatible with anthropics/skills — just copy .md files to ~/.openharness/skills/.
Compatible with claude-code plugins. Tested with 12 official plugins:
| Plugin | Type | What it does |
|---|---|---|
commit-commands | Commands | Git commit, push, PR workflows |
security-guidance | Hooks | Security warnings on file edits |
hookify | Commands + Agents | Create custom behavior hooks |
feature-dev | Commands | Feature development workflow |
code-review | Agents | Multi-agent PR review |
pr-review-toolkit | Agents | Specialized PR review agents |
# Manage plugins
oh plugin list
oh plugin install <source>
oh plugin enable <name>
OpenHarness is useful as a lightweight harness layer around Claude-style tooling conventions:
For concrete usage ideas instead of generic claims, see docs/SHOWCASE.md.
Multi-level safety with fine-grained control:
| Mode | Behavior | Use Case |
|---|---|---|
| Default | Ask before write/execute | Daily development |
| Auto | Allow everything | Sandboxed environments |
| Plan Mode | Block all writes | Large refactors, review first |
Path-level rules in settings.json:
{
"permission": {
"mode": "default",
"path_rules": [{"pattern": "/etc/*", "allow": false}],
"denied_commands": ["rm -rf /", "DROP TABLE *"]
}
}
React/Ink TUI with full interactive experience:
/ → arrow keys to select → Enter/permissions → select from list/resume → pick from historyoh [OPTIONS] COMMAND [ARGS] Session: -c/--continue, -r/--resume, -n/--name Model: -m/--model, --effort, --max-turns Output: -p/--print, --output-format text|json|stream-json Permissions: --permission-mode, --dangerously-skip-permissions Context: -s/--system-prompt, --append-system-prompt, --settings Advanced: -d/--debug, --mcp-config, --bare Subcommands: oh mcp | oh plugin | oh auth
| Suite | Tests | Status |
|---|---|---|
| Unit + Integration | 114 | ✅ All passing |
| CLI Flags E2E | 6 | ✅ Real model calls |
| Harness Features E2E | 9 | ✅ Retry, skills, parallel, permissions |
| React TUI E2E | 3 | ✅ Welcome, conversation, status |
| TUI Interactions E2E | 4 | ✅ Commands, permissions, shortcuts |
| Real Skills + Plugins | 12 | ✅ anthropics/skills + claude-code/plugins |
# Run all tests
uv run pytest -q # 114 unit/integration
python scripts/test_harness_features.py # Harness E2E
python scripts/test_real_skills_plugins.py # Real plugins E2E
from pydantic import BaseModel, Field
from openharness.tools.base import BaseTool, ToolExecutionContext, ToolResult
class MyToolInput(BaseModel):
query: str = Field(description="Search query")
class MyTool(BaseTool):
name = "my_tool"
description = "Does something useful"
input_model = MyToolInput
async def execute(self, arguments: MyToolInput, context: ToolExecutionContext) -> ToolResult:
return ToolResult(output=f"Result for: {arguments.query}")
Create ~/.openharness/skills/my-skill.md:
---
name: my-skill
description: Expert guidance for my specific domain
---
# My Skill
## When to use
Use when the user asks about [your domain].
## Workflow
1. Step one
2. Step two
...
Create .openharness/plugins/my-plugin/.claude-plugin/plugin.json:
{
"name": "my-plugin",
"version": "1.0.0",
"description": "My custom plugin"
}
Add commands in commands/*.md, hooks in hooks/hooks.json, agents in agents/*.md.
OpenHarness is most useful when treated as a small, inspectable harness you can adapt to a real workflow:
json and stream-json output in automation flows.See docs/SHOWCASE.md for short, reproducible examples.
OpenHarness is a community-driven research project. We welcome contributions in:
| Area | Examples |
|---|---|
| Tools | New tool implementations for specific domains |
| Skills | Domain knowledge .md files (finance, science, DevOps...) |
| Plugins | Workflow plugins with commands, hooks, agents |
| Providers | Support for more LLM backends (OpenAI, Ollama, etc.) |
| Multi-Agent | Coordination protocols, team patterns |
| Testing | E2E scenarios, edge cases, benchmarks |
| Documentation | Architecture guides, tutorials, translations |
# Development setup
git clone https://github.com/HKUDS/OpenHarness.git
cd OpenHarness
uv sync --extra dev
uv run pytest -q # Verify everything works
Useful contributor entry points:
CONTRIBUTING.md for setup, checks, and PR expectationsCHANGELOG.md for user-visible changesdocs/SHOWCASE.md for real-world usage patterns worth documentingMIT — see LICENSE.
Oh my Harness!
The model is the agent. The code is the harness.
Thanks for visiting ✨ OpenHarness!