This skill turns multi-agent collaboration into a durable coordination board instead of a chat-memory workflow.
- shared-board for lower token cost and easier human handoff; centralized-dispatch for higher-throughput ranking and automation.
- auto mode selection using optimize_for, agent_count, machine_count, window_count, and shared_storage.
- while-loop execution with one-time workspace and branch confirmation reused through --while-session.
- agent_id, plus host_id and window_id tracking for multi-machine and multi-window teams.
- coordination.db, exported state.json, and human-readable STATUS.md / AGENTS.md / SESSIONS.md.
- NOTIFICATIONS.md, so direct CLI users can see actionable board signals without running the dispatch service.
- owned_paths inference from git changes, entry files, and workspace scan candidates.
- project-plan/handoffs/ for cross-window takeover and review handoff.
- coordination_dispatch_server.py and coordination_dispatch_client.py, now running coordination commands in-process instead of shelling out to a new Python subprocess per request.
- next-action, claim-next, and claim-review-next, so centralized mode can use JSON scheduling calls without replacing shared-board CLI flows.
- a registration contract with agent_id, session_id, and expected_session_epoch: inspect SESSIONS.md lease state as JSON, renew leases safely, reap stale idle sessions, and reclaim stale work without going back through raw CLI text parsing.
- notification targets workers, reviewers, workers:<specialty>, reviewers:<specialty>, or agent:<id>, and bounded coordination.db history over long runs.
- the pruning policy projected into state.json, STATUS.md, and doctor output, so later agents can see the current pruning rules without inspecting server startup flags.
- update_coordination_status.py events and update_coordination_status.py prune-events, so event-backed workflows do not require centralized-dispatch.
- events recorded for preflight, intake scan, validate / validate-repair, and normal board mutations, so the persisted timeline is useful even without the dispatch service.
- notifications now also supports local subscription-style views by target, specialty, kind, priority, and agent:<id>, so separate windows can poll just the reviewer or worker slice they care about.
- notifications and events can now watch with --watch-seconds, so a local window can long-poll for the next matching signal without needing centralized-dispatch.
- next-action, next-claimable, and next-reviewable now also support --watch-seconds, so idle windows can wait for the next actionable work item without busy looping.
- session activity in state.json, coordination.db, and SESSIONS.md, so host/window/agent activity is visible as a first-class coordination surface.
- HANDOFFS.md and handoff-index.json, so shared-board users can list, claim, acknowledge, and watch the next ready handoff without the dispatch service.
- event archives under project-plan/archive/events/*.jsonl, so retention stays bounded without losing audit history.
- /events/stream and /notifications/stream, plus read-only /event-archives and /mode-advisor.
- revision and --expected-revision, so stale writers can fail fast instead of overwriting newer board state.
- session_epoch, so a host can reject stale windows after idle-session recovery instead of letting an old binding keep writing.
- lease_token, so a reclaimed or re-claimed module can reject stale heartbeats and stale completion/release attempts instead of silently accepting an older holder.
- claim, claim-next, start-review, and claim-review-next now auto-reclaim expired leases inside the same state mutation, so stale work can be taken over without a separate reclaim pass.
- revision tracking behind the scenes; read-only summary, state, and other load paths now keep optimistic-concurrency counters stable until a real state mutation occurs.
- candidate_* dependency and ownership hints kept separate from confirmed depends_on / owned_paths, so rough scan suggestions stop over-serializing the board by default.
- safe, balanced, and throughput safety profiles, with the chosen policy projected into STATUS.md, state.json, and preflight.json.
- validate --repair now checks and repairs not only markdown projections but also session_rows, handoff_rows, and cursor_rows, so stale derived bindings, orphan handoff entries, and broken cursor metadata can be normalized without manual SQLite edits.
- doctor now reports repository-level counts and health signals for sessions, handoffs, cursors, archive files, stale leases, and current mode advice, so operators can inspect board integrity from one command before deciding whether to repair.

Bootstrap the project-plan/ directory:
python scripts/bootstrap_coordination.py --project-root "<project-root>"
python scripts/bootstrap_coordination.py --project-root "<project-root>" --emit-mcp-config
When --emit-mcp-config is used, the skill writes project-plan/mcp-host-config.json with a ready-to-import stdio MCP server entry for the current project root.
Hosts should then register a worker or reviewer window once and cache the canonical registration.binding fields (agent_id, session_id, expected_session_epoch) before making later dispatch or lease calls.
python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for token --agent-count 2 --window-count 2
python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for token --safety-profile safe
For cross-machine work that should optimize for throughput:
python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for efficiency --agent-count 6 --machine-count 2 --window-count 4 --cross-machine
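The auto mode decision weighs the same preflight inputs shown above. The heuristic below is an illustrative sketch of how such a selection could work, not the skill's actual logic; the function name and thresholds are assumptions:

```python
def choose_mode(optimize_for, agent_count=1, machine_count=1,
                window_count=1, shared_storage=True):
    """Illustrative auto-mode heuristic over the preflight inputs."""
    # Without shared storage, windows cannot share a board file directly.
    if not shared_storage:
        return "centralized-dispatch"
    # Cross-machine or throughput-optimized teams favor the dispatch service.
    if machine_count > 1 or optimize_for == "efficiency":
        return "centralized-dispatch"
    # Small, token-sensitive local runs stay on the shared board.
    return "shared-board"
```

For example, a two-agent, two-window local run optimizing for token cost would land on shared-board, while the six-agent cross-machine setup above would land on centralized-dispatch.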
python scripts/update_coordination_status.py init-agent --project-root "<project-root>" --thread-key "worker-a" --role worker --while-session
python scripts/scan_project_intake.py --project-root "<project-root>" --thread-key "worker-a" --seed-tasks
python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend"
python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend" --watch-seconds 5
python scripts/update_coordination_status.py claim-next --project-root "<project-root>" --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty "backend" --while-session
python scripts/update_coordination_status.py claim-review-next --project-root "<project-root>" --agent-id "<reviewer-id>" --while-session
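--watch-seconds turns these reads into bounded long-polls rather than busy loops. The loop below sketches the client-side shape under the assumption that each poll returns either a work item or None; poll_fn is a hypothetical stand-in for one next-action or next-reviewable call:

```python
import time

def watch(poll_fn, watch_seconds, interval=0.5):
    """Poll until a matching item appears or the watch window expires."""
    deadline = time.monotonic() + watch_seconds
    while True:
        item = poll_fn()
        if item is not None:
            return item          # first actionable signal wins
        if time.monotonic() >= deadline:
            return None          # expired: caller decides whether to re-watch
        time.sleep(interval)
```

An idle window can call this in a loop, re-watching on None, and still bound how long each wait holds resources.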
python scripts/update_coordination_status.py handoff-bundle --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"
The bundle is written to:
project-plan/handoffs/<module>.handoff.md
Start the proxy:
python scripts/coordination_dispatch_server.py --project-root "<project-root>" --host 127.0.0.1 --port 8765
Call it through the client:
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" preflight --confirm-workspace --confirm-branch --session-kind while-loop --mode centralized-dispatch --optimize-for efficiency --agent-count 6 --machine-count 2 --window-count 4 --cross-machine
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" update summary
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" validate --repair
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" state
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" state format=summary
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" events since=0 limit=50
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" events-stream since=0 limit=20 timeout=5
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" notifications since=0 limit=50 target=reviewers:backend kind=review-ready
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" notifications-stream since=0 limit=20 timeout=5 target=reviewers:backend kind=review-ready
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" mode-advisor
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" prune-events --max-rows 2000
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" event-archives
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" sessions
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" register-session --thread-key "worker-a" --role worker --host-id machine-a --window-id window-1
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-heartbeat --agent-id "<agent-id>" --session-id "<session-id>" --expected-session-epoch 1 --expected-revision 12 --lease-token "<lease-token>" --module "core-auth" --ttl-seconds 600
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --session-id "<session-id>" --expected-session-epoch 1 --expected-revision 12 --specialty backend
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-heartbeat --agent-id "<agent-id>" --session-id "<session-id>" --expected-session-epoch 1 --expected-revision 13 --lease-token "<lease-token>" --module "core-auth" --ttl-seconds 600
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" reap-sessions --max-age-seconds 86400
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-reclaim
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch next-action --agent-id "<agent-id>" --specialty backend
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty backend --while-session
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --specialty backend --expected-revision 12
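The --expected-revision, expected-session-epoch, and lease-token arguments above act as compare-and-swap guards on board state. The in-memory model below is a hypothetical sketch of those checks (the Board class is illustrative; the conflict codes mirror the documented contract):

```python
class ConflictError(Exception):
    """Raised when a stale writer fails a compare-and-swap guard."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code


class Board:
    """Hypothetical in-memory stand-in for the coordination state."""

    def __init__(self):
        self.revision = 0
        self.session_epoch = 1
        self.lease_token = "tok-1"

    def mutate(self, *, expected_revision, expected_session_epoch, lease_token=None):
        # Stale writers fail fast instead of overwriting newer board state.
        if expected_revision != self.revision:
            raise ConflictError("state-revision-conflict")
        # Stale windows are rejected after idle-session recovery.
        if expected_session_epoch != self.session_epoch:
            raise ConflictError("session-epoch-conflict")
        # A reclaimed module rejects heartbeats from an older holder.
        if lease_token is not None and lease_token != self.lease_token:
            raise ConflictError("lease-token-conflict")
        self.revision += 1  # only real mutations bump the counter
        return self.revision
```

A caller that loses a race re-reads the board, picks up the new revision, and retries, which is why read-only paths keeping the counters stable matters.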
Structured session-bound writes should replay the full registration/lease contract, especially session_id, expected_session_epoch, expected_revision, and the latest lease_token where required.
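One convenient way to replay that contract is to cache registration.binding once and merge it into every session-bound payload. The helper below is a hypothetical client-side sketch, not part of the shipped client:

```python
class SessionBinding:
    """Caches the canonical binding fields and replays them on each call."""

    def __init__(self, registration):
        binding = registration["binding"]
        self.agent_id = binding["agent_id"]
        self.session_id = binding["session_id"]
        self.expected_session_epoch = binding["expected_session_epoch"]
        self.lease_token = None  # updated after claim / heartbeat responses

    def payload(self, **extra):
        body = {
            "agent_id": self.agent_id,
            "session_id": self.session_id,
            "expected_session_epoch": self.expected_session_epoch,
        }
        if self.lease_token is not None:
            body["lease_token"] = self.lease_token
        body.update(extra)
        return body
```

The host then sends binding.payload(module=..., expected_revision=...) on each structured call and refreshes lease_token from each claim or heartbeat response.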
This does not replace shared-board. In local or lower-overhead runs, direct CLI remains first-class:
python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend"
python scripts/update_coordination_status.py next-reviewable --project-root "<project-root>" --agent-id "<reviewer-id>" --specialty "backend" --watch-seconds 5
python scripts/update_coordination_status.py claim-next --project-root "<project-root>" --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty "backend" --while-session
python scripts/update_coordination_status.py notifications --project-root "<project-root>"
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --since 0 --limit 20
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --kind "review-ready" --cursor-id "review-feed"
python scripts/update_coordination_status.py ack-notifications --project-root "<project-root>" --cursor-id "review-feed"
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "agent:agent-1234abcd" --since 0 --limit 20
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --kind "review-ready" --since 0 --limit 20 --watch-seconds 5
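--cursor-id plus ack-notifications gives an at-least-once consumer: read everything past the cursor, process, then acknowledge. A minimal in-memory sketch of that cursor discipline (the feed shape and cursor store are illustrative assumptions):

```python
cursors = {}  # cursor-id -> last acknowledged sequence number

def read_since(feed, cursor):
    """Return (new_payloads, next_cursor) for everything past the cursor."""
    items = [(seq, payload) for seq, payload in feed if seq > cursor]
    next_cursor = items[-1][0] if items else cursor
    return [payload for _, payload in items], next_cursor

def consume(feed, cursor_id):
    payloads, pending = read_since(feed, cursors.get(cursor_id, 0))
    # ... process payloads here ...
    cursors[cursor_id] = pending  # the explicit ack step
    return payloads
```

If a window crashes between reading and acknowledging, the next read replays the same items, which is the safe failure mode for a review feed.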
python scripts/update_coordination_status.py events --project-root "<project-root>" --since 0 --limit 50 --notification-only --target "reviewers:backend"
python scripts/update_coordination_status.py events --project-root "<project-root>" --cursor-id "event-feed"
python scripts/update_coordination_status.py ack-events --project-root "<project-root>" --cursor-id "event-feed"
python scripts/update_coordination_status.py events --project-root "<project-root>" --source "validate" --since 0 --limit 20 --watch-seconds 5
python scripts/update_coordination_status.py handoffs --project-root "<project-root>"
python scripts/update_coordination_status.py next-handoff --project-root "<project-root>" --agent-id "<agent-id>" --watch-seconds 5
python scripts/update_coordination_status.py claim-handoff --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"
python scripts/update_coordination_status.py ack-handoff --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"
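The handoff inbox commands above follow a ready, claimed, acknowledged lifecycle. A hypothetical sketch of that state machine (field names and states are illustrative, not the stored schema):

```python
def claim_handoff(handoffs, module, agent_id):
    """Move a ready handoff to claimed; reject double-claims."""
    entry = handoffs[module]
    if entry["state"] != "ready":
        raise ValueError(f"handoff {module} is {entry['state']}, not ready")
    entry.update(state="claimed", claimed_by=agent_id)
    return entry

def ack_handoff(handoffs, module, agent_id):
    """Only the claiming agent may acknowledge the takeover."""
    entry = handoffs[module]
    if entry.get("claimed_by") != agent_id:
        raise ValueError("only the claiming agent can acknowledge")
    entry["state"] = "acknowledged"
    return entry
```

Keeping claim and ack as separate steps lets the producing window see that the takeover actually happened, not just that someone intended it.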
python scripts/update_coordination_status.py mode-advisor --project-root "<project-root>"
python scripts/update_coordination_status.py prune-events --project-root "<project-root>" --max-rows 2000
python scripts/update_coordination_status.py event-archives --project-root "<project-root>"
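prune-events keeps the live table bounded while preserving audit history in JSONL archives. The sketch below shows the archive-then-delete ordering under assumed paths and row shapes; it is a model of the idea, not the script's implementation:

```python
import json
from pathlib import Path

def prune_events(events, max_rows, archive_dir):
    """Archive the oldest rows to a JSONL file, then drop them from the live list."""
    if len(events) <= max_rows:
        return events, None
    overflow, kept = events[:-max_rows], events[-max_rows:]
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    path = archive_dir / f"events-{overflow[0]['seq']}-{overflow[-1]['seq']}.jsonl"
    # Write the archive before discarding rows, so a crash never loses history.
    with path.open("w") as fh:
        for row in overflow:
            fh.write(json.dumps(row) + "\n")
    return kept, path
```

Writing the archive before trimming is what lets retention stay bounded without losing the audit trail.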
In other words:
- shared-board: direct CLI is the normal path, with local notifications, cursor-aware event history, handoff inbox commands, pruning, archive inspection, and NOTIFICATIONS.md / HANDOFFS.md projections.
- centralized-dispatch: service endpoints, structured dispatch, persisted event polling, SSE streams, notification polling, archive inspection, and mode advice become the higher-efficiency path.

Run the stdio MCP bridge when another host wants structured coordination tools:
python scripts/coordination_mcp_server.py --project-root "<project-root>"
Or generate a host-importable config directly from bootstrap:
python scripts/bootstrap_coordination.py --project-root "<project-root>" --emit-mcp-config
That writes:
project-plan/mcp-host-config.json
The file contains a ready mcpServers entry pointing at scripts/coordination_mcp_server.py with the current project root and folder name.
It exposes:
- coordination_state
- coordination_events
- coordination_notifications
- coordination_handoffs
- coordination_sessions
- coordination_mode_advisor
- coordination_session_register
- coordination_session_reap
- coordination_lease_heartbeat
- coordination_lease_reclaim
- coordination_dispatch
- coordination_command

For hosts that keep a long-lived worker or reviewer window, prefer the structured MCP or dispatch path that carries registration.binding (agent_id, session_id, expected_session_epoch) and then the latest lease_token returned after a claim, review claim, or heartbeat. That gives the host an explicit session binding plus per-lease compare-and-swap style protection for renew/release/finish flows.
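That per-lease compare-and-swap protection can be pictured as a token that rotates on every reclaim, so a displaced holder's later calls fail. This is an illustrative model, not the shipped implementation; the field names are assumptions:

```python
def renew_lease(lease, token, ttl_seconds, now):
    """Renew only if the caller still holds the current token (CAS on lease_token)."""
    if lease["token"] != token:
        raise PermissionError("lease-token-conflict")
    lease["expires_at"] = now + ttl_seconds
    return lease

def reclaim_lease(lease, new_holder, now, ttl_seconds=600):
    """Expired leases rotate the token, invalidating the old holder's copy."""
    if now < lease["expires_at"]:
        raise PermissionError("lease-still-active")
    lease.update(holder=new_holder,
                 generation=lease["generation"] + 1,
                 token=f"tok-{lease['generation'] + 1}",
                 expires_at=now + ttl_seconds)
    return lease
```

The rotation is why a host must refresh its cached lease_token after every claim or heartbeat response rather than reusing the first one it saw.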
Host lifecycle rules:
- Capture registration.binding from register-session or coordination_session_register and replay those exact fields on later dispatch, lease-heartbeat, and session-bound status-mutation calls.
- registration.follow_up_fields is canonical for which fields a host must replay on structured follow-up calls; dispatch reuses agent_id, session_id, expected_session_epoch, while lease and status mutations also carry lease_token.
- Responses surface current_session_epoch, current_revision, and lease_token without changing the success envelope.
- Treat state-revision-conflict, session-binding-conflict, session-epoch-conflict, and lease-token-conflict as the canonical conflict set; HTTP maps them to 409, including /leases/reclaim, while malformed payloads still use invalid-arguments semantics.
- Use session-reap only for stale idle sessions, and lease-reclaim for expired module leases.

Run the full bundle check:
python scripts/verify_skill_bundle.py
The verifier now runs the split regression suites, both smoke paths, and the demo / scan / doctor validation chain.
Keep references/release-checklist.md as the canonical verification matrix and command list.
SKILL.md
references/coordination-files.md
references/example-walkthrough.md
references/example-snapshots.md
references/troubleshooting.md