Architecture

OpenFleet is organized as a modular Node.js application with clear component boundaries. Each module has a single responsibility and communicates through well-defined interfaces.

High-Level Flow

┌──────────┐     ┌──────────────┐     ┌────────────────────┐
│  cli.mjs │────▶│  monitor.mjs │────▶│  ve-orchestrator   │
│  (entry) │     │  (supervisor)│     │  (task runner)     │
└──────────┘     └──────┬───────┘     └─────────┬──────────┘
                        │                       │
                   ┌────┴─────┐           ┌─────┴──────┐
                   │ telegram │           │ ve-kanban  │
                   │ bot.mjs  │           │ .mjs       │
                   └──────────┘           └────────────┘

cli.mjs loads config and routes to the appropriate handler (setup, doctor, daemon, or main start)
monitor.mjs is the supervisor loop — orchestration, smart PR flow, maintenance, fleet sync
ve-orchestrator.mjs handles native task execution with parallel slots, retries, and merge checks
ve-kanban.mjs wraps Vibe-Kanban (VK) CLI operations (list, submit, rebase, archive)
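
The routing step in cli.mjs can be pictured as a small dispatch table. This is a minimal sketch rather than the actual implementation: the `handlers` object, the `routeCommand` function, and the convention that the first positional argument selects the command are all assumptions. Only the four targets (setup, doctor, daemon, main start) come from the description above.

```javascript
// Hypothetical handlers; the real ones would load config and do actual work.
const handlers = {
  setup: () => 'setup',
  doctor: () => 'doctor',
  daemon: () => 'daemon',
  start: () => 'start',
};

// Pick a handler from argv. Falling back to the main start path for a
// missing or unknown command is an assumption of this sketch.
function routeCommand(argv) {
  const [command] = argv;
  const name = Object.hasOwn(handlers, command) ? command : 'start';
  return handlers[name]();
}

routeCommand(process.argv.slice(2));
```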

Component Map

Component	File	Role
CLI Entry	`cli.mjs`	Command routing, config loading, daemon management
Supervisor	`monitor.mjs`	Main loop, smart PR flow, maintenance scheduling
Orchestrator	`ve-orchestrator.mjs`	Task execution, parallel slots, retry logic
VK Wrapper	`ve-kanban.mjs`	Task board CRUD, attempt lifecycle
Telegram Bot	`telegram-bot.mjs`	Polling, batching, live digest, command handling
Mini App Server	`ui-server.mjs`	HTTP/WS server for Telegram Mini App
Config	`config.mjs`	Unified config loader (CLI + env + .env + JSON + defaults)
Fleet Coordinator	`fleet-coordinator.mjs`	Multi-workstation coordination
Shared State	`shared-state-manager.mjs`	Distributed task claims, heartbeats, conflict resolution
Task Claims	`task-claims.mjs`	Local + shared claim persistence
Sync Engine	`sync-engine.mjs`	Bidirectional kanban sync with shared state
Autofix	`autofix.mjs`	Error pattern detection, guarded auto-fix execution
Agent Pool	`agent-pool.mjs`	Executor management, weighted selection, failover
Codex Shell	`codex-shell.mjs`	Persistent Codex SDK sessions
Copilot Shell	`copilot-shell.mjs`	Persistent Copilot SDK sessions
Claude Shell	`claude-shell.mjs`	Persistent Claude SDK sessions
Container Runner	`container-runner.mjs`	Docker/Podman/Apple Container isolation
Sentinel	`telegram-sentinel.mjs`	Independent watchdog companion
WhatsApp	`whatsapp-channel.mjs`	Optional WhatsApp notification channel
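
The Config row lists five sources (CLI, env, .env, JSON, defaults). One plausible way to unify them is a layered merge, sketched below. The precedence order is an assumption read off the order in the table (CLI highest, defaults lowest), and `resolveConfig`, `executorMode`, and `maxParallel` are hypothetical names used for illustration.

```javascript
// Merge config layers so that sources listed first in the table win.
// The precedence order is an assumption, not confirmed behavior.
function resolveConfig({ cli = {}, env = {}, dotenv = {}, json = {}, defaults = {} }) {
  const layers = [defaults, json, dotenv, env, cli]; // lowest to highest priority
  const merged = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) merged[key] = value; // undefined never masks a lower layer
    }
  }
  return merged;
}

const config = resolveConfig({
  defaults: { executorMode: 'internal', maxParallel: 2 },
  json: { maxParallel: 4 },
  env: { executorMode: 'hybrid' },
});
// config is { executorMode: 'hybrid', maxParallel: 4 }
```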

Execution Modes

Internal Mode (`EXECUTOR_MODE=internal`)

Tasks run through the internal agent pool inside the monitor process. The agent pool manages executor selection, weighted distribution, and failover.

Monitor → Agent Pool → [Copilot Shell | Codex Shell | Claude Shell]
                            ↓
                     Task Execution
                            ↓
                     Smart PR Flow → CI Check → Merge
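
Weighted selection with failover, as attributed to the agent pool above, could look roughly like this. It is a sketch under stated assumptions: the executor descriptors, their weights, the `healthy` flag, and the function names `pickExecutor` and `runWithFailover` are illustrative and not taken from agent-pool.mjs.

```javascript
// Hypothetical executor descriptors; weights and health flags are illustrative.
const executors = [
  { name: 'copilot', weight: 5, healthy: true },
  { name: 'codex', weight: 3, healthy: true },
  { name: 'claude', weight: 2, healthy: true },
];

// Weighted random pick among healthy executors, skipping any in `exclude`.
// `random` is injectable so the choice can be made deterministic in tests.
function pickExecutor(pool, exclude = new Set(), random = Math.random) {
  const candidates = pool.filter((e) => e.healthy && e.weight > 0 && !exclude.has(e.name));
  const total = candidates.reduce((sum, e) => sum + e.weight, 0);
  if (total === 0) return null;
  let roll = random() * total;
  for (const executor of candidates) {
    roll -= executor.weight;
    if (roll < 0) return executor;
  }
  return candidates[candidates.length - 1]; // guard against floating-point edge cases
}

// Failover: keep picking executors until one succeeds or none are left.
async function runWithFailover(pool, task, run, random = Math.random) {
  const tried = new Set();
  let executor;
  while ((executor = pickExecutor(pool, tried, random)) !== null) {
    try {
      return { executor: executor.name, result: await run(executor, task) };
    } catch {
      tried.add(executor.name); // exclude the failed executor and pick again
    }
  }
  throw new Error('all executors failed');
}
```

Excluding a failed executor and re-rolling keeps the remaining weights proportional, so the fallback choice is still weighted rather than fixed.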

VK Mode (`EXECUTOR_MODE=vk`)

Task execution is delegated to the Vibe-Kanban orchestrator scripts. The monitor handles supervision and PR lifecycle.

Hybrid Mode (`EXECUTOR_MODE=hybrid`)

Combines internal and VK modes. Internal handles primary execution; VK picks up overflow or specific task types.
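
The three modes can be summarized as a single routing decision. The sketch below is illustrative only: `chooseExecutionPath`, `freeInternalSlots`, and `vkTaskTypes` are hypothetical names, and treating "overflow" as "no free internal slot" is an assumption about how hybrid mode decides.

```javascript
// Decide which path runs a task for a given EXECUTOR_MODE value.
// The overflow rule and the `vkTaskTypes` set are assumptions of this sketch.
function chooseExecutionPath(mode, task, { freeInternalSlots = 0, vkTaskTypes = new Set() } = {}) {
  switch (mode) {
    case 'internal':
      return 'internal';
    case 'vk':
      return 'vk';
    case 'hybrid':
      if (vkTaskTypes.has(task.type)) return 'vk'; // task types reserved for VK
      return freeInternalSlots > 0 ? 'internal' : 'vk'; // overflow goes to VK
    default:
      throw new Error(`Unknown EXECUTOR_MODE: ${mode}`);
  }
}

const path = chooseExecutionPath(process.env.EXECUTOR_MODE ?? 'internal', { type: 'feature' }, {
  freeInternalSlots: 1,
});
```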

Shared State Model

Distributed task coordination across agents and workstations uses a shared state system:

Owner heartbeat — Each claim has ownerId (workstation+agent) and ownerHeartbeat timestamp, renewed periodically to prove liveness
Attempt tokens — Unique UUID per attempt for idempotent operations
Retry/ignore flags — retryCount tracks attempts, ignoreReason marks tasks agents should skip
Conflict resolution — Active heartbeat wins over stale claims (first-come-first-served if both active)
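
To make the conflict rule concrete, here is a minimal sketch of a claim record and the "active heartbeat wins" decision. The field names come from the list above; the five-minute staleness threshold, the epoch-millisecond timestamps, the `ws-a/codex` owner format, and the function names are assumptions.

```javascript
// Assumed staleness threshold; the text above does not specify one.
const STALE_AFTER_MS = 5 * 60 * 1000;

// A claim is stale once its heartbeat is older than the threshold.
function isStale(claim, now) {
  return now - claim.ownerHeartbeat > STALE_AFTER_MS;
}

// Decide which of two competing claims keeps the task.
function resolveClaimConflict(existing, incoming, now) {
  if (isStale(existing, now)) return incoming; // active heartbeat beats a stale claim
  return existing; // both active: first come, first served
}

const now = Date.now();
const existing = {
  ownerId: 'ws-a/codex', // workstation + agent (format is hypothetical)
  ownerHeartbeat: now - 10 * 60 * 1000, // last renewed ten minutes ago
  attemptToken: 'token-a', // a UUID in practice
  retryCount: 1,
  ignoreReason: null,
};
const incoming = {
  ...existing,
  ownerId: 'ws-b/claude',
  ownerHeartbeat: now,
  attemptToken: 'token-b',
  retryCount: 0,
};

const winner = resolveClaimConflict(existing, incoming, now);
// winner.ownerId is 'ws-b/claude' because the existing heartbeat is stale
```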

Claim Lifecycle

1. claimTaskInSharedState(taskId, ownerId, attemptToken)
2. [work...] renewSharedStateHeartbeat() periodically
3. releaseSharedState(taskId, attemptToken, 'complete'|'failed'|'abandoned')
4. sweepStaleSharedStates() — background cleanup
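
The four steps above can be exercised end to end with an in-memory stand-in for the shared state file. This is a sketch, not the code in shared-state-manager.mjs: the function names follow the lifecycle, but the `createSharedState` factory, the extra arguments to `renewSharedStateHeartbeat`, the return values, and the staleness threshold are assumptions.

```javascript
// In-memory stand-in for the shared state store. Argument lists beyond
// those shown in the lifecycle above are assumptions.
function createSharedState({ staleAfterMs = 300_000, now = Date.now } = {}) {
  const claims = new Map(); // taskId -> claim record
  const isStale = (claim) => now() - claim.ownerHeartbeat > staleAfterMs;

  return {
    claimTaskInSharedState(taskId, ownerId, attemptToken) {
      const current = claims.get(taskId);
      if (current && !isStale(current) && current.attemptToken !== attemptToken) {
        return false; // another attempt holds a live claim
      }
      claims.set(taskId, { taskId, ownerId, attemptToken, ownerHeartbeat: now() });
      return true; // retrying with the same token succeeds again (idempotent)
    },
    renewSharedStateHeartbeat(taskId, attemptToken) {
      const current = claims.get(taskId);
      if (!current || current.attemptToken !== attemptToken) return false;
      current.ownerHeartbeat = now();
      return true;
    },
    releaseSharedState(taskId, attemptToken, outcome) {
      const current = claims.get(taskId);
      if (!current || current.attemptToken !== attemptToken) return null;
      claims.delete(taskId);
      return { ...current, outcome }; // 'complete' | 'failed' | 'abandoned'
    },
    sweepStaleSharedStates() {
      const reclaimed = [];
      for (const [taskId, claim] of claims) {
        if (isStale(claim)) {
          claims.delete(taskId);
          reclaimed.push(taskId);
        }
      }
      return reclaimed; // task IDs that are free to be claimed again
    },
  };
}
```

Checking the attempt token on renew and release means a worker that lost its claim to a sweep cannot overwrite the state of whoever claimed the task next.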

Smart PR Flow

The smartPRFlow in monitor.mjs handles the full PR lifecycle:

Branch creation — from configured target branch
PR creation — with conventional commit title and structured body
CI monitoring — polls check status
Auto-rebase — on merge conflicts with target
Merge decision — merge when all checks pass
Cleanup — archive task, prune worktree
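
The CI monitoring, auto-rebase, and merge steps amount to a decision taken on each poll. The sketch below shows one way to express it. The snapshot shape (`checks`, `hasConflicts`), the status strings, and the action names are hypothetical, and both the `'stop'` action for failed checks and waiting on an empty check list are assumptions, since the text only says to merge when all checks pass.

```javascript
// Map the current PR snapshot to the next action.
// The snapshot shape and action names are hypothetical.
function nextPrAction({ checks, hasConflicts }) {
  if (hasConflicts) return 'rebase'; // auto-rebase on conflicts with target
  if (checks.some((c) => c.status === 'failed')) return 'stop'; // assumed: hand off to error recovery
  if (checks.length === 0) return 'wait'; // assumed: no checks reported yet
  if (checks.every((c) => c.status === 'passed')) return 'merge';
  return 'wait'; // something is still pending, keep polling
}

const action = nextPrAction({
  hasConflicts: false,
  checks: [
    { name: 'build', status: 'passed' },
    { name: 'test', status: 'pending' },
  ],
});
// action is 'wait'
```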

Error Recovery

OpenFleet uses multiple layers of error recovery:

Autofix — Pattern matching on error signatures with guarded fix execution
Circuit breakers — Prevent infinite retry loops by tracking consecutive failures
Stale sweeps — Background process that reclaims abandoned tasks
Daemon restart policy — Configurable crash tracking with cooldown periods
Sentinel watchdog — Independent process that can restart the monitor on crash
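
A circuit breaker that tracks consecutive failures, as described in the list above, can be sketched in a few lines. The class name, the threshold and cooldown values, and the single trial attempt after cooldown are assumptions for illustration, not OpenFleet defaults.

```javascript
// Minimal consecutive-failure circuit breaker. Threshold and cooldown
// values are illustrative only.
class CircuitBreaker {
  constructor({ maxConsecutiveFailures = 3, cooldownMs = 60_000, now = Date.now } = {}) {
    this.maxConsecutiveFailures = maxConsecutiveFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // timestamp when retries were suspended, or null
  }

  // True when a retry is allowed right now.
  canAttempt() {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // cooldown elapsed: allow one trial attempt
      this.failures = this.maxConsecutiveFailures - 1; // one more failure re-opens
      return true;
    }
    return false;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.maxConsecutiveFailures) this.openedAt = this.now();
  }
}
```

A caller would check `canAttempt()` before each retry and report the outcome with `recordSuccess()` or `recordFailure()`, so a task that keeps failing is paused instead of looping.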

Data Persistence

State is persisted in .cache/openfleet/:

File	Contents
`shared-task-states.json`	Distributed task claims and heartbeat state
`fleet-state.json`	Multi-workstation fleet coordination data
`task-store.json`	Internal kanban task database
`workspaces.json`	Workspace and worktree registry
`sessions/`	Agent session state and history
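
Reading and writing one of these JSON files might look like the sketch below. The directory and file names come from the table above; everything else is an assumption, including the helper names and the write-to-temp-then-rename technique, which is a common way to avoid half-written files and is not confirmed to be what OpenFleet does.

```javascript
import { mkdirSync, readFileSync, renameSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

// State directory from the text above, resolved against a caller-supplied root.
function stateDir(root) {
  return join(root, '.cache', 'openfleet');
}

// Read a JSON state file, returning `fallback` if it does not exist yet.
function readState(root, file, fallback) {
  try {
    return JSON.parse(readFileSync(join(stateDir(root), file), 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT') return fallback;
    throw err;
  }
}

// Write via a temp file plus rename so readers never see a partial file.
// This technique is an assumption of the sketch.
function writeState(root, file, data) {
  const dir = stateDir(root);
  mkdirSync(dir, { recursive: true });
  const target = join(dir, file);
  const temp = `${target}.${process.pid}.tmp`;
  writeFileSync(temp, JSON.stringify(data, null, 2));
  renameSync(temp, target);
}
```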