Architecture
OpenFleet is organized as a modular Node.js application with clear component boundaries. Each module has a single responsibility and communicates through well-defined interfaces.
High-Level Flow
┌──────────┐      ┌──────────────┐      ┌────────────────────┐
│ cli.mjs  │─────▶│ monitor.mjs  │─────▶│  ve-orchestrator   │
│ (entry)  │      │ (supervisor) │      │   (task runner)    │
└──────────┘      └──────┬───────┘      └─────────┬──────────┘
                         │                        │
                   ┌─────┴─────┐            ┌─────┴──────┐
                   │ telegram- │            │ ve-kanban  │
                   │ bot.mjs   │            │   .mjs     │
                   └───────────┘            └────────────┘
- cli.mjs loads config and routes to the appropriate handler (setup, doctor, daemon, or main start)
- monitor.mjs is the supervisor loop: orchestration, smart PR flow, maintenance, fleet sync
- ve-orchestrator.mjs handles native task execution with parallel slots, retries, and merge checks
- ve-kanban.mjs wraps VK CLI operations (list, submit, rebase, archive)
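To make "parallel slots" and "retries" concrete, here is a minimal sketch of a slot-limited task runner. It is an illustration under stated assumptions, not the code in ve-orchestrator.mjs: the function name runWithSlots, the options object, and the result shape are all hypothetical.

```javascript
// Hypothetical sketch: run tasks with at most `slots` in flight and
// up to `retries` extra attempts per task. Not the real orchestrator.
async function runWithSlots(tasks, { slots = 2, retries = 1 } = {}, worker) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task that has not been picked up yet

  // Run one task, retrying on failure until the retry budget is spent.
  async function runOne(task) {
    for (let attempt = 0; ; attempt += 1) {
      try {
        return { ok: true, value: await worker(task, attempt) };
      } catch (error) {
        if (attempt >= retries) return { ok: false, error };
      }
    }
  }

  // Each slot keeps pulling tasks from the shared queue until it is empty.
  async function drainQueue() {
    while (next < tasks.length) {
      const index = next;
      next += 1;
      results[index] = await runOne(tasks[index]);
    }
  }

  const slotCount = Math.min(slots, tasks.length);
  await Promise.all(Array.from({ length: slotCount }, () => drainQueue()));
  return results;
}
```

Results come back in input order, and a task that exhausts its retries is reported as `{ ok: false, error }` rather than rejecting the whole batch, so one failing task does not stop the other slots.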
Component Map
| Component | File | Role |
|---|---|---|
| CLI Entry | cli.mjs | Command routing, config loading, daemon management |
| Supervisor | monitor.mjs | Main loop, smart PR flow, maintenance scheduling |
| Orchestrator | ve-orchestrator.mjs | Task execution, parallel slots, retry logic |
| VK Wrapper | ve-kanban.mjs | Task board CRUD, attempt lifecycle |
| Telegram Bot | telegram-bot.mjs | Polling, batching, live digest, command handling |
| Mini App Server | ui-server.mjs | HTTP/WS server for Telegram Mini App |
| Config | config.mjs | Unified config loader (CLI + env + .env + JSON + defaults) |
| Fleet Coordinator | fleet-coordinator.mjs | Multi-workstation coordination |
| Shared State | shared-state-manager.mjs | Distributed task claims, heartbeats, conflict resolution |
| Task Claims | task-claims.mjs | Local + shared claim persistence |
| Sync Engine | sync-engine.mjs | Bidirectional kanban sync with shared state |
| Autofix | autofix.mjs | Error pattern detection, guarded auto-fix execution |
| Agent Pool | agent-pool.mjs | Executor management, weighted selection, failover |
| Codex Shell | codex-shell.mjs | Persistent Codex SDK sessions |
| Copilot Shell | copilot-shell.mjs | Persistent Copilot SDK sessions |
| Claude Shell | claude-shell.mjs | Persistent Claude SDK sessions |
| Container Runner | container-runner.mjs | Docker/Podman/Apple Container isolation |
| Sentinel | telegram-sentinel.mjs | Independent watchdog companion |
| WhatsApp Channel | whatsapp-channel.mjs | Optional WhatsApp notification channel |
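The Config row lists five sources. A loader of this kind is commonly a sequence of object merges in which later layers override earlier ones. The sketch below assumes that the listed order is also the precedence order (CLI highest, defaults lowest), which this page does not state; mergeConfig and the sample keys are hypothetical.

```javascript
// Hypothetical layered merge. Assumed precedence: CLI > env > .env > JSON > defaults.
function mergeConfig({ defaults = {}, json = {}, dotenv = {}, env = {}, cli = {} } = {}) {
  const layers = [defaults, json, dotenv, env, cli]; // lowest to highest precedence
  const result = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      // An undefined value means "not set in this layer", so it never overrides.
      if (value !== undefined) result[key] = value;
    }
  }
  return result;
}

const config = mergeConfig({
  defaults: { executorMode: "internal", maxParallel: 2 },
  json: { maxParallel: 4 },
  env: { executorMode: "hybrid" },
});
console.log(config); // { executorMode: 'hybrid', maxParallel: 4 }
```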
Execution Modes
Internal Mode (EXECUTOR_MODE=internal)
Tasks run through the internal agent pool inside the monitor process. The agent pool manages executor selection, weighted distribution, and failover.
Monitor → Agent Pool → [Copilot Shell | Codex Shell | Claude Shell]
                                  ↓
                           Task Execution
                                  ↓
                  Smart PR Flow → CI Check → Merge
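Weighted selection with failover can be sketched as follows. This is one possible reading of the description above, not the agent-pool.mjs implementation: the executor shape (`name`, `weight`, `healthy`, `run`) and both function names are assumptions made for the example.

```javascript
// Hypothetical weighted pick: an executor with weight 3 is chosen about three
// times as often as one with weight 1. Unhealthy executors are skipped.
function pickExecutor(executors, random = Math.random) {
  const healthy = executors.filter((e) => e.healthy && e.weight > 0);
  if (healthy.length === 0) return null;
  const total = healthy.reduce((sum, e) => sum + e.weight, 0);
  let roll = random() * total;
  for (const executor of healthy) {
    roll -= executor.weight;
    if (roll < 0) return executor;
  }
  return healthy[healthy.length - 1]; // guard against floating-point drift
}

// Hypothetical failover: when an executor throws, mark it unhealthy for this
// task and pick again from the remaining ones.
async function runWithFailover(executors, task, random = Math.random) {
  const pool = executors.map((e) => ({ ...e })); // do not mutate the caller's list
  for (;;) {
    const executor = pickExecutor(pool, random);
    if (!executor) throw new Error("no healthy executor left");
    try {
      return await executor.run(task);
    } catch {
      executor.healthy = false;
    }
  }
}
```

Passing `random` as a parameter keeps the selection deterministic in tests; production code would rely on the default `Math.random`.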
VK Mode (EXECUTOR_MODE=vk)
Task execution is delegated to the Vibe-Kanban orchestrator scripts. The monitor handles supervision and PR lifecycle.
Hybrid Mode (EXECUTOR_MODE=hybrid)
Combines internal and VK modes. Internal handles primary execution; VK picks up overflow or specific task types.
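The three modes can be pictured as a small routing decision. The sketch below is a guess at how hybrid routing might look: the overflow rule based on free internal slots and the `preferVk` task flag are assumptions, and chooseBackend is a hypothetical name.

```javascript
// Hypothetical routing for EXECUTOR_MODE. The overflow rule, `freeInternalSlots`,
// and the `preferVk` task flag are assumptions made for this sketch.
const EXECUTOR_MODES = ["internal", "vk", "hybrid"];

function chooseBackend(mode, { freeInternalSlots = 0, task = {} } = {}) {
  if (!EXECUTOR_MODES.includes(mode)) {
    throw new Error(`unknown EXECUTOR_MODE: ${mode}`);
  }
  if (mode !== "hybrid") return mode; // internal and vk are fixed routes
  if (task.preferVk) return "vk"; // a task type reserved for VK
  return freeInternalSlots > 0 ? "internal" : "vk"; // overflow goes to VK
}

console.log(chooseBackend("hybrid", { freeInternalSlots: 0 })); // vk
console.log(chooseBackend("hybrid", { freeInternalSlots: 2 })); // internal
```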
Shared State Model
Distributed task coordination across agents and workstations uses a shared state system:
- Owner heartbeat: Each claim has an ownerId (workstation+agent) and an ownerHeartbeat timestamp, renewed periodically to prove liveness
- Attempt tokens: Unique UUID per attempt for idempotent operations
- Retry/ignore flags: retryCount tracks attempts, ignoreReason marks tasks agents should skip
- Conflict resolution: Active heartbeat wins over stale claims (first-come-first-served if both active)
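The conflict rule can be written as a small pure function. This is a sketch under assumptions: the staleness threshold, the `claimedAt` field used for first-come-first-served ordering, and the handling of two stale claims are not specified on this page.

```javascript
// Hypothetical conflict rule. STALE_AFTER_MS and `claimedAt` are assumptions.
const STALE_AFTER_MS = 5 * 60 * 1000;

// A claim is active while its heartbeat is newer than the staleness threshold.
function isActive(claim, now) {
  return now - claim.ownerHeartbeat < STALE_AFTER_MS;
}

// Returns the claim that should own the task.
function resolveConflict(existing, incoming, now = Date.now()) {
  const existingActive = isActive(existing, now);
  const incomingActive = isActive(incoming, now);
  if (existingActive && !incomingActive) return existing;
  if (!existingActive && incomingActive) return incoming;
  // Both active: first come, first served. Two stale claims are treated the
  // same way here, which is a choice made for this sketch.
  return existing.claimedAt <= incoming.claimedAt ? existing : incoming;
}

const now = 1_000_000;
const stale = { ownerId: "ws1/agentA", ownerHeartbeat: now - 10 * 60 * 1000, claimedAt: 1 };
const fresh = { ownerId: "ws2/agentB", ownerHeartbeat: now - 1000, claimedAt: 2 };
console.log(resolveConflict(stale, fresh, now).ownerId); // ws2/agentB
```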
Claim Lifecycle
1. claimTaskInSharedState(taskId, ownerId, attemptToken)
2. [work...] renewSharedStateHeartbeat() periodically
3. releaseSharedState(taskId, attemptToken, 'complete'|'failed'|'abandoned')
4. sweepStaleSharedStates() (background cleanup)
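The lifecycle above can be walked through with an in-memory stand-in. The method names mirror the functions listed, but everything else is an assumption for illustration: the class name, the extra `now` parameters, the arguments to the renew and sweep calls, the boolean return values, and the staleness threshold. The real functions persist state and may differ in signature.

```javascript
// In-memory illustration only; not the shared-state-manager.mjs implementation.
class SharedStateSketch {
  constructor(staleAfterMs = 60_000) {
    this.claims = new Map(); // taskId -> { ownerId, attemptToken, ownerHeartbeat }
    this.history = []; // released claims with their final status
    this.staleAfterMs = staleAfterMs;
  }

  claimTaskInSharedState(taskId, ownerId, attemptToken, now = Date.now()) {
    const current = this.claims.get(taskId);
    if (current && now - current.ownerHeartbeat < this.staleAfterMs) {
      // An active claim blocks others; repeating the same attempt is idempotent.
      return current.attemptToken === attemptToken;
    }
    this.claims.set(taskId, { ownerId, attemptToken, ownerHeartbeat: now });
    return true;
  }

  renewSharedStateHeartbeat(taskId, attemptToken, now = Date.now()) {
    const current = this.claims.get(taskId);
    if (!current || current.attemptToken !== attemptToken) return false;
    current.ownerHeartbeat = now;
    return true;
  }

  releaseSharedState(taskId, attemptToken, status) {
    const current = this.claims.get(taskId);
    if (!current || current.attemptToken !== attemptToken) return false;
    this.claims.delete(taskId);
    this.history.push({ taskId, status }); // 'complete' | 'failed' | 'abandoned'
    return true;
  }

  sweepStaleSharedStates(now = Date.now()) {
    let removed = 0;
    for (const [taskId, claim] of this.claims) {
      if (now - claim.ownerHeartbeat >= this.staleAfterMs) {
        this.claims.delete(taskId);
        removed += 1;
      }
    }
    return removed;
  }
}
```

Checking the attempt token on renew and release is what keeps a late call from an abandoned attempt from touching a claim that another agent has since taken over.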
Smart PR Flow
The smartPRFlow in monitor.mjs handles the full PR lifecycle:
- Branch creation: from configured target branch
- PR creation: with conventional commit title and structured body
- CI monitoring: polls check status
- Auto-rebase: on merge conflicts with target
- Merge decision: merge when all checks pass
- Cleanup: archive task, prune worktree
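The steps above can be sketched as a loop over check status. This is not the smartPRFlow code from monitor.mjs: the `ops` object stands in for git, hosting, and VK calls that this page does not specify, and its method names, the status strings, and the poll limit are all hypothetical.

```javascript
// Hypothetical PR flow. Every method on `ops` is a placeholder supplied by the caller.
async function smartPRFlowSketch(task, ops, { maxPolls = 10 } = {}) {
  const branch = await ops.createBranch(task);
  const pr = await ops.createPR(branch, task);

  for (let poll = 0; poll < maxPolls; poll += 1) {
    // Assumed statuses: 'pending' | 'passed' | 'failed' | 'conflict'
    const status = await ops.checkStatus(pr);
    if (status === "conflict") {
      await ops.rebase(branch); // auto-rebase, then poll again
      continue;
    }
    if (status === "failed") return { merged: false, reason: "ci-failed" };
    if (status === "passed") {
      await ops.merge(pr);
      await ops.cleanup(task); // archive task, prune worktree
      return { merged: true };
    }
    await ops.wait(); // still pending: back off before the next poll
  }
  return { merged: false, reason: "timeout" };
}
```

Injecting `ops` keeps the control flow testable without a repository or CI system; a bounded poll count avoids waiting forever on checks that never report.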
Error Recovery
OpenFleet uses multiple layers of error recovery:
- Autofix: Pattern matching on error signatures with guarded fix execution
- Circuit breakers: Prevent infinite retry loops by tracking consecutive failures
- Stale sweeps: Background process that reclaims abandoned tasks
- Daemon restart policy: Configurable crash tracking with cooldown periods
- Sentinel watchdog: Independent process that can restart the monitor on crash
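As an illustration of the circuit-breaker layer, the sketch below counts consecutive failures and blocks further attempts until a cooldown has passed. The class, its thresholds, and the decision to close the breaker fully after the cooldown are assumptions for this example, not OpenFleet's actual values or behavior.

```javascript
// Hypothetical circuit breaker; threshold and cooldown values are illustrative.
class CircuitBreaker {
  constructor({ maxFailures = 3, cooldownMs = 60_000 } = {}) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.consecutiveFailures = 0;
    this.openedAt = null; // timestamp when the breaker tripped, or null if closed
  }

  // True when a new attempt is allowed.
  canAttempt(now = Date.now()) {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and start counting from zero.
      this.openedAt = null;
      this.consecutiveFailures = 0;
      return true;
    }
    return false;
  }

  recordSuccess() {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordFailure(now = Date.now()) {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxFailures) this.openedAt = now;
  }
}
```

A caller would check `canAttempt()` before each retry and report the outcome with `recordSuccess()` or `recordFailure()`, so a task that keeps failing stops consuming executor time until the cooldown ends.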
Data Persistence
State is persisted in .cache/openfleet/:
| File | Contents |
|---|---|
| shared-task-states.json | Distributed task claims and heartbeat state |
| fleet-state.json | Multi-workstation fleet coordination data |
| task-store.json | Internal kanban task database |
| workspaces.json | Workspace and worktree registry |
| sessions/ | Agent session state and history |
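JSON state files like these are often written with a write-then-rename pattern so that a crash mid-write does not leave a truncated file. Whether OpenFleet does this is not stated here; the sketch below is a generic illustration. The helper names are hypothetical, and `fs` is passed in as a parameter shaped like Node's `node:fs/promises` module (only `mkdir`, `writeFile`, `rename`, and `readFile` are used).

```javascript
// Hypothetical helpers; `fs` is any object shaped like node:fs/promises.
const STATE_DIR = ".cache/openfleet";

async function saveState(fs, fileName, data) {
  await fs.mkdir(STATE_DIR, { recursive: true });
  const target = `${STATE_DIR}/${fileName}`;
  const temp = `${target}.tmp`;
  // Write the full document to a temp file first, then swap it into place.
  await fs.writeFile(temp, JSON.stringify(data, null, 2), "utf8");
  await fs.rename(temp, target); // on POSIX filesystems this replaces the target atomically
}

async function loadState(fs, fileName, fallback) {
  try {
    return JSON.parse(await fs.readFile(`${STATE_DIR}/${fileName}`, "utf8"));
  } catch {
    return fallback; // missing or unparsable file: start from the fallback value
  }
}
```

In a real module, `fs` would typically come from `import * as fs from "node:fs/promises"`, for example `await saveState(fs, "task-store.json", { tasks: [] })`.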