The AI coding agent for product engineers.
Most agents write code. Excalibur knows the whole product cycle — discover, build, verify, ship.
Excalibur enables you to build like never before.
Five capabilities built right into the engine — the moats that hold up against any other AI coding tool.
Time-travel your work
Every run is an immutable, append-only event log — so you scrub it like a video and branch a new run from any step. The stream IS the run: it renders the live view, the replay, the dashboard and the audit trail, byte-identical. Nobody else is built this way.
- Fork from cache: the good prefix replays for free; only what changed re-runs
- Live rewind mid-session (Esc-Esc)
- Live = replay = dashboard = audit — one source of truth
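To make the idea concrete, here is a minimal sketch of an append-only event log that can be forked from any step. The names (`RunEvent`, `EventLog`, `fork`) are hypothetical and chosen for illustration only; they are not Excalibur's actual API, and a real engine would persist events rather than hold them in memory.

```typescript
// Hypothetical types for illustration; not Excalibur's real data model.
type RunEvent = { readonly step: number; readonly kind: string; readonly payload: string };

class EventLog {
  private readonly events: RunEvent[] = [];

  // Append-only: events are frozen on write and never edited or removed.
  append(kind: string, payload: string): RunEvent {
    const event: RunEvent = Object.freeze({ step: this.events.length, kind, payload });
    this.events.push(event);
    return event;
  }

  // Every consumer (live view, replay, audit) reads the same sequence.
  replay(): readonly RunEvent[] {
    return this.events.slice();
  }

  // Fork: a new log that shares the prefix up to and including `step`.
  // The shared events are reused as-is, so the prefix is not recomputed.
  fork(step: number): EventLog {
    if (step < 0 || step >= this.events.length) {
      throw new RangeError(`no event at step ${step}`);
    }
    const child = new EventLog();
    for (const event of this.events.slice(0, step + 1)) child.events.push(event);
    return child;
  }
}

const originalRun = new EventLog();
originalRun.append("plan", "scope the change");
originalRun.append("implement", "first attempt");
originalRun.append("verify", "tests failed");

// Branch from the plan step and try a different implementation.
const branchRun = originalRun.fork(0);
branchRun.append("implement", "second attempt");
```

The original run is left untouched by the fork, which is what would let one log serve as both replay and audit trail.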
Discover what to build
Before a line of code, Discovery weighs scope, evidence and risk — and recommends build, validate, or don’t-build.
- Product judgment, not just code generation
- Turns an idea into a scoped work item
- Kills bad work before it costs you
→ thin evidence — validate with 3 design partners first · work item WI-142
Your repo on a board
“excalibur serve” opens a task-first web dashboard — embedded in the CLI, one local process, no account. It’s that same event log, rendered for the browser. Terminal-only tools don’t have this.
- Kanban work items → a run’s live checklist, patches, PRs and plans
- Cost insights + a live swarm chronogram you can pause and replay
- --write drives runs from the browser; --share mints a read-only link
Board columns: backlog · running · done
$2.40 today · 3 runs · live chronogram ▸▸▹ · share ↗ read-only
Any model, any provider
Excalibur isn’t wired to one vendor — OpenAI-compatible (incl. Azure), Anthropic, Ollama and more. Pick a good + fast pair with a single key and switch anytime.
- A frontier model for the hard parts, a cheap one for ghost-text — automatically
- Bring your own key — it lives in an env file, never the repo
- Switch model mid-session with /models
good + fast from one key · no vendor lock-in
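One way to picture the good + fast pairing is a small router that sends light tasks to the cheap model and everything else to the frontier model. This is a sketch under assumptions: the task kinds, the `pickModel` function and the placeholder model names are all hypothetical, and Excalibur's real selection logic may differ.

```typescript
// Hypothetical names throughout; the model ids are placeholders, not real models.
type ModelPair = { good: string; fast: string };
type TaskKind = "plan" | "implement" | "review" | "ghost-text" | "summarize";

// Assumption: these task kinds are cheap enough for the fast model.
const LIGHT_TASKS: ReadonlySet<TaskKind> = new Set<TaskKind>(["ghost-text", "summarize"]);

function pickModel(pair: ModelPair, task: TaskKind): string {
  return LIGHT_TASKS.has(task) ? pair.fast : pair.good;
}

const modelPair: ModelPair = { good: "frontier-model", fast: "small-model" };
const forGhostText = pickModel(modelPair, "ghost-text"); // "small-model"
const forPlanning = pickModel(modelPair, "plan"); // "frontier-model"
```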
The whole product cycle
Most agents write code. Excalibur owns the bookends too — deciding what to build up front, then proving and auditing what shipped at the end.
- A big task auto-sizes into a swarm of agents in isolated worktrees
- An adversarial verification mesh + typed claims gate the finish
- Tests and serious docs are phases, not afterthoughts
→ sized to 3 agents — independent modules, isolated worktrees
✓ verification mesh · tests_passed · type_safe · no_secrets
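A typed-claims gate could look roughly like the sketch below: the run finishes only when every required claim is present and verified. The claim names come from the sample output above; the `Claim` shape and the `gateFinish` function are hypothetical and not taken from Excalibur's code.

```typescript
// Claim names mirror the sample output above; everything else is an assumption.
type ClaimName = "tests_passed" | "type_safe" | "no_secrets";
type Claim = { name: ClaimName; ok: boolean; evidence: string };

const REQUIRED_CLAIMS: readonly ClaimName[] = ["tests_passed", "type_safe", "no_secrets"];

// The finish is gated: any required claim that is absent or failed blocks it.
function gateFinish(claims: readonly Claim[]): { pass: boolean; missing: ClaimName[] } {
  const verified = new Set(claims.filter((c) => c.ok).map((c) => c.name));
  const missing = REQUIRED_CLAIMS.filter((name) => !verified.has(name));
  return { pass: missing.length === 0, missing };
}

const gateResult = gateFinish([
  { name: "tests_passed", ok: true, evidence: "test suite green" },
  { name: "type_safe", ok: true, evidence: "type checker clean" },
  { name: "no_secrets", ok: false, evidence: "possible key in config file" },
]);
// gateResult.pass is false and gateResult.missing is ["no_secrets"]
```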
All five — free, open-source, one npm install away.
Not just code. The whole product cycle.
Chatbots autocomplete. Coding agents take you from plan to ship. Excalibur runs the entire cycle: it decides what is worth building up front and audits what shipped at the end. It is fully open source.
Decide before you build
Discovery clarifies scope first — and can recommend not building it. Most tools start at the code.
Quality gates, built in
Every run plans, implements, tests and documents the change — tests and serious docs are phases, not afterthoughts.
Ship, review & audit
Pull requests, approvals and a full audit trail — not just a diff in your editor.
Value in minutes, not migrations
Install, run it in any repo, and start building — the first run sets itself up; then an agent plans, implements, tests and documents real changes in an isolated branch.
- 01 Install · one binary, local-first
- 02 Run excalibur · first run detects your stack & connects a model
- 03 Describe a task · an agent plans, builds, tests & ships it
Dial the autonomy. The system does the rest.
Express intent — Excalibur picks the workflow and sizes the work. Dial it per task, from a quick answer to a fully agentic run.
review · ask · patch · run · run --careful
Eleven ways to put agents to work
run · Run
An agent builds the whole change in an isolated branch.
run --careful · Plan
A ticket becomes a plan the agent runs once you approve.
discovery · Discover
An agent clarifies scope first; it can say don’t build it.
swarm · Orchestrate
Fan a task out to parallel agents, verified against your tests before the merge.
review · Review
Review the agent’s changes before they ship.
patch · Patch
Get a small, reviewable patch from an agent.
ask · Ask
Understand any codebase before you build.
research · Connect
Agents fetch and search the live web, governed and with citations.
serve · Dashboard
A local web board: kanban, runs, cost charts, live orchestration.
schedule · Schedule
Run agent tasks on a cadence: every N, or daily at a time.
.excalibur/ · Trace
Every agent run kept auditable, as local files.
Best practices, built in
Opinionated recipes for how serious teams ship — 14 workflows and 14 methodologies, ready on day one.
Anatomy of a run
- Plan: Scope the change first
- Implement: Write the code in an isolated branch
- Verify (gate): Tests must pass to continue
- Document (gate): ADRs, API docs & changelog
- Review: An adversarial second pass
- Pull request: Opened for a human to merge
Tests and serious docs aren’t an afterthought — they’re phases.
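The phase list above can be read as a pipeline in which gated phases stop the run when they fail. The sketch below illustrates that reading; `Phase`, `runPhases` and the sample phase results are hypothetical, and the real engine's phase model is not documented here.

```typescript
// Hypothetical shapes; phase names follow the list above.
type PhaseResult = { ok: boolean; note: string };
type Phase = { name: string; gate: boolean; run: () => PhaseResult };

// Runs phases in order. A failing gate stops the run; later phases never start.
function runPhases(phases: readonly Phase[]): { completed: string[]; stoppedAt: string | null } {
  const completed: string[] = [];
  for (const phase of phases) {
    const result = phase.run();
    if (phase.gate && !result.ok) return { completed, stoppedAt: phase.name };
    completed.push(phase.name);
  }
  return { completed, stoppedAt: null };
}

const examplePhases: Phase[] = [
  { name: "Plan", gate: false, run: () => ({ ok: true, note: "scoped" }) },
  { name: "Implement", gate: false, run: () => ({ ok: true, note: "code written" }) },
  { name: "Verify", gate: true, run: () => ({ ok: false, note: "2 tests failing" }) },
  { name: "Document", gate: true, run: () => ({ ok: true, note: "changelog updated" }) },
];

const outcome = runPhases(examplePhases);
// outcome.completed is ["Plan", "Implement"]; outcome.stoppedAt is "Verify"
```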
Review First
Read and critique before touching a line.
Fast Fix
Small, obvious fixes — minimal ceremony.
Standard Feature
The everyday plan → build → test → ship.
Structured Feature
Bigger work with specs, gates and approvals.
Safe Refactor
Behaviour-preserving, with tests as the net.
Security First
Threat-aware build with a security gate.
Migration
Staged, reversible, data-aware changes.
Explore Alternatives
Several approaches in parallel — compare and pick.
Discovery
Decide what to build — or whether to build at all.
Use the defaults today. Customize every phase, gate and role in YAML tomorrow.
Browse the catalog
Powerful, never reckless
Delegate big work without fear. Nothing is modified, applied or pushed without your explicit approval — standard-safe is on from the first command.
One keystroke decides — approve, reject, or allow it always. Risky operations always wait for you.
Approval gates
Every write, command and push pauses for an explicit yes.
Sandboxed execution
Agents run in an isolated sandbox — no network access by default.
Secrets never leak
.env files and private keys are blocked — never read or sent.
Isolated branches
Work lands in dedicated branches — never your working tree.
Redacted prompts
Inputs are scrubbed of secrets before anything is stored.
Inspectable trail
Every action is logged as plain local files you can audit.
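The approve / reject / allow-always decision described above could be modelled as in the following sketch. It assumes that "allow always" is remembered per operation kind and that risky operations are never remembered; both are illustrative guesses, and `ApprovalGate` is a hypothetical name rather than part of Excalibur.

```typescript
// Hypothetical model of the approval flow; not Excalibur's actual implementation.
type Decision = "approve" | "reject" | "allow-always";
type Operation = { kind: "write" | "command" | "push"; target: string; risky: boolean };

class ApprovalGate {
  private readonly alwaysAllowed = new Set<Operation["kind"]>();

  // True when a human still has to decide. Risky operations always prompt.
  needsPrompt(op: Operation): boolean {
    return op.risky || !this.alwaysAllowed.has(op.kind);
  }

  // Records the decision and reports whether the operation may proceed.
  decide(op: Operation, decision: Decision): boolean {
    if (decision === "allow-always" && !op.risky) this.alwaysAllowed.add(op.kind);
    return decision !== "reject";
  }
}

const approvalGate = new ApprovalGate();
const fileEdit: Operation = { kind: "write", target: "src/app.ts", risky: false };
const forcePush: Operation = { kind: "push", target: "origin/main", risky: true };

approvalGate.decide(fileEdit, "allow-always"); // later writes skip the prompt
approvalGate.decide(forcePush, "allow-always"); // proceeds once, but is not remembered
```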
Move fast as a team. Stay in control.
The control plane for AI engineering — give every developer agents, with the visibility, policy and proof to run it safely across the whole org.
Workbench · Acme Corp (Preview)
Spend · this month: $1,284
Time saved: 14 days
Active runs: 7
Approvals: 3
- Implement contract renewal reminders · running
- Migrate billing to the new ledger · awaiting approval
- Add idempotency to webhook handler · completed
Preview of the Enterprise control plane (in development). The OSS excalibur serve dashboard ships locally today — see Core.
Why leadership says yes
- See exactly what every agent did, what it cost, and what it returned.
- Set the rules once — policies, budgets and approvals enforce themselves.
- Prove every change with a signed, auditable trail your auditors will accept.
Visibility & ROI
Cost, time, usage and quality across the fleet — rolled up org → team → repo, with budgets and forecasts.
Governance & risk
A policy engine, model governance and SSO/SCIM, with an audit trail that exports to your SIEM.
Verifiable quality
Every run’s claims — tests, types, no secrets — roll up into a signed compliance pack.
Memory that compounds
Decisions and earned autonomy build up across the org — and survive turnover.
Coordinate humans + agents
Agentic-agile planning, a native kanban (Linear / Jira sync planned), and approvals from your phone.
Deploy your way
Hybrid or fully self-hosted runners — your code never leaves your infra. Air-gapped if you need it.
Great on its own. Built for scale.
Everything a developer needs — safe by default — is open-source. Teams add governance, compliance and control.
| Capability | Core | Ent |
|---|---|---|
| Interactive m-shell + CLI | Included | Included |
| Any model, any provider (in-shell /models picker) | Included | Included |
| Autonomy levels (L0–L4, L4 default) | Included | Included |
| Built-in workflows & methodologies | Included | Included |
| Discovery — decide before you build | Included | Included |
| Plan-shaping — co-create the plan first | Included | Included |
| Time machine — rewind & fork runs | Included | Included |
| Self-sizing swarm + Explore (best-of-N) | Included | Included |
| Verified fan-in + live wave/DAG chronogram | Included | Included |
| Autonomous loop, background fleet & scheduler | Included | Included |
| Web access & research (fetch · search · crawl) | Included | Included |
| Context compaction | Included | Included |
| Memory that compounds | Local | Org |
| Local dashboard + fleet view | Included | Included |
| Local work items & kanban | Included | Included |
| IDE extension (VS Code · Cursor · Windsurf) | Included | Included |
| Custom agents (Markdown personas) | Included | Included |
| MCP · LSP auto-install · auto-format · SDK | Included | Included |
| Works with CLAUDE.md / AGENTS.md | Included | Included |
| Safe by default — approvals + secrets blocked | Included | Included |
| Sandboxed execution (no network by default) | Included | Included |
| Verifiable claims — tests · types · no secrets | Included | Included |
| Work-item sync (Linear · Jira · …) | Not included | Included |
| Agentic-agile (daily / weekly) | Not included | Included |
| SSO / SCIM · RBAC · multi-tenant | Not included | Included |
| Policy engine + budgets | Local | Org |
| Claim Ledger → Compliance Pack | Not included | Included |
| Insights & ROI (5 lenses) | Not included | Included |
| Audit & cost visibility | Local | Org |
| Hybrid / self-hosted runners | Not included | Included |
| Mobile approval / remote | Not included | Included |
| Support / SLA | Community | Included |
YAML defines how your team works. The SDK connects everything else.
Customize with simple YAML — or go deeper with the TypeScript SDK, MCP servers and LSP diagnostics.
name: Secure Feature
extends: structured-feature
phases:
  - plan
  - implement
  - security-review # added gate
approvals: [sensitive-paths]

import { defineExtension } from '@excalibur-oss/extension-sdk'

// linearProvider, slackChannel, githubMcp and budgetGuard are assumed to be defined or imported elsewhere.
export default defineExtension({
  workItems: [linearProvider],
  channels: [slackChannel],
  mcpServers: [githubMcp],
  policies: [budgetGuard],
})

Plus first-class MCP servers, LSP diagnostics, custom agent personas, and an IDE extension for VS Code / Cursor / Windsurf, all built in.
SDK on npm · no glue code
Open-source · Enterprise-ready
Start locally.
Scale safely.
Start with Excalibur Core on your machine. Bring the workflows, policies and visibility your organization needs with Excalibur Enterprise.
Developers want power. Companies need safety. Excalibur gives both.