The AI coding agent for product engineers.

Most agents write code. Excalibur knows the whole product cycle — discover, build, verify, ship.

Apache-2.0
excalibur — m-shell
Add rate limiting to the public API
L4 — Full agentic · kimi-k2.7-code
Plan · 4 steps
Implement · api/limiter.ts
edit api/limiter.ts +38 −4
▸ pnpm test api · running…
Verify
Document
Review
Pull Request
████░░░░ 1m12s · $0.42 · standard-safe · no push

With Excalibur you can:

Time-travel your work · Discover what to build · Run it from a local board
01 What you can do

Excalibur enables you to build like never before.

Five capabilities built right into the engine — the moats that hold up against any other AI coding tool.

the immutable event log

Time-travel your work

Every run is an immutable, append-only event log — so you scrub it like a video and branch a new run from any step. The stream IS the run: it renders the live view, the replay, the dashboard and the audit trail, byte-identical. Nobody else is built this way.

  • Fork from cache: the good prefix replays for free; only what changed re-runs
  • Live rewind mid-session (Esc-Esc)
  • Live = replay = dashboard = audit — one source of truth
replay · run_4f2a
step 06 / 14 · Implement · +38 −4
fork from here ⑂
plan · implement · tests fail · review
live = replay = dashboard = audit · one immutable stream
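
The fork-from-any-step idea above can be sketched as an append-only log whose prefix is reused by the fork. This is a minimal illustration under stated assumptions, not Excalibur's implementation: `RunEvent`, `RunLog`, `forkAt` and `replay` are hypothetical names.

```typescript
// Hypothetical sketch of an append-only run log; none of these names come from Excalibur.
interface RunEvent {
  readonly step: number;  // 1-based position in the run
  readonly phase: string; // e.g. "plan", "implement"
  readonly detail: string;
}

class RunLog {
  private readonly events: readonly RunEvent[];

  constructor(events: readonly RunEvent[] = []) {
    this.events = events;
  }

  // Appending never mutates: it returns a new log that carries the old events forward.
  append(phase: string, detail: string): RunLog {
    const next: RunEvent = { step: this.events.length + 1, phase, detail };
    return new RunLog([...this.events, next]);
  }

  // A fork keeps the first `step` events; everything after that point is dropped.
  forkAt(step: number): RunLog {
    if (step < 0 || step > this.events.length) {
      throw new RangeError(`step ${step} is outside 0..${this.events.length}`);
    }
    return new RunLog(this.events.slice(0, step));
  }

  // Replay is just reading the same events back in order.
  replay(): string[] {
    return this.events.map((e) => `${e.step}:${e.phase}`);
  }

  get length(): number {
    return this.events.length;
  }
}

const original = new RunLog()
  .append("plan", "4 steps")
  .append("implement", "api/limiter.ts")
  .append("tests", "fail");

// Branch from step 2 and take a different path; the original run is untouched.
const forked = original.forkAt(2).append("implement", "second attempt");
```

Because every operation returns a new log, the live view, the replay and the fork can all read the same events without coordinating writes.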
the gate no one else has

Discover what to build

Before a line of code, Discovery weighs scope, evidence and risk — and recommends build, validate, or don’t-build.

  • Product judgment, not just code generation
  • Turns an idea into a scoped work item
  • Kills bad work before it costs you
discovery · add AI invoice summaries
clarity
evidence
scope
risk
readiness
recommendation: needs_validation

→ thin evidence — validate with 3 design partners first · work item WI-142
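
One way to picture the Discovery verdict is a small scoring function over the five dimensions shown in the panel. The sketch below is hypothetical: only `needs_validation` appears on this page, while the `build` and `dont_build` spellings, the 0 to 1 scale and both thresholds are assumptions made for illustration.

```typescript
// Hypothetical scoring sketch; the dimensions mirror the panel above, the thresholds are invented.
type Recommendation = "build" | "needs_validation" | "dont_build";

interface DiscoveryScores {
  clarity: number;   // every score runs from 0 (weak) to 1 (strong)
  evidence: number;
  scope: number;
  risk: number;      // higher means riskier
  readiness: number;
}

function recommend(s: DiscoveryScores): Recommendation {
  // Assumed rule 1: high risk backed by little evidence is not worth building.
  if (s.risk >= 0.8 && s.evidence < 0.3) return "dont_build";
  // Assumed rule 2: any weak supporting dimension means validate first.
  const weakest = Math.min(s.clarity, s.evidence, s.scope, s.readiness);
  if (weakest < 0.5) return "needs_validation";
  return "build";
}

// Illustrative scores for the "add AI invoice summaries" idea: everything is fine except evidence.
const invoiceSummaries: DiscoveryScores = {
  clarity: 0.8, evidence: 0.3, scope: 0.7, risk: 0.4, readiness: 0.6,
};
const verdict = recommend(invoiceSummaries);
```

With these made-up numbers the thin evidence score alone is enough to route the idea to validation rather than to a build.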

local · no SaaS

Your repo on a board

“excalibur serve” opens a task-first web dashboard — embedded in the CLI, one local process, no account. It’s that same event log, rendered for the browser. Terminal-only tools don’t have this.

  • Kanban work items → a run’s live checklist, patches, PRs and plans
  • Cost insights + a live swarm chronogram you can pause and replay
  • --write drives runs from the browser; --share mints a read-only link
dashboard · localhost:7878

backlog

WI-140 · webhook retries

running

WI-142 · invoice AI ◐

done

WI-138 · auth ✓

$2.40 today · 3 runs · live chronogram ▸▸▹ · share ↗ read-only

no lock-in

Any model, any provider

Excalibur isn’t wired to one vendor — OpenAI-compatible (incl. Azure), Anthropic, Ollama and more. Pick a good + fast pair with a single key and switch anytime.

  • A frontier model for the hard parts, a cheap one for ghost-text — automatically
  • Bring your own key — it lives in an env file, never the repo
  • Switch model mid-session with /models
/models
claude-opus-4 · good · default
gpt-4o
llama-3.3-70b · fast · paired
ollama/qwen · local

good + fast from one key · no vendor lock-in
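
The good + fast pairing can be thought of as a tiny router. This is a sketch under assumptions: the `TaskKind` values, the `ModelPair` shape and the rule that only ghost-text goes to the fast model are illustrative, not Excalibur's actual routing policy.

```typescript
// Hypothetical routing sketch; task kinds and the ModelPair shape are assumptions.
type TaskKind = "plan" | "implement" | "review" | "ghost_text";

interface ModelPair {
  good: string; // stronger model for the hard parts
  fast: string; // cheaper, lower-latency model
}

function pickModel(models: ModelPair, kind: TaskKind): string {
  // Assumed policy: only inline ghost-text is sent to the fast model.
  return kind === "ghost_text" ? models.fast : models.good;
}

// Model names taken from the /models panel above.
const pair: ModelPair = { good: "claude-opus-4", fast: "llama-3.3-70b" };
```

Keeping the policy in one function is what makes a mid-session switch cheap: swapping the pair changes every later call without touching the callers.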

not just the diff

The whole product cycle

Most agents write code. Excalibur owns the bookends too — deciding what to build up front, then proving and auditing what shipped at the end.

  • A big task auto-sizes into a swarm of agents in isolated worktrees
  • An adversarial verification mesh + typed claims gate the finish
  • Tests and serious docs are phases, not afterthoughts
workflow · standard-feature
Discover · Plan · Build · Test ⛬ · Docs ⛬ · Review · Ship · Audit

→ sized to 3 agents — independent modules, isolated worktrees

✓ verification mesh · tests_passed · type_safe · no_secrets
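
A gate built on typed claims could look roughly like the following. The three claim names come from the panel above; the `Claim` and `GateResult` shapes and the rule that a missing claim counts as a failure are assumptions of this sketch.

```typescript
// Hypothetical sketch of a claims gate; only the claim names come from the page.
type ClaimName = "tests_passed" | "type_safe" | "no_secrets";

interface Claim {
  name: ClaimName;
  holds: boolean;
  evidence: string; // e.g. the command or report that backs the claim
}

interface GateResult {
  passed: boolean;
  failing: ClaimName[];
}

const REQUIRED: ClaimName[] = ["tests_passed", "type_safe", "no_secrets"];

function gateFinish(claims: Claim[]): GateResult {
  // A required claim that is missing counts as failing, the same as one that is false.
  const failing = REQUIRED.filter(
    (name) => !claims.some((c) => c.name === name && c.holds),
  );
  return { passed: failing.length === 0, failing };
}
```

Typing the claim names as a union means a misspelled claim is a compile error rather than a silently skipped check.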

All five — free, open-source, one npm install away.

For developers
02 The whole cycle

Not just code. The whole product cycle.

Chatbots autocomplete. Coding agents build and ship. Excalibur runs the entire cycle — deciding what to build up front, and auditing what shipped at the end — and it’s fully open source.


Coding agents take you from plan to ship. Excalibur owns the bookends too — deciding what’s worth building, and auditing what shipped.

Decide before you build

Discovery clarifies scope first — and can recommend not building it. Most tools start at the code.

Quality gates, built in

Every run plans, implements, tests and documents the change — tests and serious docs are phases, not afterthoughts.

Ship, review & audit

Pull requests, approvals and a full audit trail — not just a diff in your editor.

Every step configurable · Every action traceable · Every risky op approved
For developers
03 Quickstart

Value in minutes, not migrations

Install, run it in any repo, and start building — the first run sets itself up; then an agent plans, implements, tests and documents real changes in an isolated branch.

  1. Install · one binary, local-first

  2. Run excalibur · first run detects your stack & connects a model

  3. Describe a task · an agent plans, builds, tests & ships it

Apache-2.0
excalibur — m-shell
$ npm install -g @excalibur-oss/excalibur
excalibur ready
$ excalibur
# first run — setting up here
detected TypeScript · pnpm · CLAUDE.md
model: Groq · key from $GROQ_API_KEY (never stored)
standard-safe · approvals on · secrets blocked
add pagination to the orders endpoint
plan → implement → tests → docs → review
12 tests pass · docs updated
opened PR #128 — ready to review
For developers
04 What Excalibur does

Dial the autonomy. The system does the rest.

Express intent — Excalibur picks the workflow and sizes the work. Dial it per task, from a quick answer to a fully agentic run.

less autonomy → more autonomy
L0 · Review · review
L1 · Assist · ask
L2 · Propose patch · patch
L3 · Implement in branch · run
L4 · Full agentic · run --careful
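
The dial above maps naturally onto a lookup table. In this sketch the level names and commands are copied from the dial, while `capLevel` (clamping a requested level to a ceiling) is a hypothetical helper added for illustration, not a documented Excalibur feature.

```typescript
// Level names and commands are copied from the dial above; the helper is an illustrative sketch.
type Level = "L0" | "L1" | "L2" | "L3" | "L4";

const AUTONOMY: Record<Level, { name: string; command: string }> = {
  L0: { name: "Review", command: "review" },
  L1: { name: "Assist", command: "ask" },
  L2: { name: "Propose patch", command: "patch" },
  L3: { name: "Implement in branch", command: "run" },
  L4: { name: "Full agentic", command: "run --careful" },
};

const ORDER: Level[] = ["L0", "L1", "L2", "L3", "L4"];

// Hypothetical helper: never exceed the ceiling a team or task allows.
function capLevel(requested: Level, ceiling: Level): Level {
  return ORDER.indexOf(requested) <= ORDER.indexOf(ceiling) ? requested : ceiling;
}
```

For example, a request for L4 under an L2 ceiling resolves to the `patch` command.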

Eleven ways to put agents to work

01 · run

Run

An agent builds the whole change in an isolated branch.

02 · run --careful

Plan

A ticket becomes a plan the agent runs once you approve.

03 · discovery

Discover

An agent clarifies scope first — it can say don’t build it.

04 · swarm

Orchestrate

Fan a task out to parallel agents — verified against your tests before the merge.

05 · review

Review

Review the agent’s changes before they ship.

06 · patch

Patch

Get a small, reviewable patch from an agent.

07 · ask

Ask

Understand any codebase before you build.

08 · research

Connect

Agents fetch and search the live web — governed, with citations.

09 · serve

Dashboard

A local web board: kanban, runs, cost charts, live orchestration.

10 · schedule

Schedule

Run agent tasks on a cadence — every N, or daily at a time.

11 · .excalibur/

Trace

Every agent run kept auditable, as local files.
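
The two cadences named under Schedule (every N, or daily at a time) can be illustrated with a next-run calculation. The `Cadence` type below is an assumption for this sketch and is not Excalibur's schedule format; times are handled in UTC to keep the example deterministic.

```typescript
// Hypothetical schedule sketch; the Cadence shape is an assumption, not Excalibur's config format.
type Cadence =
  | { kind: "every"; minutes: number }
  | { kind: "daily"; hourUtc: number; minuteUtc: number };

function nextRun(cadence: Cadence, now: Date): Date {
  if (cadence.kind === "every") {
    return new Date(now.getTime() + cadence.minutes * 60000);
  }
  // Daily: use today's slot if it is still ahead, otherwise tomorrow's.
  const slot = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(),
    cadence.hourUtc, cadence.minuteUtc,
  ));
  if (slot.getTime() <= now.getTime()) {
    slot.setUTCDate(slot.getUTCDate() + 1);
  }
  return slot;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the calculation easy to test.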

05 Built-in workflows

Best practices, built in

Opinionated recipes for how serious teams ship — 14 workflows and 14 methodologies, ready on day one.

Anatomy of a run

  1. Plan

    Scope the change first

  2. Implement

    Write the code in an isolated branch

  3. Verify · gate

    Tests must pass to continue

  4. Document · gate

    ADRs, API docs & changelog

  5. Review

    An adversarial second pass

  6. Pull request

    Opened for a human to merge

Tests and serious docs aren’t an afterthought — they’re phases.
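
The anatomy above, with Verify and Document marked as gates, can be sketched as a pipeline that stops at the first failing gate. The `Phase` shape and the choice to let a failing non-gate phase continue are assumptions of this sketch, not documented Excalibur behaviour.

```typescript
// Hypothetical sketch of a gated phase pipeline; phase names follow the list above.
interface Phase {
  name: string;
  gate: boolean;      // a gate must succeed before later phases run
  run: () => boolean; // returns true on success
}

interface PhaseResult {
  name: string;
  ok: boolean;
}

interface RunOutcome {
  results: PhaseResult[];
  stoppedAt: string | null; // name of the gate that halted the run, if any
}

function runPhases(phases: Phase[]): RunOutcome {
  const results: PhaseResult[] = [];
  for (const phase of phases) {
    const ok = phase.run();
    results.push({ name: phase.name, ok });
    // Assumed rule: only a failing gate halts the run; other failures are recorded and the run continues.
    if (!ok && phase.gate) {
      return { results, stoppedAt: phase.name };
    }
  }
  return { results, stoppedAt: null };
}
```

In this model a failing Verify phase means Document, Review and the pull request are never attempted.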

01

Review First

Read and critique before touching a line.

02

Fast Fix

Small, obvious fixes — minimal ceremony.

03

Standard Feature

The everyday plan → build → test → ship.

04

Structured Feature

Bigger work with specs, gates and approvals.

05

Safe Refactor

Behaviour-preserving, with tests as the net.

06

Security First

Threat-aware build with a security gate.

07

Migration

Staged, reversible, data-aware changes.

08

Explore Alternatives

Several approaches in parallel — compare and pick.

09

Discovery

Decide what to build — or whether to build at all.

Use the defaults today. Customize every phase, gate and role in YAML tomorrow.

Browse the catalog
06 Safe by default

Powerful — never reckless

Delegate big work without fear. Nothing is modified, applied or pushed without your explicit approval — standard-safe is on from the first command.

excalibur — approval
run · standard-safe
proposes edit src/webhooks/verify.ts +24 −3
approve write to a sensitive path?
y approve · N reject · a always
# nothing changes until you say yes

One keystroke decides — approve, reject, or allow it always. Risky operations always wait for you.
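
The three-key prompt can be modelled as a small piece of state: "always" is remembered, everything else is asked again. This is a hedged sketch. The key bindings follow the prompt above, but `ApprovalGate`, the string operation kinds and the way "always" is stored are hypothetical, and the sketch does not model which risky operations are exempt from "always".

```typescript
// Hypothetical approval sketch; key bindings follow the prompt above, everything else is assumed.
type Key = "y" | "N" | "a";
type Decision = "approved" | "rejected";

class ApprovalGate {
  // Operation kinds the user has chosen to always allow.
  private readonly alwaysAllowed = new Set<string>();

  // `ask` stands in for the interactive prompt; it is only called when a decision is needed.
  decide(kind: string, ask: () => Key): Decision {
    if (this.alwaysAllowed.has(kind)) return "approved";
    const key = ask();
    if (key === "a") {
      this.alwaysAllowed.add(kind);
      return "approved";
    }
    return key === "y" ? "approved" : "rejected";
  }
}
```

Note that a plain `y` approves once only: the next operation of the same kind prompts again.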

Approval gates

Every write, command and push pauses for an explicit yes.

Sandboxed execution

Agents run in an isolated sandbox — no network access by default.

Secrets never leak

.env files and private keys are blocked — never read or sent.

Isolated branches

Work lands in dedicated branches — never your working tree.

Redacted prompts

Inputs are scrubbed of secrets before anything is stored.

Inspectable trail

Every action is logged as plain local files you can audit.
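
Blocking secret files and scrubbing prompts are both pattern checks at heart. The sketch below is illustrative only: the path rules and token patterns are examples chosen for this page, not Excalibur's actual block list or redaction rules.

```typescript
// Hypothetical sketch; the path rules and token patterns are illustrative, not Excalibur's actual lists.
const BLOCKED_FILE_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\.[^/]+)?$/, // .env, .env.local, ...
  /\.pem$/,                 // PEM-encoded keys and certificates
  /(^|\/)id_rsa$/,          // default SSH private key file name
];

function isBlockedPath(path: string): boolean {
  return BLOCKED_FILE_PATTERNS.some((re) => re.test(path));
}

const SECRET_PATTERNS: RegExp[] = [
  /\bAKIA[0-9A-Z]{16}\b/g,     // AWS access key ID
  /\bghp_[A-Za-z0-9]{36}\b/g,  // GitHub classic personal access token
];

function redact(text: string): string {
  // Replace every match of every pattern before the text is stored.
  return SECRET_PATTERNS.reduce(
    (acc, re) => acc.replace(re, "[REDACTED]"),
    text,
  );
}
```

Pattern lists like these only catch known formats, so they complement the file block and the approval gates rather than replace them.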

For teams & managers
07 Excalibur Enterprise

Move fast as a team. Stay in control.

The control plane for AI engineering — give every developer agents, with the visibility, policy and proof to run it safely across the whole org.

app.getexcalibur.dev

Workbench · Acme Corp

Preview

Spend · this mo

$1,284

Time saved

14 days

Active runs

7

Approvals

3

  • Implement contract renewal reminders · running
  • Migrate billing to the new ledger · awaiting approval
  • Add idempotency to webhook handler · completed
SSO · Audit log · Budgets · SIEM export

Preview of the Enterprise control plane (in development). The OSS excalibur serve dashboard ships locally today — see Core.

Why leadership says yes

  • See exactly what every agent did, what it cost, and what it returned.
  • Set the rules once — policies, budgets and approvals enforce themselves.
  • Prove every change with a signed, auditable trail your auditors will accept.

Visibility & ROI

Cost, time, usage and quality across the fleet — rolled up org → team → repo, with budgets and forecasts.

Governance & risk

A policy engine, model governance and SSO/SCIM, with an audit trail that exports to your SIEM.

Verifiable quality

Every run’s claims — tests, types, no secrets — roll up into a signed compliance pack.

Memory that compounds

Decisions and earned autonomy build up across the org — and survive turnover.

Coordinate humans + agents

Agentic-agile planning, a native kanban (Linear / Jira sync planned), and approvals from your phone.

Deploy your way

Hybrid or fully self-hosted runners — your code never leaves your infra. Air-gapped if you need it.

08 Core vs Enterprise

Great on its own. Built for scale.

Everything a developer needs — safe by default — is open-source. Teams add governance, compliance and control.

Capability | Core | Ent
Interactive m-shell + CLI | Included | Included
Any model, any provider (in-shell /models picker) | Included | Included
Autonomy levels (L0–L4, L4 default) | Included | Included
Built-in workflows & methodologies | Included | Included
Discovery: decide before you build | Included | Included
Plan-shaping: co-create the plan first | Included | Included
Time machine: rewind & fork runs | Included | Included
Self-sizing swarm + Explore (best-of-N) | Included | Included
Verified fan-in + live wave/DAG chronogram | Included | Included
Autonomous loop, background fleet & scheduler | Included | Included
Web access & research (fetch · search · crawl) | Included | Included
Context compaction | Included | Included
Memory that compounds | Local | Org
Local dashboard + fleet view | Included | Included
Local work items & kanban | Included | Included
IDE extension (VS Code · Cursor · Windsurf) | Included | Included
Custom agents (Markdown personas) | Included | Included
MCP · LSP auto-install · auto-format · SDK | Included | Included
Works with CLAUDE.md / AGENTS.md | Included | Included
Safe by default: approvals + secrets blocked | Included | Included
Sandboxed execution (no network by default) | Included | Included
Verifiable claims: tests · types · no secrets | Included | Included
Work-item sync (Linear · Jira · …) | Not included | Included
Agentic-agile (daily / weekly) | Not included | Included
SSO / SCIM · RBAC · multi-tenant | Not included | Included
Policy engine + budgets | Local | Org
Claim Ledger → Compliance Pack | Not included | Included
Insights & ROI (5 lenses) | Not included | Included
Audit & cost visibility | Local | Org
Hybrid / self-hosted runners | Not included | Included
Mobile approval / remote | Not included | Included
Support / SLA | Community | Included
For developers
09 Extensible by design

YAML defines how your team works. The SDK connects everything else.

Customize with simple YAML — or go deeper with the TypeScript SDK, MCP servers and LSP diagnostics.

YAML / Markdown · no code
.excalibur/workflows/secure-feature.yaml · yaml
name: Secure Feature
extends: structured-feature
phases:
  - plan
  - implement
  - security-review   # added gate
approvals: [sensitive-paths]
methodologies · workflows · question packs · prompts · safety presets · model routing
TypeScript SDK · code
excalibur.extension.ts · ts
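import { defineExtension } from '@excalibur-oss/extension-sdk'

// linearProvider, slackChannel, githubMcp and budgetGuard are placeholders:
// define or import them in your own project before using this file.
export default defineExtension({
  workItems:  [linearProvider],
  channels:   [slackChannel],
  mcpServers: [githubMcp],
  policies:   [budgetGuard],
})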
work-item providers · channels · MCP servers · model providers · agent adapters · tools · policy evaluators
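
To make the policy evaluator extension point more concrete, here is a sketch of what a budget policy such as the `budgetGuard` placeholder might compute. Everything in it is hypothetical: `PolicyContext`, `PolicyVerdict` and `makeBudgetGuard` are invented for illustration, and the real extension SDK may define policies quite differently.

```typescript
// Hypothetical sketch of a budget policy; these types are invented and are not the extension SDK's API.
interface PolicyContext {
  spentUsd: number;         // cost of the run so far
  estimatedNextUsd: number; // estimated cost of the next step
}

type PolicyVerdict =
  | { allow: true }
  | { allow: false; reason: string };

function makeBudgetGuard(limitUsd: number) {
  return (ctx: PolicyContext): PolicyVerdict => {
    const projected = ctx.spentUsd + ctx.estimatedNextUsd;
    if (projected > limitUsd) {
      return {
        allow: false,
        reason: `projected $${projected.toFixed(2)} exceeds budget $${limitUsd.toFixed(2)}`,
      };
    }
    return { allow: true };
  };
}

// A guard with an illustrative one-dollar ceiling per run.
const budgetGuardSketch = makeBudgetGuard(1.0);
```

Returning a reason alongside the refusal gives the approval prompt or audit trail something concrete to show.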

Plus first-class MCP servers, LSP diagnostics, custom agent personas, and an IDE extension for VS Code / Cursor / Windsurf — all built in.

SDK on npm · no glue code

Open-source · Enterprise-ready

Start locally.
Scale safely.

Start with Excalibur Core on your machine. Bring the workflows, policies and visibility your organization needs with Excalibur Enterprise.

Developers want power. Companies need safety. Excalibur gives both.