OSS · Excalibur Core is open-source · Apache-2.0

The AI coding agent for product engineers.

Most agents write code. Excalibur knows the whole product cycle — discover, build, verify, ship.

Apache-2.0

Get started · I'm a manager

excalibur — m-shell

› Add rate limiting to the public API
L4 · Full agentic · kimi-k2.7-code
✓ Plan · 4 steps
│
⠋ Implement · api/limiter.ts
│ └ edit api/limiter.ts +38 −4
│ └ ▸ pnpm test api · running…
│
○ Verify
│
○ Document
│
○ Review
│
○ Pull Request
████░░░░ 1m12s · $0.42 · standard-safe · no push
›

With Excalibur you can:

Time-travel your work · Discover what to build · Run it from a local board

01 · What you can do

Excalibur enables you to build like never before.

Five capabilities built right into the engine — the moats that hold up against any other AI coding tool.

the immutable event log

Time-travel your work

Every run is an immutable, append-only event log — so you scrub it like a video and branch a new run from any step. The stream IS the run: it renders the live view, the replay, the dashboard and the audit trail, byte-identical. Nobody else is built this way.

Fork from cache: the good prefix replays for free; only what changed re-runs
Live rewind mid-session (Esc-Esc)
Live = replay = dashboard = audit — one source of truth
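As a rough illustration of the idea above, here is a minimal sketch of an append-only log that can be forked from any step. The `RunEvent` shape, the `EventLog` class and its method names are assumptions made for this example; the source does not show Excalibur's actual log format or API.

```typescript
// Hypothetical event shape, invented for this sketch.
interface RunEvent {
  readonly step: number;
  readonly phase: string;
  readonly detail: string;
}

class EventLog {
  private readonly events: readonly RunEvent[];

  constructor(events: readonly RunEvent[] = []) {
    // Copy and freeze so that no caller can mutate recorded history.
    this.events = Object.freeze(events.map((e) => Object.freeze({ ...e })));
  }

  // Appending never edits the existing log; it returns a new, longer one.
  append(phase: string, detail: string): EventLog {
    const next: RunEvent = { step: this.events.length + 1, phase, detail };
    return new EventLog([...this.events, next]);
  }

  // Fork: keep the first `step` events and drop everything after them.
  forkAt(step: number): EventLog {
    if (!Number.isInteger(step) || step < 0 || step > this.events.length) {
      throw new RangeError(`step ${step} is outside 0..${this.events.length}`);
    }
    return new EventLog(this.events.slice(0, step));
  }

  // Live view, replay and audit would all read the same events.
  replay(): readonly RunEvent[] {
    return this.events;
  }
}

const original = new EventLog()
  .append("plan", "4 steps")
  .append("implement", "edit api/limiter.ts")
  .append("test", "tests fail");

// Branch from step 2: the plan and implement events are reused as-is,
// and only the test step is recorded again.
const fork = original.forkAt(2).append("test", "tests pass");
```

Because `append` and `forkAt` return new logs, the original run stays intact after a fork, which is the property that makes one stream usable as both replay and audit trail.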

replay · run_4f2a

step 06 / 14 · Implement · +38 −4
fork from here ⑂
plan · implement · tests fail · review

live = replay = dashboard = audit · one immutable stream

the gate no one else has

Discover what to build

Before a line of code, Discovery weighs scope, evidence and risk — and recommends build, validate, or don’t-build.

Product judgment, not just code generation
Turns an idea into a scoped work item
Kills bad work before it costs you
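To make the build / validate / don't-build decision concrete, here is a small sketch of how scores might map to a recommendation. The score fields, the 0 to 1 scale, the thresholds and the `build` and `dont_build` labels are all assumptions for illustration; only `needs_validation` appears in the source, and Excalibur's real rules are not documented here.

```typescript
// Hypothetical inputs, each in the range 0..1.
interface DiscoveryScores {
  clarity: number;
  evidence: number;
  scope: number;
  risk: number;
}

type Recommendation = "build" | "needs_validation" | "dont_build";

// Illustrative thresholds only, not Excalibur's actual logic.
function recommend(s: DiscoveryScores): Recommendation {
  // Very risky or very unclear ideas are stopped outright.
  if (s.risk > 0.8 || s.clarity < 0.3) return "dont_build";
  // Thin evidence or loose scope means validating before building.
  if (s.evidence < 0.5 || s.scope < 0.5) return "needs_validation";
  return "build";
}
```

For example, an idea with good clarity but thin evidence, such as `recommend({ clarity: 0.8, evidence: 0.3, scope: 0.7, risk: 0.4 })`, comes back as `needs_validation` under these made-up thresholds.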

discovery · add AI invoice summaries

clarity · evidence · scope · risk · readiness

recommendation: needs_validation

→ thin evidence — validate with 3 design partners first · work item WI-142

local · no SaaS

Your repo on a board

“excalibur serve” opens a task-first web dashboard — embedded in the CLI, one local process, no account. It’s that same event log, rendered for the browser. Terminal-only tools don’t have this.

Kanban work items → a run’s live checklist, patches, PRs and plans
Cost insights + a live swarm chronogram you can pause and replay
--write drives runs from the browser; --share mints a read-only link

dashboard · localhost:7878

backlog: WI-140 · webhook retries
running: WI-142 · invoice AI ◐
done: WI-138 · auth ✓

$2.40 today · 3 runs · live chronogram ▸▸▹ · share ↗ read-only

no lock-in

Any model, any provider

Excalibur isn’t wired to one vendor: it works with OpenAI-compatible endpoints (incl. Azure), Anthropic, Ollama and more. Pick a good + fast model pair with a single key, and switch anytime.

A frontier model for the hard parts, a cheap one for ghost-text — automatically
Bring your own key — it lives in an env file, never the repo
Switch model mid-session with /models
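The good + fast pairing described above can be pictured as a simple routing rule. This is a sketch under assumptions: the `TaskKind` values, the `ModelPair` type and the `pickModel` function are hypothetical, and the real routing may weigh more than the task type. The model names are taken from the /models panel on this page.

```typescript
// Hypothetical task categories for this sketch.
type TaskKind = "plan" | "implement" | "review" | "ghost-text";

interface ModelPair {
  good: string; // stronger model for the hard parts
  fast: string; // cheaper, quicker model for lightweight completions
}

// Assumption: only ghost-text is routed to the fast model.
function pickModel(pair: ModelPair, task: TaskKind): string {
  return task === "ghost-text" ? pair.fast : pair.good;
}

const pair: ModelPair = { good: "claude-opus-4", fast: "llama-3.3-70b" };
```

With this rule, `pickModel(pair, "ghost-text")` selects the fast model and every other task kind selects the good one.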

/models

● claude-opus-4 · good · default
○ gpt-4o
○ llama-3.3-70b · fast · paired
○ ollama/qwen · local

good + fast from one key · no vendor lock-in

not just the diff

The whole product cycle

Most agents write code. Excalibur owns the bookends too — deciding what to build up front, then proving and auditing what shipped at the end.

A big task auto-sizes into a swarm of agents in isolated worktrees
An adversarial verification mesh + typed claims gate the finish
Tests and serious docs are phases, not afterthoughts

workflow · standard-feature

Discover → Plan → Build → Test ⛬ → Docs ⛬ → Review → Ship → Audit

→ sized to 3 agents — independent modules, isolated worktrees

✓ verification mesh · tests_passed · type_safe · no_secrets

All five — free, open-source, one npm install away.

For developers

02 · The whole cycle

Not just code. The whole product cycle.

Chatbots autocomplete. Coding agents build and ship. Excalibur runs the entire cycle — deciding what to build up front, and auditing what shipped at the end — and it’s fully open source.

Discover · Plan · Build · Test · Document · Review · Ship · Audit

Legend: Excalibur · Coding agents

Coding agents take you from plan to ship. Excalibur owns the bookends too — deciding what’s worth building, and auditing what shipped.

Decide before you build

Discovery clarifies scope first — and can recommend not building it. Most tools start at the code.

Quality gates, built in

Every run plans, implements, tests and documents the change — tests and serious docs are phases, not afterthoughts.

Ship, review & audit

Pull requests, approvals and a full audit trail — not just a diff in your editor.

Every step configurable · Every action traceable · Every risky op approved

For developers

03 · Quickstart

Value in minutes, not migrations

Install, run it in any repo, and start building — the first run sets itself up; then an agent plans, implements, tests and documents real changes in an isolated branch.

01 · Install · one binary, local-first
02 · Run excalibur · first run detects your stack & connects a model
03 · Describe a task · an agent plans, builds, tests & ships it

Apache-2.0

View on GitHub · Read the docs

excalibur — m-shell

$ npm install -g @excalibur-oss/excalibur
✓ excalibur ready
$ excalibur
# first run: setting up here
✓ detected TypeScript · pnpm · CLAUDE.md
→ model: Groq · key from $GROQ_API_KEY (never stored)
⚑ standard-safe · approvals on · secrets blocked
› add pagination to the orders endpoint
plan → implement → tests → docs → review
✓ 12 tests pass · docs updated
→ opened PR #128 · ready to review
›

For developers

04 · What Excalibur does

Dial the autonomy. The system does the rest.

Express intent — Excalibur picks the workflow and sizes the work. Dial it per task, from a quick answer to a fully agentic run.

less autonomy → more autonomy

L0 · Review · review
L1 · Assist · ask
L2 · Propose patch · patch
L3 · Implement in branch · run
L4 · Full agentic · run --careful

Eleven ways to put agents to work

01 · run · Run
An agent builds the whole change in an isolated branch.

02 · run --careful · Plan
A ticket becomes a plan the agent runs once you approve.

03 · discovery · Discover
An agent clarifies scope first; it can say don’t build it.

04 · swarm · Orchestrate
Fan a task out to parallel agents, verified against your tests before the merge.

05 · review · Review
Review the agent’s changes before they ship.

06 · patch · Patch
Get a small, reviewable patch from an agent.

07 · ask · Ask
Understand any codebase before you build.

08 · research · Connect
Agents fetch and search the live web, governed and with citations.

09 · serve · Dashboard
A local web board: kanban, runs, cost charts, live orchestration.

10 · schedule · Schedule
Run agent tasks on a cadence: every N, or daily at a time.

11 · .excalibur/ · Trace
Every agent run kept auditable, as local files.

+ extend · Build your own
Add workflows and integrations with YAML or the TypeScript SDK.

Explore

05 · Built-in workflows

Best practices, built in

Opinionated recipes for how serious teams ship — 14 workflows and 14 methodologies, ready on day one.

Anatomy of a run

Plan: Scope the change first
Implement: Write the code in an isolated branch
Verify (gate): Tests must pass to continue
Document (gate): ADRs, API docs & changelog
Review: An adversarial second pass
Pull request: Opened for a human to merge

Tests and serious docs aren’t an afterthought — they’re phases.
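The phase-and-gate structure above could be modelled roughly as follows. This is a sketch, not Excalibur's engine: the `Phase` type, `runPhases` and `standardRun` are names invented for the example, and the gate checks are reduced to plain booleans.

```typescript
interface Phase {
  name: string;
  // Present only on gate phases; returning false halts the run.
  gate?: () => boolean;
}

interface RunResult {
  completed: string[];
  stoppedAt: string | null;
}

// Walks the phases in order and stops at the first gate that fails.
function runPhases(phases: Phase[]): RunResult {
  const completed: string[] = [];
  for (const phase of phases) {
    if (phase.gate && !phase.gate()) {
      return { completed, stoppedAt: phase.name };
    }
    completed.push(phase.name);
  }
  return { completed, stoppedAt: null };
}

// The six phases listed above, with Verify and Document as gates.
function standardRun(testsPass: boolean, docsWritten: boolean): Phase[] {
  return [
    { name: "plan" },
    { name: "implement" },
    { name: "verify", gate: () => testsPass },
    { name: "document", gate: () => docsWritten },
    { name: "review" },
    { name: "pull-request" },
  ];
}
```

Calling `runPhases(standardRun(false, true))` stops at `verify` with only `plan` and `implement` completed, so a failing test suite never reaches review or a pull request.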

Review First

Read and critique before touching a line.

Fast Fix

Small, obvious fixes — minimal ceremony.

Standard Feature

The everyday plan → build → test → ship.

Structured Feature

Bigger work with specs, gates and approvals.

Safe Refactor

Behaviour-preserving, with tests as the net.

Security First

Threat-aware build with a security gate.

Migration

Staged, reversible, data-aware changes.

Explore Alternatives

Several approaches in parallel — compare and pick.

Discovery

Decide what to build — or whether to build at all.

Use the defaults today. Customize every phase, gate and role in YAML tomorrow.

Browse the catalog

06 · Safe by default

Powerful — never reckless

Delegate big work without fear. Nothing is modified, applied or pushed without your explicit approval — standard-safe is on from the first command.

excalibur — approval

▌ run · standard-safe
proposes edit src/webhooks/verify.ts +24 −3
⚑ approve write to a sensitive path?
y approve · N reject · a always
# nothing changes until you say yes
›

One keystroke decides — approve, reject, or allow it always. Risky operations always wait for you.
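The approve / reject / always prompt can be sketched as a small gate that remembers "always" answers. Everything here is illustrative: `parseKey`, `ApprovalGate` and the `ask` callback are hypothetical names, and treating any unrecognised key as a rejection is an assumption based on the capital N in the prompt.

```typescript
type Decision = "approve" | "reject" | "always";

// Maps one keystroke to a decision.
// Assumption: reject is the default, so any other key rejects.
function parseKey(key: string): Decision {
  switch (key.toLowerCase()) {
    case "y":
      return "approve";
    case "a":
      return "always";
    default:
      return "reject";
  }
}

class ApprovalGate {
  private readonly alwaysAllowed = new Set<string>();

  // `ask` stands in for reading a keystroke from the terminal.
  check(operation: string, ask: (operation: string) => string): boolean {
    // Operations previously answered with "a" skip the prompt.
    if (this.alwaysAllowed.has(operation)) return true;
    const decision = parseKey(ask(operation));
    if (decision === "always") this.alwaysAllowed.add(operation);
    return decision !== "reject";
  }
}
```

A caller would wrap each risky operation in `check` and proceed only when it returns `true`, so nothing changes until an explicit yes has been given.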

Approval gates

Every write, command and push pauses for an explicit yes.

Sandboxed execution

Agents run in an isolated sandbox — no network access by default.

Secrets never leak

.env files and private keys are blocked — never read or sent.

Isolated branches

Work lands in dedicated branches — never your working tree.

Redacted prompts

Inputs are scrubbed of secrets before anything is stored.

Inspectable trail

Every action is logged as plain local files you can audit.

For teams & managers

07 · Excalibur Enterprise

Move fast as a team. Stay in control.

The control plane for AI engineering — give every developer agents, with the visibility, policy and proof to run it safely across the whole org.

app.getexcalibur.dev

Workbench · Acme Corp

Preview

Spend · this mo: $1,284
Time saved: 14 days
Active runs
Approvals

Implement contract renewal reminders · running
Migrate billing to the new ledger · awaiting approval
Add idempotency to webhook handler · completed

SSO · Audit log · Budgets · SIEM export

Preview of the Enterprise control plane (in development). The OSS excalibur serve dashboard ships locally today — see Core.

Why leadership says yes

See exactly what every agent did, what it cost, and what it returned.
Set the rules once — policies, budgets and approvals enforce themselves.
Prove every change with a signed, auditable trail your auditors will accept.

Visibility & ROI

Cost, time, usage and quality across the fleet — rolled up org → team → repo, with budgets and forecasts.

Governance & risk

A policy engine, model governance and SSO/SCIM, with an audit trail that exports to your SIEM.

Verifiable quality

Every run’s claims — tests, types, no secrets — roll up into a signed compliance pack.

Memory that compounds

Decisions and earned autonomy build up across the org — and survive turnover.

Coordinate humans + agents

Agentic-agile planning, a native kanban (Linear / Jira sync planned), and approvals from your phone.

Deploy your way

Hybrid or fully self-hosted runners — your code never leaves your infra. Air-gapped if you need it.

Explore Enterprise · Book a demo

08 · Core vs Enterprise

Great on its own. Built for scale.

Everything a developer needs — safe by default — is open-source. Teams add governance, compliance and control.

Capability	Core	Enterprise
Interactive m-shell + CLI	Included	Included
Any model, any provider (in-shell /models picker)	Included	Included
Autonomy levels (L0–L4, L4 default)	Included	Included
Built-in workflows & methodologies	Included	Included
Discovery — decide before you build	Included	Included
Plan-shaping — co-create the plan first	Included	Included
Time machine — rewind & fork runs	Included	Included
Self-sizing swarm + Explore (best-of-N)	Included	Included
Verified fan-in + live wave/DAG chronogram	Included	Included
Autonomous loop, background fleet & scheduler	Included	Included
Web access & research (fetch · search · crawl)	Included	Included
Context compaction	Included	Included
Memory that compounds	Local	Org
Local dashboard + fleet view	Included	Included
Local work items & kanban	Included	Included
IDE extension (VS Code · Cursor · Windsurf)	Included	Included
Custom agents (Markdown personas)	Included	Included
MCP · LSP auto-install · auto-format · SDK	Included	Included
Works with CLAUDE.md / AGENTS.md	Included	Included
Safe by default — approvals + secrets blocked	Included	Included
Sandboxed execution (no network by default)	Included	Included
Verifiable claims — tests · types · no secrets	Included	Included
Work-item sync (Linear · Jira · …)	Not included	Included
Agentic-agile (daily / weekly)	Not included	Included
SSO / SCIM · RBAC · multi-tenant	Not included	Included
Policy engine + budgets	Local	Org
Claim Ledger → Compliance Pack	Not included	Included
Insights & ROI (5 lenses)	Not included	Included
Audit & cost visibility	Local	Org
Hybrid / self-hosted runners	Not included	Included
Mobile approval / remote	Not included	Included
Support / SLA	Community	Included

Get started · Talk to us about Enterprise

For developers

09 · Extensible by design

YAML defines how your team works. The SDK connects everything else.

Customize with simple YAML — or go deeper with the TypeScript SDK, MCP servers and LSP diagnostics.

YAML / Markdown · no code

.excalibur/workflows/secure-feature.yaml · yaml

name: Secure Feature
extends: structured-feature
phases:
  - plan
  - implement
  - security-review   # added gate
approvals: [sensitive-paths]

methodologies · workflows · question packs · prompts · safety presets · model routing

TypeScript SDK · code

excalibur.extension.ts · ts

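import { defineExtension } from '@excalibur-oss/extension-sdk'

// linearProvider, slackChannel, githubMcp and budgetGuard are placeholders here;
// they are assumed to be defined or imported elsewhere in your project.
export default defineExtension({
  workItems:  [linearProvider],   // work-item providers
  channels:   [slackChannel],     // channels
  mcpServers: [githubMcp],        // MCP servers
  policies:   [budgetGuard],      // policy evaluators
})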

work-item providers · channels · MCP servers · model providers · agent adapters · tools · policy evaluators

Plus first-class MCP servers, LSP diagnostics, custom agent personas, and an IDE extension for VS Code / Cursor / Windsurf — all built in.

SDK on npm · no glue code

Open-source · Enterprise-ready

Start locally.
Scale safely.

Start with Excalibur Core on your machine. Bring the workflows, policies and visibility your organization needs with Excalibur Enterprise.

Get started · GitHub · Book a demo

Developers want power. Companies need safety. Excalibur gives both.
