System Architecture

Quick Reference

Type: Multi-agent AI coding assistant (CLI + SDK + Web Platform)

Stack: TypeScript, Bun, React 19, Next.js, PostgreSQL, Drizzle ORM

Key Modules: CLI, SDK, Agent Runtime, Common, Internal, Web

Deployment: npm (CLI + SDK), Vercel (Web), Docker (DB)

Status: Production

Overview

Codebuff is an open-source AI coding assistant that coordinates specialized agents to understand codebases and make precise edits through natural language. Unlike single-model tools, Codebuff uses a multi-agent architecture where a primary orchestrator (Base2) delegates tasks to specialized agents (Editor, Reviewer, Thinker, etc.).

The system exposes three primary interfaces:

CLI: Interactive terminal UI for direct developer use
SDK: Programmatic API for embedding agent workflows in applications
Web Platform: Dashboard for account management, Agent Store, and billing

High-Level Architecture

graph TB
    subgraph UserLayer["User Layer"]
        DEV["👨‍💻 Developer"]
        APP["📱 Application"]
    end

    subgraph Interface["Interface Layer"]
        CLI["🖥️ CLI (Terminal UI)"]
        SDK_PUB["📦 SDK (npm)"]
        WEB["🌐 Web Dashboard"]
    end

    subgraph Core["Core Engine"]
        AGENT_RT["⚙️ Agent Runtime"]
        COMMON["📚 Common Library"]
        AGENTS["🤖 Agent Definitions"]
    end

    subgraph Services["Backend Services"]
        INTERNAL["🗄️ Internal Package"]
        BILLING["💳 Billing"]
        CODEMAP["🗺️ Code Map"]
    end

    subgraph External["External Services"]
        OPENROUTER["OpenRouter (LLMs)"]
        STRIPE["Stripe"]
        POSTHOG["PostHog"]
        GITHUB["GitHub OAuth"]
    end

    subgraph Data["Data Layer"]
        PG["PostgreSQL"]
        BQ["BigQuery"]
    end

    DEV --> CLI
    APP --> SDK_PUB
    DEV --> WEB

    CLI --> AGENT_RT
    SDK_PUB --> AGENT_RT
    WEB --> INTERNAL

    AGENT_RT --> AGENTS
    AGENT_RT --> COMMON
    AGENT_RT --> OPENROUTER

    INTERNAL --> PG
    INTERNAL --> STRIPE
    INTERNAL --> POSTHOG
    INTERNAL --> GITHUB

    BILLING --> STRIPE
    CODEMAP --> AGENT_RT

Core Components

| Component | Package | Description | Technology | Key Files |
| --- | --- | --- | --- | --- |
| CLI | `@codebuff/cli` | Terminal user interface | OpenTUI, React 19, Commander, Zustand | `cli/src/index.tsx`, `cli/src/chat.tsx` |
| SDK | `@codebuff/sdk` | Published npm package for programmatic use | Vercel AI SDK, tree-sitter, websockets | `sdk/src/client.ts`, `sdk/src/run.ts` |
| Agent Runtime | `@codebuff/agent-runtime` | Agent execution engine | Tool executor, LLM API, stream parsing | `packages/agent-runtime/src/run-agent-step.ts` |
| Common | `@codebuff/common` | Shared types, tools, utilities | Zod schemas, tool definitions | `common/src/tools/list.ts` |
| Internal | `@codebuff/internal` | Backend services | Drizzle ORM, NextAuth, OpenRouter | `packages/internal/src/db/schema.ts` |
| Web | `@codebuff/web` | Web dashboard | Next.js App Router, Tailwind | `web/src/app/` |
| Agents | `@codebuff/agents` | Agent definitions | TypeScript agent configs | `agents/base2/base2.ts` |
| Billing | `@codebuff/billing` | Payment processing | Stripe SDK | `packages/billing/` |

Agent Execution Flow

sequenceDiagram
    participant U as User
    participant CLI as CLI/SDK
    participant RT as Agent Runtime
    participant B2 as Base2 Agent
    participant LLM as LLM (OpenRouter)
    participant T as Tool Executor
    participant FS as File System

    U->>CLI: Natural language prompt
    CLI->>RT: Start agent run
    RT->>B2: Initialize with context

    loop Agent Steps
        B2->>LLM: Send prompt + context
        LLM-->>B2: Tool calls (streaming)
        B2->>T: Execute tool

        alt File Operation
            T->>FS: read/write/search
            FS-->>T: Results
        else Spawn Sub-Agent
            T->>RT: Spawn (Editor/Reviewer/etc.)
            RT-->>T: Agent result
        else Terminal Command
            T->>FS: Execute command
            FS-->>T: Output
        end

        T-->>B2: Tool result
    end

    B2-->>RT: Task completed
    RT-->>CLI: Final result
    CLI-->>U: Display changes
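
The loop in the sequence diagram can be sketched roughly as follows. This is a minimal illustration, not the runtime's actual code: the `ToolCall`, `Message`, `Model`, `ToolExecutor`, and `runAgent` names and the `maxSteps` guard are all hypothetical, and the model is replaced by a synchronous stub, whereas the real runtime streams tool calls from the LLM.

```typescript
// Hypothetical types for illustration; the real runtime's types may differ.
type ToolCall =
  | { tool: "read_file"; path: string }
  | { tool: "spawn_agent"; agent: string; prompt: string }
  | { tool: "end_turn"; summary: string };

type Message = { role: "user" | "assistant" | "tool"; content: string };

// Stand-in for the LLM: given the conversation so far, pick the next tool call.
type Model = (history: Message[]) => ToolCall;

// Stand-in for the tool executor: run one tool call and return its result as text.
type ToolExecutor = (call: ToolCall) => string;

function runAgent(
  prompt: string,
  model: Model,
  executeTool: ToolExecutor,
  maxSteps = 10, // assumed safety limit so a misbehaving model cannot loop forever
): { summary: string; history: Message[] } {
  const history: Message[] = [{ role: "user", content: prompt }];

  for (let step = 0; step < maxSteps; step++) {
    const call = model(history);
    history.push({ role: "assistant", content: JSON.stringify(call) });

    if (call.tool === "end_turn") {
      return { summary: call.summary, history };
    }

    // File operations, sub-agent spawns, and terminal commands all go through
    // the same executor; the result is fed back to the model on the next step.
    const result = executeTool(call);
    history.push({ role: "tool", content: result });
  }

  return { summary: "stopped: step limit reached", history };
}

// Usage with scripted stubs: read a file, then finish.
const scripted: ToolCall[] = [
  { tool: "read_file", path: "README.md" },
  { tool: "end_turn", summary: "done" },
];
let cursor = 0;
const demo = runAgent(
  "Summarize the README",
  () => scripted[cursor++],
  (call) => (call.tool === "read_file" ? `contents of ${call.path}` : "ok"),
);
console.log(demo.summary, demo.history.length); // done 4
```

Feeding every tool result back into `history` is what lets the model decide its next step from the outcome of the previous one; the step limit is one simple way to bound a run.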

Architecture Decision Records (ADR)

| # | Decision | Context | Status |
| --- | --- | --- | --- |
| 1 | Multi-agent over single-agent | Better specialization, context management, and error recovery | Accepted |
| 2 | Bun as runtime | Faster startup, built-in TypeScript, test runner, package manager | Accepted |
| 3 | OpenRouter for LLM routing | Access any model (Claude, GPT, DeepSeek, Qwen) via single API | Accepted |
| 4 | OpenTUI for CLI rendering | React-based terminal rendering with Yoga layout | Accepted |
| 5 | Drizzle ORM over Prisma | Type-safe, lightweight, SQL-first approach | Accepted |
| 6 | tree-sitter for code parsing | Language-agnostic AST analysis in SDK | Accepted |
| 7 | Credit-based billing | Flexible usage tracking across tiers and organizations | Accepted |

ADR-001: Multi-Agent Architecture

Context: AI coding assistants typically use a single model for all tasks. That model must fit all task context into one context window, which limits how much of a codebase it can consider at once and reduces accuracy on specialized operations.

Decision: Implement a multi-agent system where a primary orchestrator (Base2) delegates to specialized agents (Editor, File Explorer, Reviewer, Thinker, Researcher).

Consequences:

✅ Better accuracy through agent specialization
✅ Independent context windows per agent
✅ Parallel execution possible
⚠️ More complex orchestration logic
⚠️ Higher total token usage per task
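
To illustrate the trade-off between independent context windows and parallel execution, here is a small sketch of an orchestrator fanning work out to sub-agents. The agent identifiers mirror the names in the decision above, but the `SubTask`, `SubResult`, `AgentRunner`, and `delegate` names, and the stub runner, are assumptions made for this example rather than the project's API.

```typescript
// Hypothetical shapes; agent names follow the ADR text, everything else is assumed.
type AgentName = "editor" | "file-explorer" | "reviewer" | "thinker" | "researcher";

interface SubTask {
  agent: AgentName;
  prompt: string;
}

interface SubResult {
  agent: AgentName;
  output: string;
  contextSize: number; // number of messages in this agent's private context
}

// Stand-in for running one specialized agent; a real runner would call an LLM.
type AgentRunner = (agent: AgentName, context: string[]) => Promise<string>;

async function delegate(tasks: SubTask[], run: AgentRunner): Promise<SubResult[]> {
  // Each sub-agent gets a fresh context array, so nothing one agent reads
  // consumes another agent's context window.
  const runs = tasks.map(async (task) => {
    const context = [task.prompt];
    const output = await run(task.agent, context);
    return { agent: task.agent, output, contextSize: context.length };
  });

  // The map above has already started every run; Promise.all waits for all of
  // them and returns the results in input order.
  return Promise.all(runs);
}

// Usage with a stub runner that echoes the agent name and its prompt.
const stubRunner: AgentRunner = async (agent, context) => `${agent} handled: ${context[0]}`;

const delegated = delegate(
  [
    { agent: "file-explorer", prompt: "find the auth module" },
    { agent: "reviewer", prompt: "review the last diff" },
  ],
  stubRunner,
);
delegated.then((results) => console.log(results.map((r) => r.output)));
```

Because every sub-agent starts from its own context, shared background has to be repeated in each prompt, which is one source of the higher total token usage noted above.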

ADR-002: Bun Runtime

Context: The project needed a fast TypeScript runtime with built-in tooling to reduce dependency on external build tools.

Decision: Use Bun as the primary runtime, package manager, test runner, and bundler.

Consequences:

✅ 30-40x faster package installs
✅ Built-in TypeScript execution (no compilation step)
✅ Native test runner (no Jest dependency for CLI/SDK)
⚠️ Ecosystem compatibility issues with some npm packages
⚠️ SDK must still support Node.js ≥18 for consumers

Security Model

| Layer | Mechanism | Details |
| --- | --- | --- |
| Authentication | NextAuth + GitHub OAuth | Web sessions, PATs, CLI tokens |
| API Keys | Encrypted at rest | `encrypted_api_keys` table with per-user encryption |
| Session Types | Typed sessions | `web`, `pat`, `cli`, each with a different trust level |
| Organization | RBAC | `owner`, `admin`, `member` roles per org |
| Rate Limiting | Credit-based | Per-block and weekly limits with configurable overrides |
| Fingerprinting | Device tracking | Session-fingerprint binding for abuse prevention |
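
As an illustration of the organization RBAC row, a role check can be expressed as a ranked comparison. The three role names come from the table; the action names and the permission matrix below are hypothetical and only show the shape such a check might take.

```typescript
// Role names come from the table above; the action names and the
// permission matrix are assumptions made for this example.
type OrgRole = "owner" | "admin" | "member";
type OrgAction = "view_usage" | "invite_member" | "manage_billing" | "delete_org";

// Higher rank means more privileges.
const ROLE_RANK: Record<OrgRole, number> = { member: 0, admin: 1, owner: 2 };

// Minimum role required for each (hypothetical) action.
const REQUIRED_ROLE: Record<OrgAction, OrgRole> = {
  view_usage: "member",
  invite_member: "admin",
  manage_billing: "admin",
  delete_org: "owner",
};

// A role may perform an action if it ranks at least as high as the required role.
function canPerform(role: OrgRole, action: OrgAction): boolean {
  return ROLE_RANK[role] >= ROLE_RANK[REQUIRED_ROLE[action]];
}

console.log(canPerform("admin", "invite_member")); // true
console.log(canPerform("member", "manage_billing")); // false
```

Typing the matrix as `Record<OrgAction, OrgRole>` means the compiler rejects a newly added action that has no required role, which keeps the check exhaustive.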

Scalability & Performance

| Aspect | Strategy | Details |
| --- | --- | --- |
| LLM Calls | OpenRouter routing | Automatic model fallback and load balancing |
| Database | Connection pooling | `pg` driver with Drizzle ORM |
| Context Management | Context pruning | `context-pruner.ts` (~37KB) manages token window |
| Agent Hierarchy | Ancestor tracking | `ancestor_run_ids` array for tracing agent trees |
| Streaming | Server-sent events | Real-time token streaming from LLM to CLI |
| Analytics | BigQuery pipeline | Async event logging for analytics |
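
The context management row can be made concrete with a small pruning sketch. The source only names `context-pruner.ts`, so the strategy here (keep system messages, then keep the newest messages that fit a token budget) is an assumption for illustration and not that file's algorithm. The `ChatMessage`, `estimateTokens`, and `pruneContext` names and the four-characters-per-token estimate are likewise hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Rough token estimate (about 4 characters per token); an assumption for the
// example, not a real tokenizer.
function estimateTokens(message: ChatMessage): number {
  return Math.ceil(message.content.length / 4);
}

// Keep all system messages, then keep the most recent other messages that fit
// in the budget. System messages are placed first in the result.
function pruneContext(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  let used = system.reduce((sum, m) => sum + estimateTokens(m), 0);

  const kept: ChatMessage[] = [];
  // Walk backwards so the newest messages are considered first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === "system") continue;
    const cost = estimateTokens(m);
    if (used + cost > maxTokens) break; // this message and everything older is dropped
    used += cost;
    kept.unshift(m); // restore chronological order
  }
  return [...system, ...kept];
}

// Usage: the long early user message no longer fits a 30-token budget.
const pruned = pruneContext(
  [
    { role: "system", content: "You are a coding agent." },
    { role: "user", content: "a".repeat(400) },
    { role: "assistant", content: "b".repeat(40) },
    { role: "user", content: "c".repeat(40) },
  ],
  30,
);
console.log(pruned.map((m) => m.role)); // [ 'system', 'assistant', 'user' ]
```

Dropping the oldest messages first is the simplest policy; a production pruner would more likely summarize or selectively truncate large tool results instead.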

Related Pages

Codebase Analysis: Full project scan
Database: Schema documentation
Deployment: Setup guide
Data Flow: Data flow diagrams
