
Porting Python's Spec-Kit to TypeScript: Using SDD to Build SDD

We ported spec-kit from Python to TypeScript using spec-kit itself to manage the process. 247 tests. Zero runtime dependencies. Seven specs, shipped.


I – The Problem: Python in Enterprise

Spec-Kit is a CLI tool for Spec-Driven Development — the workflow where every feature starts as a specification before any code is written. It's the backbone of how AI coding agents like Copilot, Claude, Gemini, and OpenCode can build features systematically.

The Python version works beautifully. Typer for the CLI, Rich for terminal styling, YAML frontmatter for command templates. Install with uv tool install specify-cli, run specify init my-project --ai copilot, and you're writing specs in minutes.

But enterprise environments don't have Python.

At many large organizations, developer workstations come locked down. Node.js is approved; Python is not. This isn't an edge case — locked-down toolchains like this are common across Fortune 500 engineering teams. And when you can't install Python, you can't use spec-kit.

The obvious solution: port it to TypeScript. Zero dependencies. Ship a single binary with Bun. Suddenly spec-kit works everywhere Node.js works, and with Bun's --compile flag producing standalone binaries, even where no runtime is installed at all.


II – Using SDD to Build SDD

Here's the recursive beauty of this project: we're using the Python version of spec-kit to manage the TypeScript port of spec-kit.

# Initialize the project with spec-kit (Python)
cd spec-kit
specify init . --ai opencode --force

This created our .specify/ directory with templates, scripts, and the memory folder where our constitution lives. Then we wrote the constitution — the unchanging principles that guide every decision:

## Principles

1. **Zero Runtime Dependencies** - No npm packages except devDependencies
2. **Multi-Runtime Compatibility** - Node.js 18+, Bun, Deno
3. **TypeScript-First** - Full type safety, no `any` types
4. **Oakoliver CLI Libraries** - @oakoliver/bubbletea, lipgloss, huh, glamour
5. **Idiomatic TypeScript API** - camelCase, classes, fluent patterns

The mission statement captures why this matters:

Enable Spec-Driven Development in enterprise environments where Python installation is restricted.

Now every spec, every implementation decision, every PR filters through these principles. When I'm tempted to add a YAML parsing dependency, the constitution says no — implement it from scratch or use JSON.


III – Breaking Down the Monolith

The initial instinct was to write one massive spec covering everything: init command, check command, extension management, preset management, agent registration, template resolution. A 200-line specification with eight user stories and forty functional requirements.

This is the wrong approach for two reasons.

First, it's not implementable in parallel. A single enormous spec means a single enormous branch. No way to ship incremental value. No way to test pieces in isolation.

Second, it's not how spec-kit is designed to work. The whole point of SDD is decomposition — break features into specs that can be reasoned about independently. The Python spec-kit documentation itself says: "Each spec should be implementable in one feature branch, testable independently, and deliverable without blocking other work."

So we refactored. Seven specs instead of one:

| Spec | Branch | Description |
|------|--------|-------------|
| 001 | 001-project-overview | Architecture overview (not a feature) |
| 002 | 002-core-types | TypeScript types, agent configs, utilities |
| 003 | 003-init-command | The specify init command |
| 004 | 004-check-command | The specify check command |
| 005 | 005-agent-system | Command registration for 20+ agents |
| 006 | 006-extensions | Extension management (add/remove/list) |
| 007 | 007-presets | Preset management (templates/overrides) |

Each spec has clear dependencies. 002-core-types depends on nothing — it's the foundation. 003-init-command depends on 002. Extensions and presets depend on both the core types and the agent system. This is the dependency graph that enables parallel work.

flowchart TD
    S002["002 – Core Types"]
    S003["003 – Init Command"]
    S004["004 – Check Command"]
    S005["005 – Agent System"]
    S006["006 – Extensions"]
    S007["007 – Presets"]

    S002 --> S003
    S002 --> S004
    S002 --> S005
    S005 --> S006
    S005 --> S007
    S002 --> S006
    S002 --> S007

IV – What Python Had, What Bun Gives Us

Typer → Bun + @oakoliver/huh

Python's Typer is magical. Decorators, automatic help text, type coercion, nested command groups. It's the best CLI framework I've used.

But we're going zero-dependency. So instead of reaching for Commander.js or Yargs, we build on what we've already ported: @oakoliver/bubbletea for the event loop, @oakoliver/huh for interactive prompts, @oakoliver/lipgloss for styling.

The init command becomes a TUI application:

import { Form, Select } from '@oakoliver/huh';

// Interactive agent selection when --ai is not provided
const agentField = new Select()
  .title('AI Assistant')
  .options(SUPPORTED_AGENTS.map(a => ({ label: a, value: a })))
  .filtering(true);

const form = new Form(agentField);

This is actually better than Typer's approach. When you run specify init my-project without --ai, you get an interactive fuzzy-filterable list of agents — not a help message telling you what flags to pass.

Rich → @oakoliver/lipgloss + glamour

Rich gives you styled output, progress bars, tables, markdown rendering. We have all of this:

import { Style } from '@oakoliver/lipgloss';
import { Render } from '@oakoliver/glamour';

const success = new Style()
  .foreground('10')  // green
  .bold(true);

const banner = Render(bannerMarkdown, { theme: 'dark' });
console.log(banner);
console.log(success.render('✓ Project initialized'));

httpx → Bun.fetch

Python uses httpx for downloading templates from GitHub. Bun's fetch is built-in and identical to the web standard:

// No dependency needed
const response = await fetch(templateUrl);
const buffer = await response.arrayBuffer();

YAML → JSON

The Python version stores configuration in YAML. We're using JSON exclusively. Not because JSON is better, but because JSON.parse() is built into every JavaScript runtime. YAML parsing requires either a dependency or a custom parser.

The init-options.json format is actually cleaner:

{
  "ai": "opencode",
  "script": "sh",
  "branchNumbering": "sequential",
  "aiSkills": false
}
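Loading that file with graceful fallbacks needs nothing beyond JSON.parse. A minimal sketch — the helper name and default values here are illustrative, not the shipped implementation:

```typescript
import { readFileSync } from 'node:fs';

interface InitOptions {
  ai: string;
  script: string;
  branchNumbering: 'sequential' | 'timestamp';
  aiSkills: boolean;
}

// Illustrative defaults; the real CLI may choose differently.
const DEFAULTS: InitOptions = {
  ai: 'copilot',
  script: 'sh',
  branchNumbering: 'sequential',
  aiSkills: false,
};

// Merge whatever is on disk over the defaults; a missing or corrupted
// file silently falls back to the defaults instead of crashing init.
export function loadInitOptions(path: string): InitOptions {
  try {
    return { ...DEFAULTS, ...JSON.parse(readFileSync(path, 'utf-8')) };
  } catch {
    return { ...DEFAULTS };
  }
}
```

The spread-over-defaults pattern also gives you default-value merging for free: a partial file only overrides the keys it actually contains.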

pathspec → Native glob

Python uses pathspec for gitignore-style matching. Bun has Bun.Glob built-in:

const glob = new Bun.Glob('**/*.md');
for await (const file of glob.scan('./specs')) {
  // Process file
}

V – Performance: The Bun Advantage

Side-by-side, same machine (Apple M2 Max), same commands:

| Operation | Python (v0.4.2) | TypeScript (Bun) | Speedup |
|-----------|-----------------|------------------|---------|
| specify version | ~570ms | 29ms | ~20x |
| specify init (full project) | ~205ms | 35ms | ~6x |
| specify check | ~200ms | 72ms | ~3x |
| Test suite | n/a (no bundled tests) | 247 tests in under 200ms | |
| Runtime dependencies | 24 packages | 0 | |

The version command is the most dramatic. Python spends the majority of that 570ms just importing its dependency tree — Rich, Typer, httpx, PyYAML, and 20 others — before a single line of application code runs. Bun loads the entire TypeScript CLI and prints output in 29ms.

For a CLI tool that runs frequently — once per feature, multiple times per day — this matters more than the numbers suggest. The psychological difference between "instant" and "half a second" is the difference between a tool you reach for naturally and one you avoid.


VI – The Specifications

Each spec brought its own challenges. Here's what we learned building them.

Spec 002: Core Types and Configuration

The foundation layer shipped first: all TypeScript types, agent configurations, and configuration file handling.

What We Built

Two modules totaling ~400 lines:

  • types.ts — All 23 agent configurations, type definitions for InitOptions, ExtensionManifest, PresetManifest, and utility functions
  • config.ts — Loading/saving configuration files with graceful fallbacks

The Agent Configuration Challenge

Python's spec-kit has a flat AGENT_CONFIGS dictionary. We needed the same data, but with proper TypeScript types:

export interface AgentConfig {
  dir: string;           // ".claude/commands"
  format: CommandFormat; // "markdown" | "toml"
  args: string;          // "$ARGUMENTS" or "{{args}}"
  extension: string;     // ".md", ".agent.md", "/SKILL.md", ".toml"
}

export const AGENT_CONFIGS: Record<string, AgentConfig> = {
  claude: { dir: '.claude/commands', format: 'markdown', args: '$ARGUMENTS', extension: '.md' },
  gemini: { dir: '.gemini/commands', format: 'toml', args: '{{args}}', extension: '.toml' },
  copilot: { dir: '.github/agents', format: 'markdown', args: '$ARGUMENTS', extension: '.agent.md' },
  codex: { dir: '.agents/skills', format: 'markdown', args: '$ARGUMENTS', extension: '/SKILL.md' },
  // ... 19 more agents
};

The tricky part: skill-based agents (Codex, Kimi) use directory-per-command structure. The /SKILL.md extension signals this:

export function getCommandFilePath(projectRoot: string, agent: string, commandName: string): string {
  const config = AGENT_CONFIGS[agent];
  
  // Skill-based agents: speckit.specify → speckit.specify/SKILL.md
  if (config.extension === '/SKILL.md') {
    return `${projectRoot}/${config.dir}/${commandName}/SKILL.md`;
  }
  
  // Normal agents: speckit.specify → speckit.specify.md
  return `${projectRoot}/${config.dir}/${commandName}${config.extension}`;
}

Test Coverage Matching Python

We matched Python's test_agent_config_consistency.py coverage:

  • Each of 23 agents tested individually for correct configuration
  • Legacy agents (q, amazonq) verified as removed
  • TOML agents verified to use {{args}} placeholder
  • Skill-based agents verified to use /SKILL.md extension
  • All agents verified to have directories starting with .

The init options tests match test_branch_numbering.py:

  • Sequential vs timestamp mode persistence
  • Round-trip JSON serialization
  • Graceful handling of corrupted/partial files
  • Default value merging

Bun Advantage: Built-in Test Runner

No Jest configuration. No test framework installation. Just:

bun test
# 80 pass, 318 expect() calls, 91ms

Python's pytest must be installed separately and is typically configured through pyproject.toml, with shared fixtures in conftest.py. Bun includes everything.


Spec 003: Init Command

The main specify init command — the heart of the CLI.

What We Built

Four modules totaling ~1,100 lines:

  • cli.ts — Entry point with manual argument parsing (zero-dep)
  • init.ts — Core init logic with interactive TUI
  • templates.ts — Template bundling and resolution
  • ui.ts — True color gradient banner with @oakoliver/lipgloss

The Interactive Agent Selection

When you run specify init my-project without --ai, you get an interactive fuzzy-searchable list:

import { NewSelect, NewOption, Run } from '@oakoliver/huh';

async function selectAgent(): Promise<string> {
  const options = SUPPORTED_AGENTS.map(agent => NewOption(agent, agent));
  const select = NewSelect('copilot')
    .title('AI Assistant')
    .description('Select the AI coding agent to configure')
    .options(options)
    .height(10);
  await Run(select);
  return select.getValue();
}

This is better than Python's approach — instead of printing "Error: Missing required option --ai", users get a beautiful interactive picker.

Template Bundling

Templates ship inside the npm package and are resolved at runtime:

import path from 'node:path';
import { fileURLToPath } from 'node:url';

function getTemplatesDir(): string {
  const thisFile = import.meta.url;
  const srcDir = path.dirname(fileURLToPath(thisFile));
  return path.join(srcDir, '..', 'templates');
}

The templates/ directory contains:

  • 9 command templates (speckit.*.md)
  • File templates (spec, plan, tasks, checklist, constitution)
  • Bash scripts (create-new-feature.sh, setup-plan.sh)

Constitution v1.1.0

While implementing init, we added Principle VI to the constitution:

1:1 Migration Fidelity — JSON output uses snake_case keys matching Python. Directory structures match exactly. Agent configs stay synchronized.

This principle caught several breaking changes before they shipped.
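The snake_case half of that principle is mechanical to enforce. A sketch of the kind of key converter it implies — a hypothetical helper, not the shipped code:

```typescript
// Rewrite top-level camelCase keys to snake_case so JSON written by
// the TypeScript port matches what the Python version produces.
export function toSnakeCaseKeys(
  obj: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    out[key.replace(/[A-Z]/g, c => `_${c.toLowerCase()}`)] = value;
  }
  return out;
}
```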


Spec 004: Check Command

The specify check command validates project structure and tool availability.

What It Checks

interface CheckResult {
  tools: { name: string; available: boolean; version?: string }[];
  structure: { item: string; exists: boolean; expected: string }[];
  overall: 'valid' | 'fixable' | 'invalid';
}

  • Tools: git, gh (GitHub CLI)
  • Structure: .specify/, init-options.json, templates/, scripts/, specs/, agent commands

The --fix flag auto-repairs missing directories and reinstalls templates from bundled copies.
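The directory half of that repair loop fits in a few lines. A sketch — the EXPECTED_DIRS list and the function shape are illustrative, not the actual module:

```typescript
import { existsSync, mkdirSync } from 'node:fs';
import { join } from 'node:path';

// Directories the check probes, relative to the project root (illustrative).
const EXPECTED_DIRS = ['.specify', '.specify/templates', '.specify/scripts', 'specs'];

// Probe each expected directory; with fix=true, recreate anything missing.
export function checkDirs(
  projectRoot: string,
  fix = false,
): { item: string; exists: boolean }[] {
  return EXPECTED_DIRS.map(item => {
    const full = join(projectRoot, item);
    let exists = existsSync(full);
    if (!exists && fix) {
      mkdirSync(full, { recursive: true });
      exists = true;
    }
    return { item, exists };
  });
}
```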


Spec 005: Agent Registration System

The command registration system — the heart of how spec-kit installs commands to different AI agents.

What We Built

One module (~350 lines) in src/registrar.ts:

  • YAML frontmatter parser — Zero-dependency parser for command metadata
  • TOML generator — String builder for Gemini/Tabnine format
  • Markdown registration — Claude, Cursor, OpenCode, etc.
  • Copilot registration — Creates both .agent.md and companion .prompt.md
  • Skill registration — Directory-per-command for Codex/Kimi
  • Unregistration — Clean removal with empty directory cleanup

The Zero-Dependency YAML Challenge

Python has pyyaml. We have nothing. But command frontmatter is simple:

description: Create a feature specification
handoffs:
  - label: Next Step

No anchors, no flow syntax, no complex types. A 60-line parser handles it:

function parseSimpleYaml(yaml: string): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  const lines = yaml.split('\n');
  let currentArray: unknown[] | null = null;

  for (const line of lines) {
    if (!line.trim() || line.trim().startsWith('#')) continue;
    
    const trimmed = line.trim();
    
    // Array item
    if (trimmed.startsWith('- ') && currentArray) {
      currentArray.push(parseYamlValue(trimmed.slice(2)));
      continue;
    }
    
    // Key-value pair
    const colonIndex = trimmed.indexOf(':');
    if (colonIndex > 0) {
      const key = trimmed.slice(0, colonIndex).trim();
      const value = trimmed.slice(colonIndex + 1).trim();
      
      if (value === '') {
        currentArray = [];
        result[key] = currentArray;
      } else {
        result[key] = parseYamlValue(value);
        currentArray = null;
      }
    }
  }
  return result;
}
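The post doesn't show parseYamlValue, which the parser above relies on. Something like this would cover the scalar subset the frontmatter uses — an assumed shape, not the actual helper:

```typescript
// Coerce a scalar: strip matching quotes, then recognize booleans,
// null, and numbers; everything else stays a plain string.
function parseYamlValue(raw: string): unknown {
  if (
    raw.length >= 2 &&
    ((raw.startsWith('"') && raw.endsWith('"')) ||
      (raw.startsWith("'") && raw.endsWith("'")))
  ) {
    return raw.slice(1, -1);
  }
  if (raw === 'true') return true;
  if (raw === 'false') return false;
  if (raw === 'null' || raw === '~') return null;
  if (raw !== '' && !Number.isNaN(Number(raw))) return Number(raw);
  return raw;
}
```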

The Zero-Dependency TOML Challenge

Gemini and Tabnine expect TOML format. A 15-line generator handles it:

export function toToml(description: string, prompt: string): string {
  const escapedDesc = escapeTomlString(description);
  return `description = "${escapedDesc}"

prompt = """
${prompt}
"""`;
}

That's it. We only need description and prompt fields — no tables, no arrays.
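escapeTomlString isn't shown either. For a single-line TOML basic string, escaping backslashes and double quotes covers the inputs here — a sketch under that assumption:

```typescript
// Escape the two characters that would break a single-line TOML basic
// string. (The multiline prompt body uses """ and is passed through.)
function escapeTomlString(s: string): string {
  return s.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}
```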

Agent-Specific Formats

Each of 23 agents has unique requirements:

| Agent Type | Example | Output Format |
|------------|---------|---------------|
| Standard Markdown | Claude, Cursor | .claude/commands/speckit.specify.md |
| Agent+Prompt | Copilot | .github/agents/*.agent.md + .github/prompts/*.prompt.md |
| TOML | Gemini, Tabnine | .gemini/commands/speckit.specify.toml |
| Skill Directory | Codex, Kimi | .agents/skills/speckit.specify/SKILL.md |

The registerCommands() function routes to the correct handler based on agent config:

if (agent === 'copilot') {
  paths = await registerCopilotCommand(projectRoot, commandName, content);
} else if (config.format === 'toml') {
  paths = [await registerTomlCommand(projectRoot, agent, commandName, content)];
} else if (config.extension === '/SKILL.md') {
  paths = [await registerSkillCommand(projectRoot, agent, commandName, content)];
} else {
  paths = [await registerMarkdownCommand(projectRoot, agent, commandName, content)];
}

Following the SDD Workflow

This spec was implemented using spec-kit's own workflow:

  1. Created branch: git checkout -b 005-agent-system
  2. Ran setup: ./.specify/scripts/bash/setup-plan.sh --json
  3. Wrote plan.md: Technical context, constitution check, design decisions
  4. Wrote research.md: YAML parsing, TOML generation, skill format research
  5. Wrote tasks.md: 9 tasks with acceptance criteria
  6. Implemented and tested: 39 tests, all passing

The workflow forced us to think through the design before coding. The research.md captured decisions like "why no YAML library" with rationale and alternatives considered.


Spec 006: Extension Management

The extension system — installing, removing, and managing modular spec-kit add-ons.

What We Built

One large module (~700 lines) in src/extension.ts:

  • Zero-dep YAML parser — Handles nested objects, arrays, quoted strings
  • ExtensionManifest — Schema validation with SHA-256 hashing
  • ExtensionRegistry — JSON persistence with deep-copy safety
  • ExtensionManager — Full lifecycle (install, remove, enable/disable)

Zero-Dependency YAML Parser (Extended)

The simple parser from Spec 005 couldn't handle extension manifests. We needed nested objects:

extension:
  id: my-extension
  version: 1.0.0
provides:
  commands:
    - name: speckit.ext.hello
      file: commands/hello.md

The extended parser (~150 lines) handles:

  • Nested object blocks (indent-based)
  • Arrays of objects (- name: ...)
  • Quoted strings with escape sequences
  • Boolean/number coercion
  • Null values

function parseSimpleYaml(content: string): Record<string, unknown> {
  const lines = content.split('\n');
  const root: Record<string, unknown> = {};
  const stack: { obj: Record<string, unknown>; indent: number }[] = [{ obj: root, indent: -1 }];
  let currentArray: unknown[] | null = null;
  
  for (const line of lines) {
    // Skip comments and empty lines
    if (!line.trim() || line.trimStart().startsWith('#')) continue;
    
    const indent = line.length - line.trimStart().length;
    const trimmed = line.trim();
    
    // Pop stack when dedenting
    while (stack.length > 1 && indent <= stack[stack.length - 1].indent) {
      stack.pop();
    }
    
    // Handle array item
    if (trimmed.startsWith('- ')) {
      // ... array handling
    }
    
    // Handle key-value
    const colonIdx = trimmed.indexOf(':');
    if (colonIdx > 0) {
      const key = trimmed.slice(0, colonIdx).trim();
      const value = trimmed.slice(colonIdx + 1).trim();
      
      if (value === '') {
        // Nested object or array follows
        const newObj: Record<string, unknown> = {};
        stack[stack.length - 1].obj[key] = newObj;
        stack.push({ obj: newObj, indent });
      } else {
        stack[stack.length - 1].obj[key] = parseValue(value);
      }
    }
  }
  
  return root;
}

Registry Safety Patterns

The registry stores metadata per extension. Critical pattern: deep copies everywhere.

get(extensionId: string): ExtensionMetadata | null {
  const data = this._load();
  const entry = data.extensions[extensionId];
  if (!entry) return null;
  
  // Return deep copy to prevent external mutation
  return JSON.parse(JSON.stringify(entry));
}

This prevents bugs where modifying the returned object accidentally corrupts the registry.
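The failure mode is easy to demonstrate with a self-contained toy (not the real registry):

```typescript
// Toy registry: get() returns a deep copy, so mutating the result
// cannot silently flip internal state without going through a save.
const store: Record<string, { enabled: boolean }> = {
  'my-ext': { enabled: true },
};

function get(id: string): { enabled: boolean } | null {
  const entry = store[id];
  return entry ? JSON.parse(JSON.stringify(entry)) : null;
}

const copy = get('my-ext');
if (copy) copy.enabled = false; // mutates only the copy
// store['my-ext'].enabled is still true
```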

Version Specifier Parsing

Extensions declare compatibility with semver specifiers: >=0.1.0, >=1.0.0 <2.0.0, !=1.5.0.

function checkCompatibility(manifest: ExtensionManifest, currentVersion: string): boolean {
  const specifier = manifest.requiresSpeckitVersion;
  const parts = specifier.split(/\s+/);
  
  for (const part of parts) {
    const match = part.match(/^(>=|>|<=|<|==|!=)?(.+)$/);
    if (!match) continue;
    
    const [, op = '>=', version] = match;
    const cmp = compareVersions(currentVersion, version);
    
    switch (op) {
      case '>=': if (cmp < 0) return false; break;
      case '>':  if (cmp <= 0) return false; break;
      case '<=': if (cmp > 0) return false; break;
      case '<':  if (cmp >= 0) return false; break;
      case '==': if (cmp !== 0) return false; break;
      case '!=': if (cmp === 0) return false; break;
    }
  }
  return true;
}
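compareVersions is assumed above but not shown. A dotted-numeric comparison covers the specifiers in play — a sketch that ignores prerelease tags:

```typescript
// Compare dotted numeric versions, returning -1, 0, or 1. Missing
// segments count as zero, so '1.0' equals '1.0.0'.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}
```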

Spec 007: Preset Management

Presets customize templates — the spec, plan, tasks, and checklist formats.

What We Built

One module (~550 lines) in src/preset.ts:

  • PresetManifest — Template type validation
  • PresetRegistry — Same API as ExtensionRegistry
  • PresetManager — Install/remove/enable/disable
  • PresetResolver — Priority-based template resolution

Resolution Priority

When resolving a template (e.g., spec-template), the resolver checks:

  1. Project override — .specify/templates/spec-template.md
  2. Presets (by priority) — lower number = higher precedence
  3. Core templates — bundled with the package

resolve(templateName: string): { content: string; source: 'override' | 'preset' | 'core'; source_id?: string } | null {
  // 1. Check project override
  const overridePath = path.join(this.projectRoot, '.specify', 'templates', `${templateName}.md`);
  if (fs.existsSync(overridePath)) {
    return { content: fs.readFileSync(overridePath, 'utf-8'), source: 'override' };
  }
  
  // 2. Check presets by priority
  const presets = this.registry.listByPriority();
  for (const [presetId, metadata] of presets) {
    const template = this._findTemplateInPreset(presetId, templateName);
    if (template) {
      return { content: template, source: 'preset', source_id: presetId };
    }
  }
  
  // 3. Fall back to core
  const corePath = path.join(getTemplatesDir(), 'files', `${templateName}.md`);
  if (fs.existsSync(corePath)) {
    return { content: fs.readFileSync(corePath, 'utf-8'), source: 'core' };
  }
  
  return null;
}

This layering enables teams to customize spec-kit without forking — install a preset that matches your workflow.
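The registry's listByPriority, which the resolver above iterates, reduces to a sort. A sketch — the enabled filter and the id tiebreak are my assumptions, not confirmed by the post:

```typescript
interface PresetMeta {
  priority: number; // lower number = higher precedence
  enabled: boolean;
}

// Order enabled presets by ascending priority, ties broken by id so
// template resolution is deterministic.
function listByPriority(
  entries: Record<string, PresetMeta>,
): [string, PresetMeta][] {
  return Object.entries(entries)
    .filter(([, meta]) => meta.enabled)
    .sort(([ia, a], [ib, b]) => a.priority - b.priority || ia.localeCompare(ib));
}
```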

flowchart TD
    R["resolve(templateName)"] --> O{"Project override?"}
    O -- Found --> RO["Return override"]
    O -- Not found --> P{"Presets by priority?"}
    P -- Found --> RP["Return preset template"]
    P -- Not found --> C{"Core templates?"}
    C -- Found --> RC["Return bundled core"]
    C -- Not found --> RN["Return null"]

VII – Lessons Learned

Accumulated insights from completing all 7 specs.

Lesson 1: Specs Should Be Small

The initial instinct to write one comprehensive spec created a document that was correct but unusable. Good specs are small enough to implement in a single focused session, testable independently, and shippable without blocking other work.

Lesson 2: Match Source Test Coverage First

Before writing implementation code, we pulled Python's test file list and mapped every test function. This revealed coverage gaps immediately — Python has 12 test files with hundreds of test cases. Our TypeScript tests now mirror that structure, ensuring we don't miss edge cases the original authors discovered.

Lesson 3: Types Are Documentation

Python's AGENT_CONFIGS is a dict[str, dict]. TypeScript's is Record<string, AgentConfig> where AgentConfig has four required fields. The type system catches mistakes that Python would only catch at runtime — like forgetting the args placeholder for a new agent.

Lesson 4: Zero-Dependency Parsers Are Simpler Than You Think

We dreaded implementing YAML and TOML parsing without libraries. Turns out, when you only need a subset of the format (simple key-value pairs, basic arrays), a 60-line parser does the job. The Python version uses full pyyaml for parsing simple frontmatter — massive overkill.

Lesson 5: The SDD Workflow Forces Good Design

By writing plan.md and research.md before code, we documented decisions like "why no YAML library" with rationale. When someone asks "why didn't you just use js-yaml?", the answer is in research.md with alternatives considered. This is invaluable for maintenance.

Lesson 6: Deep Copies Prevent Registry Corruption

The extension and preset registries return deep copies of all data. Without this, code like registry.get('my-ext').enabled = false would mutate the internal state without triggering a save. Deep copies make mutations explicit.

Lesson 7: Interactive Beats Error Messages

When --ai is not provided, Python prints an error. TypeScript shows an interactive picker. This pattern — "make the common case easy" — applies throughout. The TUI approach using @oakoliver/huh components makes the CLI more discoverable.

Lesson 8: 1:1 Parity Catches Bugs

Principle VI (1:1 Migration Fidelity) caught multiple issues: camelCase vs snake_case in JSON, different directory structures, missing agents. By enforcing exact parity with the Python version, we ensure existing Python spec-kit users can migrate without surprises.


VIII – Final Tally

| Spec | Status | Tests |
|------|--------|-------|
| 001-project-overview | Complete | — |
| 002-core-types | Complete | 80 |
| 003-init-command | Complete | 35 |
| 004-check-command | Complete | — |
| 005-agent-system | Complete | 39 |
| 006-extensions | Complete | 53 |
| 007-presets | Complete | 40 |

Total: 247 tests, zero runtime dependencies, 23 AI agents supported.


IX – Installation

The spec-kit TypeScript port is available on npm:

# Install globally
npm install -g @oakoliver/specify-cli

# Or with Bun
bun install -g @oakoliver/specify-cli

Usage

# Initialize a new project with interactive agent selection
specify init my-project

# Initialize with specific agent
specify init my-project --ai opencode

# Initialize in current directory
specify init . --ai claude --here

# Check project structure
specify check

# Check and auto-fix issues
specify check --fix

Source Code

Source code: github.com/oakoliver/specify-cli

This is the seventh Charmbracelet-ecosystem port, following Lipgloss, Glamour, Bubble Tea, Bubbles, Huh, and Glow. The full stack is now complete.


Port completed. Enterprise environments can now use Spec-Driven Development without Python.

– Antonio

"Simplicity is the ultimate sophistication."