How kb Works: 27 Modules, Zero Dependencies, and a Compiler That Turns Raw Sources Into a Wiki
This is the technical deep-dive companion to *Stop Pasting Into ChatGPT. Build a Knowledge Base Your AI Can Actually Search*, which covers what kb does and how to use it. This article covers how it's built.
I spent the last few weeks building kb — a CLI that transforms URLs, PDFs, and markdown into an Obsidian-compatible wiki with LLM-powered Q&A. The companion article shows the workflow. This one tears open the engine.
27 modules. 154 tests. Zero dependencies outside the @oakoliver ecosystem. Here's how every piece fits together.
I – The Stack
Everything runs on Bun, and with a single exception (zod), every dependency comes from the @oakoliver ecosystem. That's not a flex; it's an architectural decision. Every library in the stack is something I either ported from Go's Charm ecosystem or rewrote from Python's scientific computing stack:
| Layer | Package | Origin |
|---|---|---|
| Terminal styling | lipgloss | Port of Go's lipgloss |
| Spinners & TUI | bubbles | Port of Go's Bubbles |
| Markdown rendering | glamour | Port of Go's Glamour |
| Full-text search | bm25s | Rewrite of Python's bm25s — 2x faster |
| PDF extraction | pageindex | Built from scratch, no native deps |
| Schema validation | zod | The one external dependency |
This means bun build --compile produces a standalone binary with the entire tool baked in. No runtime. No peer dependencies. One file.
II – The Compiler
kb is, at its core, a compiler. Raw sources go in, structured knowledge comes out.
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Sources │────▶│ kb ingest │────▶│ raw/ │
│ URLs, PDFs, │ │ │ │ Markdown │
│ Markdown │ └─────────────┘ └──────┬──────┘
└─────────────┘ │
▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Answer │◀────│ kb query │◀────│ kb compile │
│ + Sources │ │ (LLM) │ │ (LLM) │
└─────────────┘ └─────────────┘ └──────┬──────┘
│
┌─────────────┐ ▼
│ kb find │◀────┌─────────────┐
│ (BM25) │ │ wiki/ │
└─────────────┘ │ Articles │
└─────────────┘
Ingestion handles source type detection automatically — URLs get fetched and converted to markdown, PDFs get text extraction via pageindex, markdown files get copied, git repos get cloned for README extraction. Everything lands in raw/ with a manifest tracking paths, titles, hashes, and types.
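In sketch form, the detection step is just pattern matching on the source string. The function and type names below are illustrative, not kb's actual internals:

```typescript
type SourceType = 'url' | 'pdf' | 'markdown' | 'git';

// Illustrative sketch: classify a source by its shape and extension.
// Order matters: a GitHub repo URL is also a valid URL, so test it first.
function detectSourceType(source: string): SourceType {
  if (/^https?:\/\/github\.com\/[^/]+\/[^/]+\/?$/.test(source)) return 'git';
  if (/^https?:\/\//.test(source)) return 'url';
  if (source.toLowerCase().endsWith('.pdf')) return 'pdf';
  return 'markdown';
}
```

The real implementation presumably also validates that local paths exist and that markdown files are actually markdown; this sketch only shows the dispatch.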
Compilation is where the LLM does the heavy lifting. The system extracts concepts (abstract ideas), entities (named things), and syntheses (multi-source summaries). Each gets YAML frontmatter with source citations and [[wikilinks]] to related articles.
Search has two modes: find uses BM25 for fast keyword search without LLM involvement. query retrieves relevant context and synthesizes answers with streaming output.
Incremental Everything
The system never reprocesses unchanged content:
- Hash-based duplicate detection — same content hash means skip ingestion
- Dependency graph tracking — knows which articles depend on which sources
- Stale propagation — source changes mark dependent articles for recompilation
A wiki with 1000 articles and 5 changed sources recompiles in under 10 seconds.
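The staleness computation behind that number can be sketched as a diff over content hashes plus dependency propagation. All names here are hypothetical, not kb's internals:

```typescript
// Hypothetical sketch: decide which wiki articles need recompiling,
// given last-compile hashes, current hashes, and a dependency map.
function staleArticles(
  previous: Map<string, string>,   // source path → hash at last compile
  current: Map<string, string>,    // source path → current content hash
  deps: Map<string, string[]>,     // source path → articles built from it
): Set<string> {
  const stale = new Set<string>();
  for (const [path, hash] of current) {
    if (previous.get(path) !== hash) {
      // Stale propagation: every article that depends on a changed
      // source gets marked for recompilation.
      for (const article of deps.get(path) ?? []) stale.add(article);
    }
  }
  return stale;
}
```

With 5 changed sources out of 1000, the set stays small, which is why the incremental path is fast: only the marked articles ever reach the LLM.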
III – The Problems That Weren't Obvious
TTY-Aware Output
Every command outputs differently based on context:
```typescript
export const isTTY = process.stdout.isTTY ?? false;

export function output<T>(data: T, humanFormat?: (data: T) => string): void {
  if (isTTY && humanFormat) {
    console.log(humanFormat(data));
  } else {
    console.log(JSON.stringify(data, null, 2));
  }
}
```
In a terminal, you get styled text with spinners:
```text
⣾ Compiling 3 changed sources...
✓ Created: wiki/concepts/attention-mechanism.md
```
Piped to another process, you get JSON:
```json
{
  "created": ["wiki/concepts/attention-mechanism.md"],
  "updated": [],
  "unchanged": 42
}
```
This makes kb scriptable while remaining human-friendly.
Wiki Resolution
The resolveWikiRoot() function traverses up the directory tree looking for a .kb/ directory, falling back to ~/.kb/ for global wikis. This means you can run kb find from any subdirectory and it finds the wiki root automatically.
```typescript
import { access } from 'node:fs/promises';
import { homedir } from 'node:os';
import { join, dirname } from 'node:path';

// True if the path exists (access() rejects when it does not).
async function exists(path: string): Promise<boolean> {
  return access(path).then(() => true, () => false);
}

async function resolveWikiRoot(): Promise<{ path: string }> {
  let current = process.cwd();
  while (current !== dirname(current)) {  // dirname of the root is itself, so this stops there
    const kbDir = join(current, '.kb');
    if (await exists(kbDir)) {
      return { path: current };
    }
    current = dirname(current);
  }
  // Check the global wiki
  const globalKb = join(homedir(), '.kb');
  if (await exists(globalKb)) {
    return { path: globalKb };
  }
  throw new Error('No knowledge base found');
}
```
LLM Error Handling
The spec required: "Fail immediately on LLM API errors with clear error message and no partial output."
This is harder than it sounds. Streaming responses mean content arrives incrementally. If the connection drops mid-stream, you have partial output. The solution: buffer everything during compilation, only write files on complete success.
```typescript
async function compileArticle(source: ManifestEntry): Promise<string> {
  const chunks: string[] = [];
  for await (const chunk of provider.stream(request)) {
    chunks.push(chunk);
  }
  // Only reached on complete success
  return chunks.join('');
}
```
If anything throws, no files are written. The wiki stays consistent.
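One common way to extend that guarantee to the final disk write (my assumption here, not something kb documents) is a write-then-rename pattern, since `rename()` within a filesystem is atomic:

```typescript
import { rename, writeFile } from 'node:fs/promises';

// Sketch: write the fully buffered article to a temp file, then
// rename it into place. Readers never observe a half-written article,
// even if the process dies between the two calls.
async function writeAtomic(path: string, content: string): Promise<void> {
  const tmp = `${path}.tmp-${process.pid}`;
  await writeFile(tmp, content, 'utf8');
  await rename(tmp, path);
}
```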
The Frontmatter System
Every wiki article has YAML frontmatter that needs parsing and serialization:
```yaml
---
title: Attention Mechanism
type: concept
created: 2026-04-03T10:35:00Z
updated: 2026-04-03T10:35:00Z
sources:
  - raw/articles/attention-paper.md
related:
  - "[[Transformer Architecture]]"
tags:
  - machine-learning
---
```
Zod schemas validate the structure:
```typescript
export const FrontmatterSchema = z.object({
  title: z.string(),
  type: z.enum(['concept', 'entity', 'synthesis', 'query']),
  created: z.string().datetime(),
  updated: z.string().datetime(),
  sources: z.array(z.string()),
  related: z.array(z.string()),
  tags: z.array(z.string()).optional(),
});
```
The parser handles edge cases — inline arrays (tags: [ml, nlp]), multi-line arrays with dashes, quoted strings with special characters. The serializer produces consistent output that round-trips cleanly.
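As a rough illustration of the two array shapes, a stripped-down parser for just that case might look like this (a sketch, not kb's parser; the real one also handles nesting and escaped quotes):

```typescript
// Parse a YAML-ish array value: either inline (tags: [ml, nlp])
// or the "- item" lines that follow a bare key.
function parseArrayValue(value: string, followingLines: string[]): string[] {
  const inline = value.match(/^\[(.*)\]$/);
  if (inline) {
    return inline[1].split(',').map(s => s.trim()).filter(Boolean);
  }
  const items: string[] = [];
  for (const line of followingLines) {
    const m = line.match(/^\s*-\s+(.*)$/);
    if (!m) break;                                 // array ends at the first non-dash line
    items.push(m[1].replace(/^"(.*)"$/, '$1'));    // strip surrounding quotes
  }
  return items;
}
```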
IV – The Eight Commands
I won't walk through every flag — the companion article and docs cover usage. Here's what's interesting about the implementation.
init
Creates the knowledge base structure:
```shell
kb init my-research
# Creates:
# my-research/
# ├── .kb/config.json
# ├── raw/
# ├── wiki/
# └── queries/
```
Supports --global for a system-wide wiki at ~/.kb/.
ingest
Adds sources with automatic type detection:
```shell
kb ingest https://arxiv.org/abs/1706.03762   # URL → markdown
kb ingest ./paper.pdf                        # PDF → text
kb ingest ./notes.md                         # Copy markdown
kb ingest https://github.com/user/repo       # Clone → README
```
Duplicates are detected by content hash and skipped.
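The dedup check can be sketched as a SHA-256 over normalized content (the exact hash and normalization are my assumptions, not kb's documented choice):

```typescript
import { createHash } from 'node:crypto';

// Sketch: dedup keyed on a SHA-256 of line-ending-normalized content,
// so the same article fetched on Windows and Linux hashes identically.
function contentHash(content: string): string {
  return createHash('sha256')
    .update(content.replace(/\r\n/g, '\n').trim())
    .digest('hex');
}

// Returns true for content already seen; records it otherwise.
function isDuplicate(content: string, seen: Set<string>): boolean {
  const hash = contentHash(content);
  if (seen.has(hash)) return true;
  seen.add(hash);
  return false;
}
```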
compile
Transforms sources into wiki articles:
```shell
kb compile            # Incremental (only changed)
kb compile --full     # Full recompilation
kb compile --dry-run  # Preview without writing
```
The LLM extracts concepts, entities, and relationships, generating properly linked articles.
find
Fast keyword search without LLM:
```shell
kb find "attention mechanism" --limit 5
# wiki/concepts/attention-mechanism.md (0.95)
# ...allows models to focus on relevant parts of the input...
```
Uses BM25 ranking via the bm25s package.
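For reference, this is the textbook BM25 term score that ranking of this kind builds on. It is the generic formula with the usual defaults, not the bm25s package's API:

```typescript
// Textbook BM25 score contribution of one term in one document
// (Robertson/Sparck Jones idf variant, k1 = 1.2, b = 0.75).
function bm25Term(
  tf: number,      // term frequency in this document
  df: number,      // number of documents containing the term
  N: number,       // total number of documents
  dl: number,      // this document's length in tokens
  avgdl: number,   // average document length
  k1 = 1.2,
  b = 0.75,
): number {
  const idf = Math.log(1 + (N - df + 0.5) / (df + 0.5));
  return (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * dl) / avgdl));
}
```

The shape explains why it works so well for wikis: rare terms dominate the score, and term-frequency gains saturate, so a keyword repeated fifty times doesn't drown everything else.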
query
Natural language Q&A with streaming:
```shell
kb query "How does attention work in transformers?"
# Based on the knowledge base articles, attention mechanisms...
# [streams in real-time]
#
# Sources: [[Attention Mechanism]], [[Transformer]]
# Saved to: queries/2026-04-03-attention-transformers.md
```
Answers are saved to queries/ for later promotion to permanent wiki articles.
lint
Validates wiki integrity:
```shell
kb lint
# ✗ Broken link: wiki/concepts/foo.md → [[Nonexistent]]
# ✗ Orphan: wiki/entities/orphaned.md
# Found 2 errors
```
Detects broken wikilinks, orphan articles, invalid frontmatter, and stale content.
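The broken-link check reduces to extracting [[wikilinks]] and comparing them against known article titles. A sketch (illustrative only; kb's linter presumably also resolves aliases and file paths):

```typescript
// Sketch: find [[wikilink]] targets in an article body that do not
// correspond to any known article title. Supports [[Target|alias]].
function brokenLinks(body: string, knownTitles: Set<string>): string[] {
  const broken: string[] = [];
  for (const m of body.matchAll(/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g)) {
    const target = m[1].trim();
    if (!knownTitles.has(target)) broken.push(target);
  }
  return broken;
}
```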
status
Shows knowledge base statistics:
```shell
kb status
# Wiki: ./research-wiki/
# Sources: 42 (3 new)
# Articles: 87
#   - Concepts: 45
#   - Entities: 32
#   - Syntheses: 10
# Health: 2 stale, 1 orphan
```
promote
Moves valuable query responses into permanent wiki:
```shell
kb promote queries/2026-04-03-attention.md --as concept
# ✓ Promoted to wiki/concepts/attention.md
# Added backlinks to 3 cited articles
```
V – The Integration Story
kb is designed to work with LLM agents. The JSON output mode makes it scriptable:
```shell
# Batch ingestion (read line by line to survive spaces in URLs)
while read -r url; do
  result=$(kb ingest "$url" --json)
  echo "$result" | jq -r '.title'
done < urls.txt

# Health monitoring
status=$(kb status --json)
stale=$(echo "$status" | jq '.health.stale')
if [ "$stale" -gt 0 ]; then
  kb compile
fi
```
For humans, the Obsidian integration is seamless — open wiki/ as a vault and you get graph view, backlinks, and all the Obsidian features working with the compiled articles.
VI – The Full Picture
Eight commands, 27 modules organized across six layers (commands, core, index, llm, ingest, output), 154 tests with 322 assertions, 7 documentation files, and, zod aside, zero dependencies outside the @oakoliver ecosystem.
The tool compiles to a standalone binary. The test suite runs in under two seconds. The incremental compilation makes 1000-article wikis practical.
Everything is files. JSON for data, markdown for content, YAML for metadata. No database process. No background service. git diff shows you exactly what changed.
VII – The Tradeoffs I Chose
Every architectural decision in kb is a tradeoff against production RAG systems.
No vector database. BM25 keyword search is fast enough for wikis up to thousands of articles. For semantic understanding, the LLM provides it at query time.
No embedding service. The compiled wiki articles are the knowledge representation. Frontmatter provides structured metadata. Wikilinks encode relationships.
No chunking strategy. Sources compile to complete articles. The LLM decides how to organize information.
Local-first. Everything is files — JSON for data, markdown for content. Git-friendly diffs, no database to manage.
This isn't a replacement for production RAG pipelines. It's what you reach for when the alternative is copy-pasting into ChatGPT. The companion article shows what that looks like in practice.
Links: