llm — Oak Oliver Engineering

Stop Pasting Into ChatGPT. Build a Knowledge Base Your AI Can Actually Search.

I built a CLI tool that turns URLs, PDFs, and markdown into a searchable knowledge base with AI-powered Q&A. No vector database. No embedding service. Just files, BM25 search, and LLM synthesis — all from your terminal.

April 3, 2026 · 6 min read

Runtime

How kb Works: 27 Modules, Zero Dependencies, and a Compiler That Turns Raw Sources Into a Wiki

A deep dive into the architecture of kb — the CLI knowledge base tool. Hash-based incremental compilation, BM25 search, TTY-aware dual output, Zod-validated frontmatter, and a dependency graph that recompiles only what changed. 154 tests, 27 TypeScript modules, the full @oakoliver stack.

April 3, 2026 · 7 min read

Engineering

Wiring Apple's Neural Engine Into a Zig Inference Runtime: What We Learned Building ANE Dispatch

We built a runtime executor for on-device LLM inference in Zig, then wired Apple's Neural Engine into the dispatch layer. Here's what ANE actually requires — spatial packing, 32-element minimums, fence synchronization, and why fused kernels matter more than raw TFLOPS.

March 26, 2026 · 14 min read

Security

Stop Exposing Your API Keys: How I Built a Five-Layer AI Proxy That Lets Users Call LLMs Without the Security Nightmare

A deep dive into Vibe's AI proxy architecture — server-side key management for Gemini and OpenAI, per-user credit deduction, rate limiting, provider abstraction, and why your frontend should never touch an LLM directly.

February 14, 2026 · 13 min read
