Tag: llm

Engineering

Wiring Apple's Neural Engine Into a Zig Inference Runtime: What We Learned Building ANE Dispatch

We built a runtime executor for on-device LLM inference in Zig, then wired Apple's Neural Engine into the dispatch layer. Here's what ANE actually requires — spatial packing, 32-element minimums, fence synchronization, and why fused kernels matter more than raw TFLOPS.

March 26, 2026 · 14 min read
Security

Stop Exposing Your API Keys: How I Built a Five-Layer AI Proxy That Lets Users Call LLMs Without the Security Nightmare

A deep dive into Vibe's AI proxy architecture — server-side key management for Gemini and OpenAI, per-user credit deduction, rate limiting, provider abstraction, and why your frontend should never touch an LLM directly.

February 14, 2026 · 13 min read