All posts
Experiments, evals, and honest writeups — no filler
2026-05-08
We generated a production GitHub CLI — 1,404 Go files, SQLite mirror, MCP server — in 45 seconds. Here's what Printing Press actually does, what the compound query gap means for AI agents, and why the local mirror is the key insight.
2026-05-07
265 documents. 10 queries. Two retrieval paths measured side by side. We tested whether pre-compiling Q&A pairs at index time can replace naive chunk retrieval, and where the approach breaks on cross-document and novel questions.
2026-05-07
MiniMax-M2.7 as a multi-hop retrieval agent on HotpotQA: +45 percentage points over the RRF baseline on hard 3-hop questions. Also, a surprising divergence between LanceDB and Qdrant at the agent level that raw recall metrics completely miss.