AI Studio
Applied AI, with a long memory.
Where I keep my working notes on what actually works in production — and what only works in a demo.
Currently shipping
An eval-first workflow for LLM features that takes minutes, not weeks.
I'm packaging the playbook I use with teams into a small toolkit. If your AI feature keeps "feeling worse" after every prompt change, we should talk.
Agents
Practical multi-step agents that plan, call tools, and recover from failure.
Evals
Reproducible offline + online evals. Treat your eval set like a product.
Retrieval
Hybrid retrieval, reranking, and grounding with citation guarantees.
Small models
Distillation, quantization, and on-device deployment patterns.
Safety
Red-teaming, jailbreak monitors, and PII-aware logging at scale.
Cost & latency
Caching, speculative decoding, and routing — making AI bills sane.