
Writing and mumblings

I write about a mix of consulting, open source, personal work, and applying LLMs. I won't email you more than twice a month. Not every post I write is worth sharing, but I'll do my best to share the most interesting stuff, including my own writing, thoughts, and experiences.

Subscribe to my Newsletter Follow me on X

For posts about RAG (Retrieval-Augmented Generation) or LLMs (Large Language Models), check out the category labels in the sidebar. Here are some of my best posts on these topics:

Personal Stories

RAG and LLM Insights

Take the RAG course

Context Engineering

Consulting and Tech Advice

Learn Indie Consulting

Talks and Interviews

What Is the Coding Agents Speaker Series?

I hosted a series of conversations with the teams behind the most successful coding agents in the industry—Cognition (Devin), Sourcegraph (Amp), Cline, and Augment. Coding agents are the most economically viable agents today—they're generating real revenue, being used daily by professional developers, and solving actual business problems at scale.

This makes them incredibly important to study. While other agent applications remain largely experimental, coding agents have crossed the chasm into production use. The patterns and principles these teams discovered aren't just theoretical—they're battle-tested insights from systems processing millions of real-world tasks.

This series captures those hard-won lessons, revealing what works and what doesn't when building agents that actually deliver economic value.

Related Series

Context Engineering Series: Technical implementation patterns for agentic RAG systems, including tool response design, context management, and system architecture. This Speaker Series provides strategic insights, while Context Engineering offers implementation details.

RAG Master Series: Comprehensive guide to retrieval-augmented generation systems. Many coding agent insights (like why simple approaches beat complex ones) apply directly to RAG system design and optimization.

What Is the RAG Master Series?

Retrieval-Augmented Generation (RAG) has become the foundation of modern AI applications that need to access and reason about external knowledge. This comprehensive series distills years of consulting experience helping companies build, improve, and scale RAG systems in production.

RAG systems are fundamentally different from other AI applications: they combine the complexity of information retrieval with the unpredictability of language generation. This series provides a systematic approach to mastering both aspects, from basic implementations to enterprise-grade systems serving millions of users.

This guide covers everything from fundamental concepts to advanced optimization techniques, anti-patterns to avoid, and real-world case studies from successful deployments across industries.

Text Chunking - Anton (ChromaDB)

I hosted a special session with Anton from ChromaDB to discuss their latest technical research on text chunking for RAG applications. This session covers the fundamentals of chunking strategies, evaluation methods, and practical tips for improving retrieval performance in your AI systems.

Why Grep Beat Embeddings in Our SWE-Bench Agent (Lessons from Augment)

I hosted Colin Flaherty, previously a founding engineer at Augment and co-author of Meta's Cicero AI, to discuss autonomous coding agents and retrieval systems. This session explores how agentic approaches are transforming traditional RAG systems, what we can learn from state-of-the-art coding agents, and how these insights might apply to other domains.

Stop Trusting MTEB Rankings (Kelly Hong, Chroma)

I hosted a session with Kelly Hong from Chroma, who presented her research on generative benchmarking for retrieval systems. She explained how to create custom evaluation sets from your own data to better test embedding models and retrieval pipelines, addressing the limitations of standard benchmarks like MTEB.

How Extend Achieves 95%+ Document Automation (Lessons from Eli Badgio)

I hosted a special session with Eli Badgio, CTO of Extend, to discuss AI-native document processing in the cloud. Extend helps customers such as Brex and Fortune 500 companies achieve 95%+ extraction accuracy. This session covered mapping document workflows, building task-specific evaluations, and implementing partial automation with human-in-the-loop approaches.