Do Your Engineers Know How to Leverage AI?
This past spring, two senior engineers at different companies received the same challenge from their CEOs: "We need to move faster. Use AI to get there."
I write about a mix of consulting, open source, personal work, and applying LLMs. I won't email you more than twice a month. Not every post I write is worth sharing, but I'll do my best to share the most interesting stuff, including my own writing, thoughts, and experiences.
For posts about RAG (Retrieval-Augmented Generation) or LLMs (Large Language Models), check out the category labels in the sidebar. Here are some of my best posts on these topics:
I hosted a series of conversations with the teams behind the most successful coding agents in the industry—Cognition (Devin), Sourcegraph (Amp), Cline, and Augment. Coding agents are the most economically viable agents today—they're generating real revenue, being used daily by professional developers, and solving actual business problems at scale.
This makes them incredibly important to study. While other agent applications remain largely experimental, coding agents have crossed the chasm into production use. The patterns and principles these teams discovered aren't just theoretical—they're battle-tested insights from systems processing millions of real-world tasks.
This series captures those hard-won lessons, revealing what works and what doesn't when building agents that actually deliver economic value.
Related Series
Context Engineering Series: Technical implementation patterns for agentic RAG systems, including tool response design, context management, and system architecture. This Speaker Series provides strategic insights, while Context Engineering offers implementation details.
RAG Master Series: Comprehensive guide to retrieval-augmented generation systems. Many coding agent insights (like why simple approaches beat complex ones) apply directly to RAG system design and optimization.
Retrieval-Augmented Generation (RAG) has become the foundation of modern AI applications that need to access and reason about external knowledge. This comprehensive series distills years of consulting experience helping companies build, improve, and scale RAG systems in production.
RAG systems are fundamentally different from other AI applications - they combine the complexity of information retrieval with the unpredictability of language generation. This series provides a systematic approach to mastering both aspects, from basic implementations to enterprise-grade systems serving millions of users.
This guide covers everything from fundamental concepts to advanced optimization techniques, anti-patterns to avoid, and real-world case studies from successful deployments across industries.
I hosted a session featuring Chris Lovejoy, Head of Clinical AI at Anterior, who shared valuable insights from his experience building AI agents for specialized industries. Chris brings a unique perspective as a former medical doctor who transitioned to AI and has worked across the healthcare, education, recruiting, and retail sectors.
I hosted a special session with Anton from ChromaDB to discuss their latest technical research on text chunking for RAG applications. This session covers the fundamentals of chunking strategies, evaluation methods, and practical tips for improving retrieval performance in your AI systems.
I hosted Colin Flaherty, previously a founding engineer at Augment and co-author of Meta's Cicero AI, to discuss autonomous coding agents and retrieval systems. This session explores how agentic approaches are transforming traditional RAG systems, what we can learn from state-of-the-art coding agents, and how these insights might apply to other domains.
I hosted a session with Walden Yan, co-founder and CPO of Cognition, to explore why multi-agent systems might not be the optimal approach for coding contexts. We discussed the theory of context engineering, the challenges of context passing between agents, and how single agents with proper context management can often outperform multi-agent setups.
I hosted a session with Kelly Hong from Chroma, who presented her research on generative benchmarking for retrieval systems. She explained how to create custom evaluation sets from your own data to better test embedding models and retrieval pipelines, addressing the limitations of standard benchmarks like MTEB.
I hosted a special session with Eli Badgio, CTO of Extend, to discuss AI-native document processing in the cloud. Extend reports 95%+ extraction accuracy for customers such as Brex and several Fortune 500 companies. This session covered mapping document workflows, building task-specific evaluations, and implementing partial automation with human-in-the-loop approaches.
I hosted a session with Ayush, an ML Engineer at LanceDB, to explore how fine-tuning re-rankers and embedding models can significantly improve retrieval performance in RAG systems. We discussed practical approaches to enhancing retrieval quality, the trade-offs involved, and when these techniques make the most business sense.