The main idea: When AI tools do messy tasks, they can either stay focused or get confused by too much information.
This is part of the Context Engineering Series that shows how to build better AI tools based on what I've learned from coding assistants and business systems.
I've been helping companies build agentic RAG systems and studying coding agents from Cognition, Claude Code, Cursor, and others. These coding agents are likely creating a trillion-dollar industry—making them the most economically viable agents to date.
This series shares what I've learned from these teams and conversations with professional developers using these systems daily, exploring what we can apply to other industries.
If you want hands-on help, I recommend reaching out to my friend Nila: nila.is. Please mention you came from me.
Related Series
Coding Agents Speaker Series: Deep insights from the teams behind leading coding agents including Cognition (Devin), Sourcegraph (Amp), Cline, and Augment. While this Context Engineering series focuses on technical implementation patterns, the Speaker Series reveals strategic insights and architectural decisions.
RAG Master Series: Comprehensive guide to building and scaling retrieval-augmented generation systems. Context Engineering principles directly enhance RAG implementations—structured tool responses and faceted search are foundational RAG optimization techniques.
The core insight: In agentic systems, how we structure tool responses is as important as the information they contain.
This is the first post in a series on context engineering. I'm starting here because it's the lowest hanging fruit—something every company can audit and experiment with immediately.
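To make the point about structuring tool responses concrete, here is a minimal sketch of a search tool that returns the same hits in a structured block with facet counts. Everything in it is an assumption for illustration: the `SearchHit` type, the `source` facet field, the `format_tool_response` helper, and the XML-like layout are hypothetical and not taken from any particular agent or library.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SearchHit:
    doc_id: str
    title: str
    source: str  # hypothetical facet field, e.g. "wiki" or "tickets"


def format_tool_response(query: str, hits: list[SearchHit], limit: int = 3) -> str:
    """Render hits as a structured block: a few results plus facet counts."""
    # Count every hit by source, including the ones we do not display,
    # so the agent can see what else exists and decide how to narrow.
    facets = Counter(hit.source for hit in hits)
    shown = hits[:limit]

    lines = [f'<results query="{query}" total="{len(hits)}" shown="{len(shown)}">']
    for hit in shown:
        lines.append(f'  <hit id="{hit.doc_id}" source="{hit.source}">{hit.title}</hit>')
    lines.append("  <facets>")
    for source, count in facets.most_common():
        lines.append(f'    <facet source="{source}" count="{count}"/>')
    lines.append("  </facets>")
    lines.append("</results>")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_hits = [
        SearchHit("1", "Deploy checklist", "wiki"),
        SearchHit("2", "Deploy failed on staging", "tickets"),
        SearchHit("3", "Rollback guide", "wiki"),
        SearchHit("4", "Release calendar", "wiki"),
    ]
    print(format_tool_response("deploy", demo_hits, limit=2))
```

The design choice being illustrated is that the response carries metadata about the full result set (totals and facet counts), not only the top few documents, so an agent has something to act on when the first page is not enough.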
I find this to be a pretty interesting topic because I personally believe that coding agents are probably executing at the frontier of agentic RAG systems.
The world of autonomous coding agents is rapidly evolving, with fundamental disagreements emerging about the best approaches to building reliable, high-performance systems. This Lightning Series brings together the minds behind some of the most successful coding agents—from SWE-Bench champions to billion-dollar products—to debate the core architectural decisions shaping the future of AI-powered development.
These are all just notes from a 30-minute conversation I had with somebody. A fun little exercise, as you will see.
When people ask me what a hot take is, here's mine: more agent tools and AI tools should be pricing on outcomes and trying hard to figure out what that means. This aligns with my broader thoughts on pricing AI tools as headcount alternatives.
The question hit me personally as a small investor in Lovable and a consultant focused on value-based pricing: Why am I not building my consulting business, my courses, my job board on Lovable instead of spreading them across Stripe, Maven, Circle, Kit, and Podia? It's because I could only possibly pay $100/month, and for that, they could not possibly offer me the features I need.
I hosted a Lightning Lesson with Skylar Payne, an experienced AI practitioner who's worked at companies like Google and LinkedIn over the past decade. Skylar shared valuable insights on common RAG (Retrieval-Augmented Generation) anti-patterns he's observed across multiple client engagements, providing practical advice for improving AI systems through better data handling, retrieval, and evaluation practices.
tl;dr: You should build a system that lets you discover value before you commit resources.
Key Takeaways
Before asking what to build, start with a simple chatbot to discover what users are interested in. There's no need to reach for a complex agent or workflow before we see real user demand.
Leverage tools like Kura to understand broad user behavior patterns. The sooner we start collecting real user data, the better.
This week, I had conversations with several VPs of AI at large enterprises, and I kept hearing the same story: XX teams experimenting with AI, a CEO demanding results for the next board meeting, sales conference, or quarterly review, and no clear path from pilot to production.
These conversations happen because I built improvingrag.com, a resource that helps teams build better RAG systems, which has led me into many conversations with researchers, engineers, and executives. But the questions aren't about RAG techniques. They're about strategy: "How do we go from experiments to production?" "How do we know what to invest in?" "How do we show ROI?"
I hosted a lightning lesson featuring Ben from Raindrop and Sid from Oleve to discuss AI monitoring, production testing, and data analysis frameworks. This session explored how to effectively identify issues in AI systems, implement structured monitoring, and develop frameworks for improving AI products based on real user data.