
Applied AI

Decomposing RAG Systems to Identify Bottlenecks

There's a reason Google has separate interfaces for Maps, Images, News, and Shopping. The same reason explains why many RAG systems today are hitting a performance ceiling. After working with dozens of companies implementing RAG, I've discovered that most teams focus on optimizing embeddings while missing two fundamental dimensions that matter far more: Topics and Capabilities.

If you're interested in learning more about how to systematically improve RAG systems, you can sign up for the free email course here:

Sign up for the Free Email Course

Now, let's dive in:

A Tale of Unmet Expectations

Let me share a recent case study that illustrates this perfectly:

A construction company implemented a state-of-the-art RAG system for their technical documentation. Despite using the latest embedding models and spending weeks optimizing their prompts, user satisfaction stayed stubbornly around 50%. When we analyzed their query logs, we discovered something fascinating: 20% of queries were simply counting objects in blueprints ("How many doors are on the 15th floor?").

No amount of embedding optimization would help here. By adding a simple object detection model for blueprints, satisfaction for those queries jumped to 87% in just one week.
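That fix amounts to putting a tiny router in front of the retriever so counting questions never reach embedding search at all. The sketch below is a hedged illustration, not the company's actual implementation: the regex trigger and the two route names are assumptions.

```python
# Minimal query router: counting questions about drawings go to a vision
# path, everything else to embedding search. The trigger pattern and route
# names are illustrative assumptions.
import re

def route_query(query: str) -> str:
    # "How many ..." questions need object detection, not retrieval.
    if re.search(r"\bhow many\b", query, re.IGNORECASE):
        return "object_detection"
    return "embedding_search"

print(route_query("How many doors are on the 15th floor?"))  # object_detection
print(route_query("What is the fire rating of door D-201?"))  # embedding_search
```

In practice the router can start as a handful of patterns like this and later graduate to a learned classifier once the query log shows which categories matter.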

The Two Dimensions That Actually Matter

After analyzing millions of queries across various industries, I've identified two fundamental dimensions that determine RAG system success:

  1. Topics: Do we have the information users want?
  2. Capabilities: Can we effectively access and process that information?

Let's dive deep into each dimension.

Topics: Content Coverage

Topics represent your system's knowledge inventory. Think of this like a store's product catalog - you can't sell what you don't have.

Examples of Topic Gaps:

  • Missing documentation sections
  • Lack of data for specific time periods
  • Absence of particular use cases or scenarios
  • Missing specific types of content (images, videos, tables)

Real World Example: Netflix

When Netflix notices users searching for "Adam Sandler basketball movies", that's a topic gap - they simply don't have that content. No amount of better search or recommendations will help if the content doesn't exist.

Capabilities: Processing Power

Capabilities represent your system's ability to manipulate and retrieve information in specific ways. This is where most RAG systems fall short.

Common Capability Requirements:

  1. Temporal Understanding
     • "What changed last week?"
     • "Show me the latest updates"
     • Understanding fiscal vs calendar years

  2. Numerical Processing
     • Counting objects in documents
     • Calculating trends or changes
     • Aggregating data across sources

  3. Entity Resolution
     • Connecting related documents
     • Understanding document hierarchies
     • Mapping aliases and references

Real World Example: DoorDash

When DoorDash notices orders dropping after 9 PM, adding more restaurants won't help. They need a capability to filter for "open now" restaurants. No amount of inventory helps if users can't find what's actually available.
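The "open now" capability boils down to a time-window filter over inventory. Here's a minimal sketch under assumed data shapes (the `Restaurant` fields are invented); note the wrap-around case for places that close after midnight:

```python
# Sketch of an "open now" filter. The Restaurant shape is an assumption.
from dataclasses import dataclass
from datetime import time

@dataclass
class Restaurant:
    name: str
    opens: time
    closes: time

def is_open(r: Restaurant, now: time) -> bool:
    if r.opens <= r.closes:
        return r.opens <= now < r.closes
    # Closing time past midnight, e.g. open 17:00-02:00.
    return now >= r.opens or now < r.closes

def open_now(restaurants: list[Restaurant], now: time) -> list[Restaurant]:
    return [r for r in restaurants if is_open(r, now)]
```

At 9:30 PM this filter keeps only the late-night places, which is exactly the capability the extra inventory can't provide.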

The Impact on User Experience

Consider how these dimensions affect real user interactions:

  1. Topic Failures:
     • "Zero results found"
     • Completely irrelevant responses
     • Missing critical information

  2. Capability Failures:
     • Partially correct answers
     • Unable to process time-based queries
     • Can't compare or contrast information

Building a Systematic Approach

Here's how to implement this framework in your RAG system:

  1. Analyze Query Patterns
     • Categorize failed queries into topic vs capability gaps
     • Identify clusters of similar issues
     • Track frequency and impact of each gap

  2. Measure Impact
     • Query volume (how often does this come up?)
     • Success rate (how often do we fail?)
     • Business impact (what does failing cost us?)

  3. Prioritize Improvements
     • Focus on high-volume, low-success-rate queries
     • Balance implementation cost against potential impact
     • Build capabilities that can be reused across topics

Best Practices for Implementation

  1. Start with Data Collection
     • Log all queries and their success rates
     • Track which capabilities are used for each query
     • Monitor topic coverage over time

  2. Build Modular Systems
     • Separate topic management from capability implementation
     • Allow for easy addition of new capabilities
     • Enable A/B testing of different approaches

  3. Measure Everything
     • Track success rates by topic and capability
     • Monitor usage patterns of different capabilities
     • Calculate ROI of topic expansions
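"Measure everything" can start as a few lines of aggregation over a query log before you build any dashboard. A minimal sketch, assuming a log shape and capability names that are made up for illustration:

```python
# Aggregate a query log into success rates per capability.
# The log entries and capability labels are invented examples.
from collections import defaultdict

log = [
    {"query": "how many doors on floor 15", "capability": "counting", "success": False},
    {"query": "what changed last week",      "capability": "temporal", "success": True},
    {"query": "how many windows in wing B",  "capability": "counting", "success": False},
    {"query": "show latest spec revision",   "capability": "temporal", "success": True},
]

totals = defaultdict(lambda: [0, 0])  # capability -> [successes, attempts]
for entry in log:
    totals[entry["capability"]][0] += entry["success"]
    totals[entry["capability"]][1] += 1

rates = {cap: s / n for cap, (s, n) in totals.items()}
print(rates)
```

Even this toy version surfaces the pattern that matters: one capability failing on every attempt while another succeeds consistently.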

Looking Forward

The future of RAG isn't just about better embeddings or larger context windows. It's about:

  • Building specialized indices for different query types
  • Developing robust capability routing systems
  • Creating feedback loops for continuous improvement

Conclusion

Stop focusing solely on embedding optimization. Start analyzing your queries through the lens of topics and capabilities. This framework will help you:

  • Identify the real bottlenecks in your system
  • Make strategic decisions about improvements
  • Build a more effective and scalable RAG application

Remember: The goal isn't to build a perfect system. It's to build a system that gets better every day at solving real user problems.


If you're working on a RAG system right now, try this: Take your last 20 failed queries and sort them into topic vs capability issues. You might be surprised by what patterns emerge.


How to Lead AI Engineering Teams

Have you ever wondered why some teams seem to effortlessly deliver value while others stay busy but make no real progress?

I recently had a conversation that completely changed how I think about leading teams. While discussing team performance with a VP of Engineering who was frustrated with their team's slow progress, I suggested focusing on better standups and more experiments.

That's when Skylar Payne dropped a truth bomb that made me completely rethink everything:

"Leaders are living and breathing the business strategy through their meetings and context, but the people on the ground don't have any fucking clue what that is. They're kind of trying to read the tea leaves to understand what it is."

That moment was a wake-up call.

I had been so focused on the mechanics of execution that I'd missed something fundamental: The best processes in the world won't help if your team doesn't understand how their work drives real value.

In less than an hour, I learned more about effective leadership than I had in the past year. Let me share what I discovered.

The Process Trap

For years, I believed the answer to team performance was better processes. More standups, better ticket tracking, clearer KPIs.

I was dead wrong.

Here's the truth that surprised me: The most effective teams have very little process. What they do have is:

  • Crystal clear alignment on what matters
  • A shared understanding of how the business works
  • The ability to make independent decisions
  • A systematic way to learn and improve

Let me break down how to build this kind of team.

The "North Star" Framework

Instead of more process, teams need a clear way to connect their daily work to real business value. This is where the North Star Framework comes in.

Here's how it works:

  1. Define One Key Metric: Choose a single metric that summarizes the value you deliver to customers. For example, Amplitude uses "insights shared and read by at least three people."

  2. Break It Down: Identify the key drivers that teams can actually impact. These become your focus areas.

  3. Create a Rhythm:
     • Weekly: Review input metrics
     • Quarterly: Check relationships between inputs and your North Star
     • Yearly: Validate that your North Star predicts revenue

  4. Make It Visible: Run weekly business reviews where leadership shares these metrics with everyone. Start manual before building dashboards - trustworthy data matters more than automation.

This framework does something powerful: it helps every team member understand how their work drives real value.

The Weekly Business Review

One of the most powerful tools in this framework is the weekly business review. But this isn't your typical metrics meeting.

Here's how to make it work:

  • Make it a leadership-level meeting that ICs can attend
  • Focus on building business intuition, not just sharing numbers
  • Take notes on anomalies and patterns
  • Share readouts with the entire team
  • Use it to develop a shared mental model of how the business works

Rethinking Team Structure

Here's another counterintuitive insight: how you organize your teams might be creating unnecessary friction.

Instead of dividing responsibilities by project, try dividing them by metrics. Here's why:

  • Project-based teams require precise communication boundaries
  • Metric-based teams can work more fluidly
  • It reduces communication overhead
  • Teams naturally align around outcomes instead of outputs

Think about it: When teams own metrics instead of projects, they have the freedom to find the best way to move those metrics.

Early Stage? Even More Important

I know what you're thinking: "This sounds great for big companies, but we're too early for this."

That's what I thought too. But here's what I learned: Being early stage isn't an excuse for throwing spaghetti at the wall.

You can still be systematic, just differently:

  1. Start Qualitative:
     • Draft clear goals and hypotheses
     • Generate specific questions to validate them
     • Talk to customers systematically
     • Document and learn methodically

  2. Focus on Learning:
     • Treat tickets as experiments, not features
     • Make outcomes about learning, not just shipping
     • Accept that progress is nonlinear
     • Build systematic ways to capture insights

  3. Build Foundations:
     • Document your strategy clearly
     • Make metrics and goals transparent
     • Share regular updates on progress
     • Create systems for capturing and sharing learnings

The Experiment Mindset

One crucial shift is thinking about work differently:

  • The ticket is not the feature
  • The ticket is the experiment
  • The outcome is learning

This mindset change helps teams focus on value and learning rather than just shipping features.

Put It Into Practice

Here are five things you can do today to start implementing these ideas:

  1. Define Your North Star: What's the one metric that best captures the value you deliver to customers?

  2. Start Weekly Business Reviews: Schedule a weekly meeting to review key metrics with your entire team. Start simple - even a manual spreadsheet is fine.

  3. Audit Your Process: Look at every process you have. Ask: "Is this helping people make better decisions?" If not, consider dropping it.

  4. Document Your Strategy: Write down how you think the business works. Share it widely and iterate based on feedback.

  5. Shift to Experiments: Start treating work as experiments to test hypotheses rather than features to ship.

The Real Test

The real test of whether this is working isn't in your processes or even your metrics. It's in whether every team member can confidently answer these questions:

  • "What should I be spending my time on today?"
  • "How does my work drive value for our business?"
  • "What am I learning that could change our direction?"

When your team can answer these without hesitation, you've built something special.

Remember: Your team members are smart, capable people. They don't need more process - they need context and clarity to make good decisions.

Give them that, and you'll be amazed at what they can achieve.

P.S. What would you say is your team's biggest obstacle to working this way? Leave a comment below.

SWE vs AI Engineering Standups

When I talk to engineering leaders struggling with their AI teams, I often hear the same frustration: "Why is everything taking so long? Why can't we just ship features like our other teams?"

This frustration stems from a fundamental misunderstanding: AI development isn't just engineering - it's applied research. And this changes everything about how we need to think about progress, goals, and team management. In a previous article I wrote about communication for AI teams. Today I want to talk about standups specifically.

The ticket is not the feature, the ticket is the experiment, the outcome is learning.

The right way to do AI engineering updates

Helping software engineers enhance their AI engineering processes through rigorous and insightful updates.


In the dynamic realm of AI engineering, effective communication is crucial for project success. Consider two scenarios:

Scenario A: "We made some improvements to the model. It seems better now."

Scenario B: "Our hypothesis was that fine-tuning on domain-specific data would improve accuracy. We implemented this change and observed a 15% increase in F1 score, from 0.72 to 0.83, on our test set. However, inference time increased by 20ms on average."

Scenario B clearly provides more value and allows for informed decision-making. After collaborating with numerous startups on their AI initiatives, I've witnessed the transformative power of precise, data-driven communication. It's not just about relaying information; it's about enabling action, fostering alignment, and driving progress.
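One way to nudge a team toward Scenario B is to give updates a structure that forces the hypothesis, the numbers, and the trade-off to appear. The dataclass below is one possible shape (its field names are my own, not a standard), with Scenario B's figures plugged in:

```python
# A structured experiment update. The ExperimentUpdate fields are one
# possible shape for illustration, not an established template.
from dataclasses import dataclass

@dataclass
class ExperimentUpdate:
    hypothesis: str
    change: str
    metric: str
    before: float
    after: float
    tradeoff: str

    def summary(self) -> str:
        delta = (self.after - self.before) / self.before * 100
        return (f"{self.hypothesis} -> {self.metric} moved "
                f"{self.before} -> {self.after} ({delta:+.0f}%). {self.tradeoff}")

update = ExperimentUpdate(
    hypothesis="Fine-tuning on domain-specific data improves accuracy",
    change="fine-tuned the model on domain-specific data",
    metric="F1",
    before=0.72,
    after=0.83,
    tradeoff="Inference time increased by 20ms on average.",
)
print(update.summary())
```

The value isn't the code; it's that a standup update which can't fill in `before`, `after`, and `tradeoff` is a Scenario A update in disguise.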

A surprising reason to not list your consulting prices

As I've shared insights on indie consulting, marketing strategies, and referral techniques, a recurring question from my newsletter subscribers is about pricing. Specifically, many ask if they should lower their rates or make them public.

In this article, we'll delve into the counterintuitive reasons why listing your consulting prices might not be the best strategy, regardless of whether you're aiming to appear affordable or exclusive. We'll explore the potential drawbacks of transparent pricing, introduce more effective alternatives like minimum level of engagement pricing, and provide actionable strategies to help you maximize your value and earnings as a consultant.

Building on the foundation laid in my previous posts about building a consulting practice and using the right tools, this piece will add another crucial element to your consulting toolkit: strategic pricing.

Implementing Naturalistic Dialogue in AI Companions

Ever think, "This AI companion sounds odd"? You're onto something. Let's explore naturalistic dialogue and how it could change our digital interactions.

I've been focused on dialogue lately. Not the formal kind, but the type you'd hear between friends at a coffee shop. Conversations that flow, full of inside jokes and half-finished sentences that still make sense. Imagine if your AI companion could chat like that.

This post will define naturalistic dialogue, characterized by:

  1. Contextual efficiency: saying more with less
  2. Implicit references: alluding rather than stating
  3. Fragmentation: incomplete thoughts and imperfections
  4. Organic flow: spontaneity

We'll then examine AI-generated dialogue challenges and propose a solution using chain-of-thought reasoning and planning to craft more natural responses.

Art of Looking at RAG Data

In the past year, I've done a lot of consulting on helping companies improve their RAG applications. One of the biggest things I want to call out is the idea of topics and capabilities.

I use this distinction to train teams to identify and look at the data we have to figure out what we need to build next.

10 Ways to Be Data Illiterate (and How to Avoid Them)

Data literacy is an essential skill in today's data-driven world. As AI engineers, understanding how to properly handle, analyze, and interpret data can make the difference between success and failure in our projects. In this post, we will explore ten common pitfalls that lead to data illiteracy and provide actionable strategies to avoid them. By becoming aware of these mistakes and learning how to address them, you can enhance your data literacy and ensure your work is both accurate and impactful. Let's dive in and discover how to navigate the complexities of data with confidence and competence.

Data Flywheel Go Brrr: Using Your Users to Build Better Products

Take advantage of your users wherever possible. It's become a bit of a cliche that customers are your most important stakeholders. In the past, this meant that customers bought the product the company sold and thus kept it solvent. But as AI seemingly conquers everything, businesses must build repeatable processes for creating products that meet their users' needs and are flexible enough to be continually improved over time. That makes your users your most important asset in improving your product - use them to build a better one.

Unraveling the History of Technological Skepticism

Technological advancements have always been met with a mix of skepticism and fear. From the telephone disrupting face-to-face communication to calculators diminishing mental arithmetic skills, each new technology has faced resistance. Even the written word was once believed to weaken human memory.

| Technology     | Perceived Threat                      |
|----------------|---------------------------------------|
| Telephone      | Disrupting face-to-face communication |
| Calculators    | Diminishing mental arithmetic skills  |
| Typewriter     | Degrading writing quality             |
| Printing Press | Threatening manual script work        |
| Written Word   | Weakening human memory                |