Decomposing RAG Systems to Identify Bottlenecks

There's a reason Google has separate interfaces for Maps, Images, News, and Shopping. The same reason explains why many RAG systems today are hitting a performance ceiling. After working with dozens of companies implementing RAG, I've discovered that most teams focus on optimizing embeddings while missing two fundamental dimensions that matter far more: Topics and Capabilities.

If you're interested in learning more about how to systematically improve RAG systems, you can sign up for the free email course here:

Sign up for the Free Email Course

A Tale of Unmet Expectations

Let me share a recent case study that illustrates this perfectly:

A construction company implemented a state-of-the-art RAG system for their technical documentation. Despite using the latest embedding models and spending weeks optimizing their prompts, user satisfaction stayed stubbornly around 50%. When we analyzed their query logs, we discovered something fascinating: 20% of queries were simply counting objects in blueprints ("How many doors are on the 15th floor?").

No amount of embedding optimization would help here. By adding a simple object detection model for blueprints, satisfaction for those queries jumped to 87% in just one week.

The Two Dimensions That Actually Matter

After analyzing millions of queries across various industries, I've identified two fundamental dimensions that determine RAG system success:

  1. Topics: Do we have the information users want?
  2. Capabilities: Can we effectively access and process that information?

Let's dive deep into each dimension.

Topics: Content Coverage

Topics represent your system's knowledge inventory. Think of this like a store's product catalog - you can't sell what you don't have.

Examples of Topic Gaps:

  • Missing documentation sections
  • Lack of data for specific time periods
  • Absence of particular use cases or scenarios
  • Missing specific types of content (images, videos, tables)

Real World Example: Netflix

When Netflix notices users searching for "Adam Sandler basketball movies", that's a topic gap - they simply don't have that content. No amount of better search or recommendations will help if the content doesn't exist.

Capabilities: Processing Power

Capabilities represent your system's ability to manipulate and retrieve information in specific ways. This is where most RAG systems fall short.

Common Capability Requirements:

  1. Temporal Understanding
     • "What changed last week?"
     • "Show me the latest updates"
     • Understanding fiscal vs calendar years

  2. Numerical Processing
     • Counting objects in documents
     • Calculating trends or changes
     • Aggregating data across sources

  3. Entity Resolution
     • Connecting related documents
     • Understanding document hierarchies
     • Mapping aliases and references

Real World Example: DoorDash

When DoorDash notices orders dropping after 9 PM, adding more restaurants won't help. They need a capability to filter for "open now" restaurants. No amount of inventory helps if users can't find what's actually available.
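
As a rough illustration, an "open now" capability can be as small as a filter applied before results are shown. The sketch below assumes a hypothetical `Restaurant` record with opening and closing times; the names and sample data are invented for the example and are not DoorDash's implementation.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Restaurant:
    name: str
    opens: time
    closes: time  # may be earlier than `opens` if the place closes after midnight


def is_open(r: Restaurant, now: time) -> bool:
    """Return True if the restaurant is open at `now`, handling overnight hours."""
    if r.opens <= r.closes:
        return r.opens <= now < r.closes
    # Overnight window, e.g. 18:00 to 02:00
    return now >= r.opens or now < r.closes


def open_now(restaurants: list[Restaurant], now: time) -> list[Restaurant]:
    """Capability: keep only the inventory a user can actually order from."""
    return [r for r in restaurants if is_open(r, now)]


# Illustrative data only
restaurants = [
    Restaurant("Lunch Spot", time(11, 0), time(15, 0)),
    Restaurant("Late Diner", time(18, 0), time(2, 0)),
]
late_results = open_now(restaurants, time(21, 30))
```

The point of the sketch is that the inventory (the topic dimension) is unchanged; only the way it is filtered (the capability dimension) differs.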

The Impact on User Experience

Consider how these dimensions affect real user interactions:

  1. Topic Failures:
     • "Zero results found"
     • Completely irrelevant responses
     • Missing critical information

  2. Capability Failures:
     • Partially correct answers
     • Unable to process time-based queries
     • Can't compare or contrast information

Building a Systematic Approach

Here's how to implement this framework in your RAG system:

  1. Analyze Query Patterns
     • Categorize failed queries into topic vs capability gaps
     • Identify clusters of similar issues
     • Track frequency and impact of each gap

  2. Measure Impact
     • Query volume (how often does this come up?)
     • Success rate (how often do we fail?)
     • Business impact (what does failing cost us?)

  3. Prioritize Improvements
     • Focus on high-volume, low-success-rate queries
     • Balance implementation cost against potential impact
     • Build capabilities that can be reused across topics
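
One way to turn the three steps above into a ranking is to score each query cluster by volume times failure rate times cost of failure. This is a minimal sketch under that assumption: the `Segment` fields, the scoring formula, and all the numbers are illustrative choices, not measurements from a real system.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str            # a cluster of similar queries
    gap_type: str        # "topic" or "capability"
    volume: int          # queries observed in the period
    success_rate: float  # fraction answered well, 0.0 to 1.0
    cost_per_failure: float = 1.0  # relative business impact (assumed scale)


def expected_loss(s: Segment) -> float:
    """Volume x failure rate x cost: a rough proxy for what the gap costs."""
    return s.volume * (1.0 - s.success_rate) * s.cost_per_failure


def prioritize(segments: list[Segment]) -> list[Segment]:
    """Highest expected loss first."""
    return sorted(segments, key=expected_loss, reverse=True)


# Illustrative numbers only
segments = [
    Segment("count objects in blueprints", "capability", 200, 0.10),
    Segment("safety regulations", "topic", 500, 0.90),
    Segment("what changed last week", "capability", 50, 0.20, cost_per_failure=2.0),
]
ranked = prioritize(segments)
for s in ranked:
    print(f"{s.name:32s} {s.gap_type:10s} loss={expected_loss(s):.0f}")
```

Note that the highest-volume segment ends up last here because it already succeeds most of the time. Implementation cost is left out of the score; dividing by an effort estimate would be one way to fold it in.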

Best Practices for Implementation

  1. Start with Data Collection
     • Log all queries and their success rates
     • Track which capabilities are used for each query
     • Monitor topic coverage over time

  2. Build Modular Systems
     • Separate topic management from capability implementation
     • Allow for easy addition of new capabilities
     • Enable A/B testing of different approaches

  3. Measure Everything
     • Track success rates by topic and capability
     • Monitor usage patterns of different capabilities
     • Calculate ROI of topic expansions
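
Tracking success rates by topic and capability needs little more than a log record with both labels attached. The sketch below assumes a hypothetical `QueryLog` record and that each query has already been labeled (by hand, by rules, or by a classifier); the sample queries and labels are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryLog:
    query: str
    topic: str
    capability: str
    success: bool


def success_rates(logs: list[QueryLog]) -> dict[tuple[str, str], float]:
    """Success rate for each (topic, capability) pair seen in the logs."""
    totals: dict[tuple[str, str], int] = defaultdict(int)
    wins: dict[tuple[str, str], int] = defaultdict(int)
    for log in logs:
        key = (log.topic, log.capability)
        totals[key] += 1
        wins[key] += int(log.success)
    return {key: wins[key] / totals[key] for key in totals}


# Illustrative log entries only
logs = [
    QueryLog("How many doors are on the 15th floor?", "blueprints", "counting", False),
    QueryLog("How many windows face north?", "blueprints", "counting", False),
    QueryLog("What is the fire rating of door D-12?", "blueprints", "lookup", True),
    QueryLog("What changed last week?", "specs", "temporal", True),
    QueryLog("Show me the latest updates", "specs", "temporal", False),
]
rates = success_rates(logs)
```

Keeping topic and capability as separate fields makes it possible to see that a topic is well covered while one capability on that topic fails consistently.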

Looking Forward

The future of RAG isn't just about better embeddings or larger context windows. It's about:

  • Building specialized indices for different query types
  • Developing robust capability routing systems
  • Creating feedback loops for continuous improvement
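
To make "capability routing" concrete, here is a minimal sketch in which each query is dispatched to a specialized handler. The keyword patterns and the handler functions are hypothetical placeholders; a production router would more likely use a trained classifier or an LLM call in place of the regular expressions.

```python
import re
from typing import Callable

Handler = Callable[[str], str]


def count_handler(query: str) -> str:
    return "route to object-counting pipeline"


def temporal_handler(query: str) -> str:
    return "route to date-filtered index"


def default_handler(query: str) -> str:
    return "route to general semantic search"


# Ordered (pattern, handler) pairs; the first match wins.
# The patterns are illustrative stand-ins for a real query classifier.
ROUTES = [
    (re.compile(r"\bhow many\b|\bcount\b", re.IGNORECASE), count_handler),
    (re.compile(r"\blatest\b|\blast week\b|\bchanged\b", re.IGNORECASE), temporal_handler),
]


def route(query: str) -> str:
    """Send the query to the first matching capability, else fall back to search."""
    for pattern, handler in ROUTES:
        if pattern.search(query):
            return handler(query)
    return default_handler(query)


print(route("How many doors are on the 15th floor?"))
print(route("What changed last week?"))
print(route("What is the fire rating of door D-12?"))
```

Because routes are plain data, adding a capability means appending one entry, and logging which route fired gives the per-capability usage data needed for a feedback loop.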

Conclusion

Stop focusing solely on embedding optimization. Start analyzing your queries through the lens of topics and capabilities. This framework will help you:

  • Identify the real bottlenecks in your system
  • Make strategic decisions about improvements
  • Build a more effective and scalable RAG application

Remember: The goal isn't to build a perfect system. It's to build a system that gets better every day at solving real user problems.


If you're working on a RAG system right now, try this: Take your last 20 failed queries and sort them into topic vs capability issues. You might be surprised by what patterns emerge.
