I write about a mix of consulting, open source, personal work, and applying LLMs. I won't email you more than twice a month. Not every post I write is worth sharing, but I'll do my best to send along the most interesting stuff, including my own writing, thoughts, and experiences.
For posts about RAG (Retrieval-Augmented Generation) or LLMs (Large Language Models), check out the category labels in the sidebar. Here are some of my best posts on these topics:
I hosted a Lightning Lesson with Skylar Payne, an experienced AI practitioner who's worked at companies like Google and LinkedIn over the past decade. Skylar shared valuable insights on common RAG (Retrieval-Augmented Generation) anti-patterns he's observed across multiple client engagements, providing practical advice for improving AI systems through better data handling, retrieval, and evaluation practices.
tl;dr: You should build a system that lets you discover value before you commit resources.
Key Takeaways
Before asking what to build, start with a simple chatbot to discover what users are actually interested in. There's no need to reach for a complex agent or workflow before you see real user demand.
Leverage tools like Kura to understand broad patterns in user behavior. The sooner you start collecting real user data, the better (a minimal clustering sketch follows below).
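To make that second takeaway concrete, here is a minimal sketch of the underlying idea: embed logged user queries and cluster them to see which kinds of requests dominate. This is not Kura's actual API; it uses sentence-transformers and scikit-learn as stand-ins, and the `queries` list is a hypothetical sample of chat logs.

```python
# Minimal sketch (not Kura's API): cluster logged user queries with
# off-the-shelf embeddings + KMeans to surface broad usage patterns.
from collections import Counter

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Hypothetical messages pulled from your chatbot's logs.
queries = [
    "summarize this contract",
    "what does clause 7 mean?",
    "draft an email to the vendor",
    # ... many more logged user messages
]

# Embed each query, then group them into a handful of coarse clusters.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(queries)
labels = KMeans(n_clusters=3, n_init="auto", random_state=0).fit_predict(embeddings)

# Cluster sizes show where real demand is concentrating before you
# commit to building a more complex agent or workflow.
print(Counter(labels))
for query, label in zip(queries, labels):
    print(label, query)
```

Even this crude version answers the question that matters early on: which requests are users actually making, and how often.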
This week, I had conversations with several VPs of AI at large enterprises, and I kept hearing the same story: XX teams experimenting with AI, a CEO demanding results for the next board meeting, sales conference, quarterly review, and no clear path from pilot to production.
These conversations happen because I built improvingrag.com, a resource that helps teams build better RAG systems, which has led researchers, engineers, and executives to reach out. But the questions aren't about RAG techniques. They're about strategy: "How do we go from experiments to production?" "How do we know what to invest in?" "How do we show ROI?"
I hosted a lightning lesson featuring Ben from Raindrop and Sid from Oleve to discuss AI monitoring, production testing, and data analysis frameworks. This session explored how to effectively identify issues in AI systems, implement structured monitoring, and develop frameworks for improving AI products based on real user data.
Today I spoke to an executive about SaaS products, and they told me something that shifted my perspective entirely: AI agents should be priced against the budgets companies draw from headcount, not tooling.
This is one of those insights that seems obvious in retrospect, but completely changes how you think about positioning AI tools in today's market—especially in this era of widespread tech layoffs and economic uncertainty.
The world of RAG evaluation feels needlessly complex. Everyone's building frameworks, creating metrics, and generating dashboards that make you feel like you need a PhD just to know if your system is working.
Let me share something I wish I'd understood sooner: consistent content creation isn't just a marketing tactic—it's the foundation of a thriving consulting business.
When I started my consulting journey, I was stuck in the time-for-money trap. I'd jump on Zoom calls with prospects, explain the same concepts repeatedly, and wonder why scaling was so difficult. Then I had a realization that changed everything: what if I could have these conversations at scale?
Now I extract blog post ideas from every client call. Every Friday, I review about 17 potential topics from the week's conversations. I test them with social posts, see which ones get traction (some get 700 views, others 200,000), and develop the winners into comprehensive content.
Here's why this approach has transformed my business:
Imagine this: you open Cursor, ask it to build a feature in YOLO-mode, and let it rip. You flip back to Slack, reply to a few messages, check your emails, and return...
It's still running.
What the hell is going on? .sh files appear, there's a fresh Makefile, and a mysterious .gitignore. Anxiety creeps in. Should you interrupt it? Could you accidentally trash something critical?
Relax—you're not alone. This anxiety is common, especially among developers newer to powerful agents like Cursor's. Fortunately, Git is here to save the day.
In Part 1, you learned the basics of safely using Git with Cursor agents. Now, let's level up your workflow by diving into advanced Git practices and explicitly instructing Cursor to handle these for you.
Retrieval-Augmented Generation (RAG) systems have become essential tools for enterprises looking to harness their vast repositories of internal knowledge. While the theoretical foundations of RAG are well-understood, implementing these systems effectively in enterprise environments presents unique challenges that aren't addressed in academic literature or consumer applications. This article delves into advanced techniques for fine-tuning embedding models in enterprise RAG systems, based on insights from Manav Rathod, a software engineer at Glean who specializes in semantic search and ML systems for search ranking and assistant quality.
The discussion focuses on a critical yet often overlooked component of RAG systems: custom-trained embedding models that understand company-specific language, terminology, and document relationships. As Jason Liu aptly noted during the session, "If you're not fine-tuning your embeddings, you're more like a Blockbuster than a Netflix." This perspective highlights how critical embedding fine-tuning has become for competitive enterprise AI systems.
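The session goes deep on training details, but the core mechanic is simple enough to sketch. Below is a minimal, hypothetical example of the general technique using sentence-transformers with in-batch negatives; it is not Glean's approach or data, and the query/document pairs are invented for illustration.

```python
# Minimal sketch of fine-tuning an embedding model on company-specific
# (query, relevant passage) pairs with sentence-transformers. The pairs
# below are hypothetical; in practice they'd be mined from internal
# search clicks, support tickets, or labeled retrieval logs.
from sentence_transformers import InputExample, SentenceTransformer, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")

train_examples = [
    InputExample(texts=[
        "How do I request access to the billing warehouse?",
        "Runbook: granting analysts read access to the billing warehouse",
    ]),
    InputExample(texts=[
        "What does 'churn-safe' mean in our pricing docs?",
        "Glossary: churn-safe pricing tiers and when to apply them",
    ]),
    # ... thousands of pairs covering company-specific terminology
]

# MultipleNegativesRankingLoss treats the other passages in each batch as
# negatives, pulling every query toward its paired passage and away from
# the rest, which teaches the model your internal vocabulary.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
model.save("company-embeddings-v1")
```

The interesting work is upstream of this snippet: deciding which signals (clicks, ticket resolutions, document links) count as positive pairs, which is exactly where company-specific language and document relationships enter the picture.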