Writing and mumblings

Learning to Learn

After writing my post advice for young people, a couple of people asked about my learning process. I could discuss overcoming plateaus or developing mastery, learning for the joy of learning. I could also talk about how to avoid feeling overwhelmed by new topics and break them down into smaller pieces. However, I think that has been done before.

Instead, I'm going to explore a new style. I'm just going to go through a chronological telling of my life and what I learned from just trying new things. I'm going to talk about the tactics and strategies and see how this pans out.

How to build a terrible RAG system

RAG Course

I'm building a RAG Course right now. If you're interested in the course, please fill out this form.

If you've seen any of my work, you know that the main message I have for anyone building a RAG system is to think of it primarily as a recommendation system. Today, I want to introduce the concept of inverted thinking to address how we should approach the challenge of creating an exceptional system.

What is inverted thinking?

Inversion is the practice of thinking through problems in reverse. It's the practice of “inverting” a problem - turning it upside down - to see it from a different perspective. In its most powerful form, inversion is asking how an endeavor could fail, and then being careful to avoid those pitfalls. [1]

Who am I?

In the next year, this blog will be painted with a mix of technical machine learning content and personal notes. I've spent more of my 20s thinking about my life than machine learning. I'm not good at either, but I enjoy both.

Life story

I was born in a village in China. My parents were the children of rural farmers who grew up during the Cultural Revolution. They were the first generation of their family to read and write, and also the first generation to leave the village.

RAG Course

Check out this course if you're interested in systematically improving RAG.

With the advent of large language models (LLMs), retrieval augmented generation (RAG) has become a hot topic. However, throughout the past year of helping startups integrate LLMs into their stack, I've noticed that the pattern of taking user queries, embedding them, and directly searching a vector store is effectively demoware.

What is RAG?

Retrieval augmented generation (RAG) is a technique that uses an LLM to generate responses, but uses a search backend to augment the generation. In the past year, using text embeddings with a vector database has been the most popular approach I've seen being socialized.


Simple RAG that embeds the user query and makes a search.

So let's kick things off by examining what I like to call the 'Dumb' RAG Model—a basic setup that's more common than you'd think.
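To make the pattern concrete, here is a minimal sketch of that simple setup: embed the query, rank documents by similarity, and paste the top hits into a prompt. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store, and `naive_rag_search` and `build_prompt` are hypothetical helper names.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def naive_rag_search(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Embed the raw user query and return the k most similar documents."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Stuff the retrieved documents into a prompt for the LLM (call not shown)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


docs = [
    "the cat sat on the mat",
    "vector databases store embeddings",
    "freediving under ice is cold",
]
top = naive_rag_search("how do vector databases work", docs, k=1)
prompt = build_prompt("how do vector databases work", top)
```

Note that the query goes to the search step untouched, which is exactly the part of this setup that tends not to hold up outside a demo.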

Kojima's Philosophy in LLMs: From Sticks to Ropes

Hideo Kojima's unique perspective on game design, emphasizing empowerment over guidance, offers a striking parallel to the evolving world of Large Language Models (LLMs). Kojima advocates for giving players a rope, not a stick, signifying support that encourages exploration and personal growth. This concept, when applied to LLMs, raises a critical question: Are we merely using these models as tools for straightforward tasks, or are we empowering users to think critically and creatively?

Good LLM Observability is just plain observability

In this post, I aim to demystify the concept of LLM observability. I'll illustrate how everyday tools employed in system monitoring and debugging can be effectively harnessed to enhance AI agents. Using Open Telemetry, we'll delve into creating comprehensive telemetry for intricate agent actions, spanning from question answering to autonomous decision-making.

If you want to learn about my consulting practice, check out my services page. If you're interested in working together, please reach out to me via email.

What is Open Telemetry?

Essentially, Open Telemetry comprises a suite of APIs, tools, and SDKs that facilitate the creation, collection, and exportation of telemetry data (such as metrics, logs, and traces). This data is crucial for analyzing and understanding the performance and behavior of software applications.
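To give a feel for what trace data looks like, here is a small standard-library sketch that records a named, timed "span" around each step of a question-answering call. This is not the Open Telemetry API: `span`, `SPANS`, and `answer_question` are hypothetical names made up for illustration, and a real setup would use the project's SDK and an exporter instead of an in-memory list.

```python
import time
from contextlib import contextmanager

# Finished spans are collected here; a real setup would export them
# to a tracing backend instead of keeping them in memory.
SPANS = []


@contextmanager
def span(name, **attributes):
    """Record the name, attributes, and duration of a block of work."""
    record = {"name": name, "attributes": dict(attributes)}
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


def answer_question(question):
    """Hypothetical agent step, instrumented with nested spans."""
    with span("answer_question", question=question):
        with span("retrieve") as s:
            docs = ["doc-1", "doc-2"]  # placeholder retrieval result
            s["attributes"]["num_docs"] = len(docs)
        with span("generate"):
            return f"answer based on {len(docs)} documents"


result = answer_question("What is observability?")
```

After the call, `SPANS` holds one record per step, with inner spans finishing before the outer one. The sketch does not link child spans to their parent, which a real tracing library would do.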

Freediving under ice

Growing up, I wasn't very physically active. However, as I got older and had more time, I made a conscious effort to get in shape and improve my relationship with my body.

I had done plenty of sports before, like ping pong, rock climbing, and jiu jitsu, but after I got my hand injuries during COVID I really couldn't do any of that...

Recommendations with Flight at Stitch Fix

As a data scientist at Stitch Fix, I faced the challenge of adapting recommendation code for real-time systems. Without standardization or proper performance testing, tracing, and logging, building reliable systems was a struggle.

To tackle these problems, I created Flight – a framework that acts as a semantic bridge and integrates multiple systems within Stitch Fix. It provides modular operator classes for data scientists to develop, and offers three levels of user experience.