Writing and mumblings

10 “Foot Guns” for Fine-Tuning and Few-Shots

Let me share a story that might sound familiar.

A few months back, I was helping a Series A startup with their LLM deployment. Their CTO pulled me aside and said, "Jason, we're burning through our OpenAI credits like crazy, and our responses are still inconsistent. We thought fine-tuning would solve everything, but now we're knee-deep in training data issues."

Fast forward to today, and I’ve been diving deep into these challenges as an advisor to Zenbase, a production-grade version of DSPy. We’re on a mission to help companies get the most out of their AI investments. Think of them as AI optimization guides: they’ve been through the trenches, made the mistakes, and are here to help you avoid them.

In this post, I’ll walk you through some of the biggest pitfalls. I’ll share real stories, practical solutions, and lessons learned from working with dozens of companies.

Making Money is Negative Margin

In 2020 I had a hand injury that took me out of my career for 2-3 years. I've only recently managed to bounce back as an indie consultant and educator. On the way back to being a productive member of society, I've learned a few things:

  1. I have what it takes to be successful. Whether it's the feeling of never wanting to be poor again, some internal motivation, the 'cares a lot,' or the 'chip on the shoulder' - whatever it is, I believe I will be successful
  2. The gift of being enough is the greatest gift I can give myself
  3. I will likely make too many sacrifices by default, not too few, and it will reflect in my regrets later in life

Prompt Template Resource System Specification

Overview

A token-efficient system for referencing external resources in LLM prompts without including their full content, designed to optimize token usage when LLMs generate template calls.

Template Definition Syntax

@template
def template_name(param1, param2, ...):
    # I recognize that these should really be chat messages
    return Template("""
    Template content with placeholders:

    <param1>
    {{param1}}
    </param1>

    <param2>
    {{param2}}
    </param2>
    """)
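The spec doesn't define `@template` or `Template`; one minimal way they could work, sketched in Python (the class, decorator, and `render` method are all assumptions for illustration):

```python
import re

class Template:
    """Holds template text containing {{placeholder}} slots."""
    def __init__(self, text):
        self.text = text

    def render(self, **kwargs):
        # Substitute each {{name}} slot with the caller-supplied value.
        return re.sub(r"\{\{(\w+)\}\}",
                      lambda m: str(kwargs[m.group(1)]), self.text)

def template(fn):
    """Decorator: call the wrapped function for its Template,
    then render it with the keyword arguments the caller passed."""
    def wrapper(**kwargs):
        return fn(**kwargs).render(**kwargs)
    return wrapper

@template
def greet(name):
    return Template("Hello <name>{{name}}</name>")
```

Calling `greet(name="Ada")` would then return the rendered string with the placeholder filled in. In a real system, the values passed in would first be resolved from resource references before rendering.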

Resource Reference Types

Type           Syntax                         Description
File           file://path/to/resource.txt    Load content from the file system
String         "Direct content"               Use the literal string value
Tagged Output  context://<tag_type>#<id>      Reference session-based generated content with any tag
Image          image://path/to/image.jpg      Reference an image resource
Audio          audio://path/to/audio.mp3      Reference an audio resource
Video          video://path/to/video.mp4      Reference a video resource
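Resolution could dispatch on the URI scheme. A sketch of that dispatch (the `resolve` function and `SESSION_STORE` are assumptions; only the file, string, and tagged-output cases are fully wired, and media references are passed through for the runtime to load lazily):

```python
from pathlib import Path

# Tagged outputs saved earlier in the session, keyed by "tag#id".
SESSION_STORE = {}

def resolve(ref):
    """Turn a resource reference string into prompt-ready content."""
    if ref.startswith("file://"):
        # Load content from the file system.
        return Path(ref[len("file://"):]).read_text()
    if ref.startswith("context://"):
        # Look up session-based generated content by tag and id.
        key = ref[len("context://"):]
        if key not in SESSION_STORE:
            raise KeyError(f"invalid resource ID: {key}")
        return SESSION_STORE[key]
    if ref.startswith(("image://", "audio://", "video://")):
        # Media stays a reference; the runtime loads it when needed.
        return ref
    # Anything else is a direct string: use the literal value.
    return ref
```

The key property is that the LLM only ever emits the short reference, never the full content, which is where the token savings come from.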

Template Usage

# Basic usage with mixed resource types
response = template_name(
    param1="file://path/to/resource.txt",
    param2="This is direct string content"
)

# Using various tagged output references
response = template_name(
    param1="context://artifact#summary-12345",
    param2="context://thought#reasoning-67890",
    param3="context://candidate-profile"
)

Tagged Outputs & Memory Management

XML Tag Creation and Reference

Any XML tag can be used to create referenceable content. Examples:

<artifact id="summary-123">
Professional developer with 10 years experience...
</artifact>

<thought id="reasoning-456">
This candidate has strong technical skills but limited management experience.
</thought>

<response id="feedback-789">
Your solution correctly implements the algorithm but could be optimized further.
</response>

Reference in subsequent calls:

create_notion_page(title=str, body="context://artifact#summary-123")
follow_up_question(reasoning="context://thought#reasoning-456")
email_template(feedback="context://response#feedback-789")
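One way to populate such a store is to scan each model response for id-bearing XML tags and save their bodies under the `tag#id` keys that `context://` references use. A regex-based sketch (the function name and store shape are assumptions):

```python
import re

# Matches <tag id="...">body</tag> for any tag name, across newlines.
TAG_RE = re.compile(r'<(\w[\w-]*)\s+id="([^"]+)">(.*?)</\1>', re.DOTALL)

def save_tagged_outputs(response_text, store):
    """Save every id-bearing XML tag body under a "tag#id" key."""
    for tag, tag_id, body in TAG_RE.findall(response_text):
        store[f"{tag}#{tag_id}"] = body.strip()
    return store
```

After running this over a response containing `<thought id="reasoning-456">...</thought>`, a later call can pass `"context://thought#reasoning-456"` and get the saved body back instead of repeating it.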

Error Handling

Error                Behavior
Missing file         Return an error with path information
Invalid resource ID  Return an error with the invalid ID
Permission issues    Return a security constraint error
Malformed template   Return a syntax error with details
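These behaviors might map onto a small exception hierarchy so callers can branch on failure type; a sketch (all class names are assumptions, not part of the spec):

```python
class ResourceError(Exception):
    """Base class for resource-resolution failures."""

class MissingFileError(ResourceError):
    def __init__(self, path):
        super().__init__(f"missing file: {path}")

class InvalidResourceIDError(ResourceError):
    def __init__(self, resource_id):
        super().__init__(f"invalid resource ID: {resource_id}")

class ResourcePermissionError(ResourceError):
    def __init__(self, ref):
        super().__init__(f"security constraint violated: {ref}")

class MalformedTemplateError(ResourceError):
    def __init__(self, detail):
        super().__init__(f"template syntax error: {detail}")
```

A single base class lets a template runtime catch `ResourceError` once while still surfacing the specific message the table calls for.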

Comments

Generally, I think that if we can save XML-tagged data as resources and get names back out, we can pass those names around as context far more productively than re-sending the full content.

No One Has Potential But Yourself

I had a conversation with my friend today that shook something loose in my head: no one has potential. Like most of the lies I tell myself, this is obviously false - and yet, sometimes we need these extreme statements to see a deeper truth.

We often combat excess pessimism with excess optimism. We see potential in others and believe they can change. But this is just a projection of our own potential and values and beliefs.

Let me explain.

I want to invite my lawyer, Luke, to talk a little about the legal side of consulting. If you're new, you should also check out our consulting stack post.

In August, Luke officially launched Virgil. Virgil's goal is to be a one-stop shop for a startup’s back office, combining legal with related services that founders often prefer to outsource, such as bookkeeping, compliance, tax, and people operations. They primarily operate on flat monthly subscriptions, allowing startups to focus on what truly moves the needle.

He launched Virgil with Eric Ries, author of The Lean Startup, and Jeremy Howard, CEO of Answer AI. He's able to rely on the Answer AI team to build tools and help him stay informed about AI. He's licensed to practice in Illinois, and the firm has a national presence. That's his background and the essence of what they're building at Virgil.

Decomposing RAG Systems to Identify Bottlenecks

There's a reason Google has separate interfaces for Maps, Images, News, and Shopping. The same reason explains why many RAG systems today are hitting a performance ceiling. After working with dozens of companies implementing RAG, I've discovered that most teams focus on optimizing embeddings while missing two fundamental dimensions that matter far more: Topics and Capabilities.

Those Who Can Do, Must Teach: Why Teaching Makes You Better

"Those who can't do, teach" is wrong. Here's proof: I taught at the Data Science Club while still learning myself. If, in one hour, I help bring a room of 60 people even one week ahead, that's 60 weeks of learning value created - more than a year of value from a single hour. Teaching isn't what you do when you can't perform. It's how you multiply your impact.

It's a duty.

What is Retrieval Augmented Generation?

Retrieval augmented generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by integrating them with external knowledge sources. In essence, RAG combines the generative power of LLMs with the vast information stored in databases, documents, and other repositories. This approach enables LLMs to generate more accurate, relevant, and contextually grounded responses.
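The retrieve-then-generate flow can be sketched end to end. Here retrieval is naive word overlap and "generation" is just a grounded prompt builder - both stand-ins chosen for illustration, not how a production system would score documents:

```python
def retrieve(query, documents, k=2):
    """Score each document by word overlap with the query; keep top-k."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Ground the LLM by pasting retrieved passages above the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (f"Answer using only this context:\n{context}\n\n"
            f"Question: {query}")
```

In practice the overlap score would be replaced by embedding similarity or a hybrid of keyword and vector search, and the prompt would be sent to an LLM; the shape of the pipeline stays the same.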

How to Improve RAG Applications: 6 Proven Strategies

This article explains six proven strategies to improve Retrieval-Augmented Generation (RAG) systems. It builds on my previous articles and consulting experience helping companies enhance their RAG applications.

By the end of this post, you'll understand six key strategies I've found effective when improving RAG applications:

  • Building a data flywheel with synthetic testing
  • Implementing structured query segmentation
  • Developing specialized search indices
  • Mastering query routing and tool selection
  • Leveraging metadata effectively
  • Creating robust feedback loops

How to Get Started in AI Consulting

Picture this: You're sitting at your desk, contemplating the leap into AI consulting. Maybe you're a seasoned ML engineer looking to transition from contractor to consultant, or perhaps you've been building AI products and want to branch out independently. Whatever brought you here, you're wondering how to transform your technical expertise into a thriving consulting practice.