Software Engineering¶

2025/09/04
in Applied AI, Software Engineering
7 min read

How Should We Choose Agent Frameworks and Form Factors?

This is part of the Context Engineering Series. I'm focusing on agent frameworks because understanding form factors and complexity levels is essential before building any agentic system.

What Do We Actually Mean When We Say We Want to Build Agents?

Field note from a conversation with Vignesh Mohankumar, a successful consultant who helps companies navigate AI implementation decisions. Vignesh and I are both AI consultants helping companies build AI systems—he focuses on implementations and workflows, while I help with overall strategy and execution.

When companies say they want to build agents, I focus on practical outcomes. What specific functionality do you need? What business value are you trying to create?

2025/09/04
in Applied AI, Software Engineering
8 min read

How Do We Prototype Agents Rapidly?

This is part of the Context Engineering Series. I'm focusing on rapid prototyping because testing agent viability quickly is essential for good context engineering decisions.

If your boss is asking you to "explore agents," start here. This methodology will give you evidence in days, not quarters.

Most teams waste months building agent frameworks before they know if their idea actually works. There's a faster way: use Claude Code as your testing harness to validate agent concepts without writing orchestration code.

2025/01/06
in Writing and Communication, Software Engineering
4 min read

I Used AI Agents to Add 50+ Cross-Links to My Blog (And You Can Too)

I just had an AI agent read through 100+ of my blog posts and tell me exactly where to add internal links. In 30 minutes, it found connections I'd missed for years. Here's how I did it.

2024/10/29
in Applied AI, Software Engineering, Writing and Communication
4 min read

How to Lead AI Engineering Teams

Have you ever wondered why some teams seem to effortlessly deliver value while others stay busy but make no real progress?

I recently had a conversation that completely changed how I think about leading teams. While discussing team performance with a VP of Engineering who was frustrated with their team's slow progress, I suggested focusing on better standups and more experiments.

That's when Skylar Payne dropped a truth bomb that made me completely rethink everything:

"Leaders are living and breathing the business strategy through their meetings and context, but the people on the ground don't have any fucking clue what that is. They're kind of trying to read the tea leaves to understand what it is."

That moment was a wake-up call.

I had been so focused on the mechanics of execution that I'd missed something fundamental: The best processes in the world won't help if your team doesn't understand how their work drives real value.

In less than an hour, I learned more about effective leadership than I had in the past year. Let me share what I discovered.

2024/10/25
in Applied AI, Software Engineering, Writing and Communication
5 min read

SWE vs AI Engineering Standups

When I talk to engineering leaders struggling with their AI teams, I often hear the same frustration: "Why is everything taking so long? Why can't we just ship features like our other teams?"

This frustration stems from a fundamental misunderstanding: AI development isn't just engineering - it's applied research. And this changes everything about how we need to think about progress, goals, and team management. In a previous article I wrote about communication for AI teams. Today I want to talk about standups specifically.

The ticket is not the feature, the ticket is the experiment, the outcome is learning.

2024/10/15
in Applied AI, Software Engineering, Writing and Communication
9 min read

Effective Communication in AI Engineering: Moving Beyond Vague Updates

The right way to do AI engineering updates

Helping software engineers enhance their AI engineering processes through rigorous and insightful updates.

In the dynamic realm of AI engineering, effective communication is crucial for project success. Consider two scenarios:

Scenario A: "We made some improvements to the model. It seems better now."

Scenario B: "Our hypothesis was that fine-tuning on domain-specific data would improve accuracy. We implemented this change and observed a 15% increase in F1 score, from 0.72 to 0.83, on our test set. However, inference time increased by 20ms on average."

Scenario B clearly provides more value and allows for informed decision-making. After collaborating with numerous startups on their AI initiatives, I've witnessed the transformative power of precise, data-driven communication. It's not just about relaying information; it's about enabling action, fostering alignment, and driving progress.

2024/05/22
in Software Engineering
3 min read

What is prompt optimization?

Prompt optimization is the process of improving the quality of prompts used to generate content. Often by using few shots of context to generate a few examples of the desired output, then refining the prompt to generate more examples of the desired output.

2024/04/08
in Software Engineering
5 min read

Hiring MLEs at early stage companies

Build fast, hire slow! I hate seeing companies make dumb mistakes, especially regarding hiring, and I’m not against full-time employment. Still, as a consultant, part-time engagements are often more beneficial to me, influencing my perspective on hiring. That said, I've observed two notable patterns in startup hiring practices: hiring too early and not hiring for dedicated research. Unfortunately, these patterns lead to startups hiring machine learning engineers to bolster their generative AI strengths, only to have them perform janitorial work for the first six months of joining. It makes me wonder if startups are making easy-to-correct mistakes based on a sense of insecurity in trying to capture this current wave of AI optimism. Companies hire Machine learning engineers too early in their life cycle.¶

Many startups must stop hiring machine learning engineers too early in the development process, especially when the primary focus should have been on app development and integration work. A full-stack AI engineer can provide much greater value at this stage since they're likely to function as a full-stack developer rather than a specialized machine learning engineer. Consequently, these misplaced machine learning engineers often assist with app development or DevOps tasks instead of focusing on their core competencies of training models and building ML solutions.

After all, my background is in mathematics and physics, not engineering. I would rather spend my days looking at data than trying to spend two or three hours debugging TypeScript build errors.

2024/02/20
in Software Engineering
3 min read

Format your own prompts

This is mostly to add onto Hamels great post called Fuck you show me the prompt

I think too many llm libraries are trying to format your strings in weird ways that don't make sense. In an OpenAI call for the most part what they accept is an array of messages.

from pydantic import BaseModel

class Messages(BaseModel):
    content: str
    role: Literal["user", "system", "assistant"]

But so many libaries wanted me you to submit a string block and offer some synatic sugar to make it look like this: They also tend to map the docstring to the prompt. so instead of accessing a string variable I have to access the docstring via __doc__.

2024/01/19
in Software Engineering
7 min read

Tips for probabilistic software

This writing stems from my experience advising a few startups, particularly smaller ones with plenty of junior software engineers trying to transition into machine learning and related fields. From this work, I've noticed three topics that I want to address. My aim is that, by the end of this article, these younger developers will be equipped with key questions they can ask themselves to improve their ability to make decisions under uncertainty.

Could an experiment just answer my questions?
What specific improvements am I measuring?
How will the result help me make a decision?
Under what conditions will I reevaluate if results are not positive?
Can I use the results to update my mental model and plan future work?