Hiring MLEs at early stage companies

Build fast, hire slow! I hate seeing companies make dumb mistakes, especially regarding hiring. I'm not against full-time employment. Still, as a consultant, part-time engagements are often more beneficial to me, and that influences my perspective on hiring. That said, I've observed two notable patterns in startup hiring practices: hiring too early and not hiring for dedicated research. Unfortunately, these patterns lead to startups hiring machine learning engineers to bolster their generative AI strengths, only to have them perform janitorial work for their first six months. It makes me wonder if startups are making easy-to-correct mistakes out of insecurity as they try to capture the current wave of AI optimism.

Companies hire machine learning engineers too early in their life cycle

Many startups must stop hiring machine learning engineers too early in the development process, especially when the primary focus should be on app development and integration work. A full-stack AI engineer can provide much greater value at this stage since they're likely to function as a full-stack developer rather than a specialized machine learning engineer. When a startup hires a specialist anyway, that engineer often ends up assisting with app development or DevOps tasks instead of focusing on their core competencies of training models and building ML solutions.

After all, my background is in mathematics and physics, not engineering. I would rather spend my days looking at data than trying to spend two or three hours debugging TypeScript build errors.

As a data scientist and a machine learning engineer, most of my skills are best suited for a company at a later stage in its cycle, when there might already be two dozen engineers deep into building a product. To even consider joining a company at an early stage, I would have to accept that most of my early responsibilities would be around digging into the application, with no data, no objectives, and no model to work on. At this point, companies shouldn't hire machine learning engineers, and in most cases machine learning engineers shouldn't work for these companies. The exception to this rule would have to be a more ambitious, longer-term technical vision that justifies stepping aside from research. It would have to be so exciting that I would be willing to deal with old errors and study blogging and the bones of the product to get to the more exciting meat of the problem. This diatribe isn't meant to be a soapbox where I whine about early-stage companies. But without the opportunity to focus on research in the near future and drive impact through improving models, good machine learning engineers will realize that they're better suited for another role or another company.

Machine learning engineers are hired too early in a company's life cycle

Many startups make the mistake of hiring machine learning engineers too early in the development process, especially when the primary focus should be on app development and integration work. I think a full-stack AI engineer can provide a lot of value at this stage since they're likely to function as a full-stack developer rather than a specialized machine learning engineer. Machine learning engineers hired at this stage often find themselves assisting with app development or DevOps tasks instead of focusing on their core competencies of training models and building ML solutions.

As a data scientist and a machine learning engineer, most of my skills are best suited for a slightly later-stage company, when there might already be two dozen engineers deep into building out the product. For me to even consider joining a company at an early stage, I would have to accept that most of my early responsibilities would be around digging into the application, which can be a drag. No data, no objectives, no model. To consider any of these early-stage companies, I would really have to be sold on a more ambitious, longer-term technical vision and on a problem significant enough to justify stepping aside from research. It would have to be so exciting that I would be willing to deal with old errors and studying blogging and all the boring stuff first in order to get to the more exciting meat of the problem. But without the opportunity to focus on research in the near future and drive impact through improving these models, I think really good machine learning engineers will likely realize that they're better suited for another role or another company rather than having to flex into traditional machine learning.

Lack of Dedicated Research Teams Hinders AI Progress

The other pattern is that engineering teams often excel at crafting impressive demos that attract attention and gain popularity. However, without a dedicated research team in place, this success frequently leads to significant challenges. Highly motivated engineers behind these demos find themselves overwhelmed by day-to-day engineering tasks, including bug fixes and maintenance, which may not always pertain directly to AI.

This is generally an issue because there are a lot of folks who have caught the bug of generative AI and are letting it distract them. These are people who could go above and beyond to really figure out and understand what it is they want to build, how to build it, and how it could be impactful for the business. Now, they find themselves torn between their passions for advancing AI capabilities and the need to keep the existing core products running smoothly.

Simultaneously, it may be challenging for startups to justify hiring a dedicated machine learning engineer at this stage. Without a clear division between research and engineering tasks, there may not be enough specialized work to warrant a full-time ML engineer, as much of their time would be spent on general engineering responsibilities.

!!! note "This is one of the benefits of hiring part-time consultants!"

To address this issue, startups should carefully consider the timing of their machine learning hires and ensure that there is sufficient data and infrastructure in place to support their work. If companies allow engineers who've caught the AI bug, and who are already familiar with these systems and infrastructure, to be entrepreneurial and lead the development of these teams, they could move very quickly given the tools that OpenAI and Anthropic provide us. A dedicated research team, even a small one, can help maintain the momentum of AI research and development as the company scales and faces increased engineering demands.

It's essential to recognize that for many companies, the initial AI work will likely focus on integration rather than pure research. This presents itself as a tradeoff and potential deterrent for onboarding machine learning talent. However, having team members who are strong developers and genuinely interested in AI will be crucial for the company's long-term success. By finding the right balance between research and engineering, startups can lay the foundation for sustained AI innovation and growth.
