Things that kill MLOps
Most of today’s ML models are built using public datasets and trained in notebooks. No surprise these models are hard to productize. Operations require automation. If the development loop is not automated, it’s broken. The good news: MLOps solves it. The bad news: there’s a lot of fluff.
One thing that I find annoying about MLOps is the attempt to solve the problem using “familiar” tools. One of my favorite books about MLOps gives a great number of tips on how to use Git, CI/CD, Docker, and Terraform for automating machine learning. As much as I love the practical aspect of the book, I find this approach a workaround at best. How so?
First of all, none of these tools was designed for data. Git is not designed to deal with data. CI/CD is not designed to orchestrate data workflows. Docker and Terraform aren’t made for data scientists. Just putting these tools together hardly makes things easier. What it surely does is add complexity. Now data scientists have to collaborate with engineers and deal with the tension this collaboration brings.
Okay, imagine Git and CI/CD did support data. Would that make them work? Let’s take DevOps as an example. DevOps made it possible for developers to deploy software without involving the IT department. This shortened the feedback loop. As a developer, you write software, you build it, and you deploy it. The equivalent question for MLOps is: can we make data scientists capable of deploying models without involving engineers? Certainly not by using “familiar” tools, which were built for engineers in the first place.
Another annoying thing about MLOps is the so-called end-to-end platforms. You may have seen them under the names SageMaker, Vertex AI, Databricks (and of course many others). The vendor’s pitch is that because all of its products share the same platform, they can be integrated more tightly and offer a better product experience. That is what customers are told. What the vendor also knows is that offering multiple solutions on one platform locks the customer into one ecosystem. Often this happens not because end-to-end platforms set out to lock you in. It happens because they want to maximize profits, and to do that, they have to accumulate features.
Thinking of it reminds me of Dieter Rams’s 10 principles of good design, which he presents in his book “Less but Better”. One of them: good design is honest. It does not make a product more innovative, powerful, or valuable than it really is. It does not attempt to manipulate the consumer with promises that cannot be kept. Most of the major vendors commit the sin of accumulating features. While this is a valid business tactic to maximize profit, it is also a weak spot of these platforms: with time, their user experience degrades as the features pile up.
Lack of aesthetics
When it comes to tools, I really do miss the aesthetics of the “old school”. And I’m pretty sure it’s not just me. Developer tools benefit from aesthetics more than from anything else. Just putting multiple things together doesn’t make a better tool. If a tool isn’t thoroughly designed or isn’t consistent, it’s very unlikely that you’ll enjoy using it, regardless of how many features it has.
Take Git or Docker. If those tools were not elegant, do you think everybody would use them today? Aesthetics is more than simplicity. It’s also about the consistency that helps you learn and understand the tool quickly.