What Is Machine Learning, When to Use It, When Not To, and Top Enterprise Use Cases of ML
My notes from ‘Designing Machine Learning Systems’ by Chip Huyen
I’ve recently started reading Chip Huyen’s book ‘Designing Machine Learning Systems’. I’ve really enjoyed it so far and thought I’d share some of my notes! As some of you may know, I love experimenting with Python ML/AI packages and usually share the results on my Twitter page.
When I post, I often get messages from people curious about whether they can use a package or model in production. For me, it’s been fun and enlightening reading Chip's book. This series of notes will hopefully be useful to anyone curious about how ML systems are built and deployed. Of course, if you’d like more in-depth information, you can buy the book here. I won’t be sharing everything I read but will try to write about the things I think are particularly helpful. In this post I’ll share some of the things from my notes on chapter 1!
What is a Machine Learning System
A misconception many people have about Machine Learning Systems is that they are just ML algorithms, e.g., logistic regression or different types of neural networks. In fact, Machine Learning Systems include:
- Business requirements
- An interface for ML system users and developers
- Deployment, monitoring, and updating of logic
- Feature engineering
- ML algorithms
- Evaluation
- The data stack
- Infrastructure that enables the delivery of that logic
Here’s a visual of the different components of an ML system that the book covers:
Huyen, Chip. Designing Machine Learning Systems (Fig. 1.1, p. 21). O'Reilly Media. Kindle Edition.
Regardless of which ML algorithm you use, this framework should still work!
When to use Machine Learning
I found this definition of machine learning from the book particularly helpful to keep in mind:
'Machine learning is an approach to (1) learn (2) complex patterns from (3) existing data and use these patterns to make (4) predictions on (5) unseen data.'
Huyen, Chip. Designing Machine Learning Systems (p. 22). O'Reilly Media. Kindle Edition.
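To make the five parts of that definition concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the book: the data points are invented, the function names `fit_line` and `predict` are hypothetical, and a straight line stands in for the 'complex pattern' a real model would learn.

```python
# Toy data, invented for illustration only: (square_feet, nightly_price_usd).
existing_data = [(400, 80.0), (600, 110.0), (800, 140.0), (1000, 170.0)]


def fit_line(points):
    """Learn slope and intercept for y = slope * x + intercept (least squares)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, square_feet):
    """Apply the learned pattern to an input the model has not seen."""
    slope, intercept = model
    return slope * square_feet + intercept


# (1) learn (2) a pattern from (3) existing data ...
model = fit_line(existing_data)

# ... and use it to make (4) a prediction on (5) unseen data.
unseen_square_feet = 700
predicted_price = predict(model, unseen_square_feet)
print(f"Predicted nightly price: ${predicted_price:.2f}")
```

In a real system the pattern would be far more complex and the model would come from an ML library, but the learn-then-predict shape stays the same.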
Machine learning cannot solve every problem, and even where it can, it is often not the best solution. Following the definition, an ML system needs to be able to learn, and the data it learns from needs to contain complex patterns. For example, ML could be a good fit for predicting the price of an Airbnb listing based on square footage, number of rooms, neighborhood, etc. It is not appropriate for sorting a list of Airbnb listings into states when the list already comes with zip codes, since that pattern is simple enough for a lookup table.
It’s also important to note that existing data needs to be available, or at least possible to collect. ML models solve problems that require predictive answers, where to predict is to estimate a value in the future. You can reframe most problems predictively: for example, asking not ‘who won the game’ but ‘who will win the game.’
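The zip code case is worth sketching, because it shows what a 'simple pattern' looks like in practice: a plain lookup does the job with no training at all. This is my own hedged illustration, not something from the book. The prefix table below is deliberately tiny and incomplete, and the listing names and the helpers `state_for_zip` and `group_by_state` are hypothetical.

```python
# A tiny, deliberately incomplete table of 3-digit zip prefixes (illustration only).
ZIP_PREFIX_TO_STATE = {
    "100": "NY",  # Manhattan
    "606": "IL",  # Chicago
    "787": "TX",  # Austin
    "941": "CA",  # San Francisco
}


def state_for_zip(zip_code):
    """Return the state for a zip code, or None if its prefix is not in the table."""
    return ZIP_PREFIX_TO_STATE.get(zip_code[:3])


def group_by_state(listings):
    """Group (name, zip_code) pairs by state using only the lookup table."""
    groups = {}
    for name, zip_code in listings:
        groups.setdefault(state_for_zip(zip_code), []).append(name)
    return groups


# Made-up listings: (name, zip_code).
listings = [
    ("Loft", "10001"),
    ("Bungalow", "78701"),
    ("Studio", "94103"),
    ("Flat", "10013"),
]

grouped = group_by_state(listings)
for state, names in grouped.items():
    print(state, names)
```

There is nothing to learn here: the rule is known in advance and never changes with new data, which is exactly the situation where a simpler solution beats ML.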
Here are 4 characteristics of a problem where ML will shine:
- It’s repetitive
- The cost of wrong predictions is cheap
- It’s at scale
- The patterns are constantly changing
Here are 3 situations where you should not use ML:
- When it’s unethical
- When there are simpler solutions that do the trick
- When it’s not cost-effective
The Most Common Ways Enterprises are Using ML
One of my favorite images from the first chapter was this diagram from Algorithmia’s 2020 state of enterprise machine learning survey. Huyen notes that even though the market for consumer applications is growing, the majority of ML use cases are still in the enterprise world.
ML applications in enterprises 'serve internal use cases (reducing costs, generating customer insights and intelligence, internal processing automation) and external use cases (improving customer experience, retaining customers, interacting with customers.)' Huyen, Chip. Designing Machine Learning Systems (p. 32). O'Reilly Media. Kindle Edition.
The figure below from 2020 shows us that ‘reducing costs’ was the main way corporations used ML, followed by ‘generating customer insights/intelligence.’
Figure 1-3. 2020 state of enterprise machine learning. Source: Adapted from an image by Algorithmia. Huyen, Chip. Designing Machine Learning Systems (p. 33). O'Reilly Media. Kindle Edition.
The full report can be found here and Algorithmia’s report on 2021 Enterprise ML Trends can be found here.
This is the first in a series of posts I’ll share here on my website. As I mentioned earlier, if you’d like more information on any of this you can buy the book here. Thanks for reading and follow my page for more!