How do we deploy machine learning algorithms? Let's go beyond the Jupyter notebook. This is by no means an exhaustive list, but this article outlines a few of the major paradigms, high-level architectures, and related tools. For each type of deployment, I have listed some typical architectures and use cases for those architectures, along with a list of tools to investigate alongside them.
On Call
Machine learning deployments that are "on call" are designed to provide predictions only on an occasional basis. "On call" deployments make predictions on groups of observations at one time, and the predictions are then stored in a database for downstream use cases. A minimal sketch of this pattern follows the lists below.
- Use cases
- Predictions are needed on a schedule rather than in real time
- Predictions are stored in a database and consumed by downstream applications
- Typical Architectures
- Data Warehouse
- Batch Processing
- Batch API Call
- Common Tools
- H2O
- Spark
- Apache Airflow
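To make this concrete, here is a minimal sketch of an "on call" batch scoring job in Python. The model path, table names, and connection URL are hypothetical; it assumes a pickled scikit-learn model and a SQL database reachable through SQLAlchemy.

```python
import pickle

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical locations; in practice these come from configuration or the scheduler.
MODEL_PATH = "model.pkl"
DB_URL = "postgresql://user:password@host/db"

def run_batch_scoring():
    """Score a group of observations at once and store the predictions."""
    engine = create_engine(DB_URL)

    # Load the trained model (assumed to be a pickled scikit-learn estimator).
    with open(MODEL_PATH, "rb") as f:
        model = pickle.load(f)

    # Pull the batch of observations that still need predictions.
    observations = pd.read_sql(
        "SELECT id, feature_1, feature_2 FROM new_observations", engine
    )

    # Predict on the whole batch in one call.
    features = observations[["feature_1", "feature_2"]]
    observations["prediction"] = model.predict(features)

    # Persist predictions to a table that downstream applications read from.
    observations[["id", "prediction"]].to_sql(
        "predictions", engine, if_exists="append", index=False
    )

if __name__ == "__main__":
    run_batch_scoring()
```

A scheduler such as Apache Airflow would typically trigger a job like this on a cadence, for example by wrapping run_batch_scoring in a PythonOperator task.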
On Demand
Prediction services are always available and provide predictions in real time. A minimal sketch of this pattern follows the lists below.
- Use cases
- Users interact with predictions on individual observations
- Storing the observation is either infeasible or undesirable
- You need demand-based auto-scaling and/or high availability
- Typical Architectures
- REST API
- Microservice
- Deployment depends on load
- Common Tools
- Flask / FastAPI
- Docker
- Kubernetes
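Here is a minimal sketch of the REST API pattern using Flask; the framework, the /predict route, and the payload shape are illustrative assumptions, and the model is again assumed to be a pickled scikit-learn estimator.

```python
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the model once at startup so each request only pays for inference.
# "model.pkl" is a hypothetical path to a pickled scikit-learn estimator.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    """Return a real-time prediction for a single observation."""
    payload = request.get_json()
    # Assumes the client sends {"features": [f1, f2, ...]} for one observation.
    features = [payload["features"]]
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

In production, a service like this is usually containerized and placed behind a load balancer or an orchestrator such as Kubernetes so that it can scale with demand and stay highly available.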
On Edge
Prediction services are deployed on the device itself, without connecting to any external services or applications. A minimal sketch of this pattern follows the lists below.
- Usecases
- Speed of Inference is important
- Privacy
- Lack of Internet Connectivity
- Typical Architecture
- Depends on the device and algorithm
- Typically functions as a package / library
- Common Tools
- TensorFlow Lite
- Core ML
- ONNX Runtime
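As one illustration of the package/library pattern, here is a minimal sketch of on-device inference with TensorFlow Lite. The model file name is hypothetical, and the same idea carries over to Core ML or ONNX Runtime on other platforms.

```python
import numpy as np
import tensorflow as tf

# Hypothetical model file shipped inside the application package.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def predict(observation: np.ndarray) -> np.ndarray:
    """Run inference entirely on-device; no network calls are made."""
    # Cast the observation to the dtype and shape the model expects.
    tensor = observation.astype(input_details[0]["dtype"]).reshape(
        input_details[0]["shape"]
    )
    interpreter.set_tensor(input_details[0]["index"], tensor)
    interpreter.invoke()
    return interpreter.get_tensor(output_details[0]["index"])
```

Because everything runs locally, inference stays fast, the observation never leaves the device, and no internet connection is required.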