ML Pipeline Templates provide step-by-step guidance on implementing typical machine learning scenarios. Each template introduces a machine learning project structure that lets you modularize data processing, model definition, model training, validation, and inference tasks. Using distinct steps makes it possible to rerun only the steps you need as you tweak and test your workflow. A well-defined, standard project structure helps all team members understand how a model was created.
ML pipeline templates are based on popular open source frameworks such as Kubeflow, Keras, and Seldon to implement end-to-end ML pipelines that can run on AWS, on-premises hardware, and at the edge. Kubeflow is an open source ML platform dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable. Kubeflow Pipelines is the part of the Kubeflow platform that enables execution of data science workflows on Kubeflow, integrated with experimentation and notebook-based experiences. Kubeflow Pipelines services on Kubernetes include the hosted metadata store, a container-based orchestration engine, a notebook server, and a UI to help users develop, run, and manage complex ML pipelines at scale. The Kubeflow Pipelines SDK allows for the creation and sharing of components and the programmatic composition of pipelines.
The following Agile Stacks Kubeflow extensions are used in this tutorial to address common ML Ops challenges:
Machine learning (ML) is increasingly used in real-world systems, such as autonomous vehicles, voice recognition, language translation, and many others. However, to generate a positive ROI with ML, it needs to be operationalized (deployed into production). Since machine learning is still a new field, early adopters have run into obstacles deploying machine learning into production due to friction with engineering and IT teams. In some cases, work done by data scientists and machine learning engineers is wasted because it never escapes the lab due to technical constraints, or can't be scaled to larger datasets.
To address these problems, Machine Learning can take a page from a DevOps playbook. In order to deliver business value and to build intelligent software products, data science teams need to focus on the following priorities:
Machine Learning Pipelines play an important role in building production-ready AI/ML systems. Using ML pipelines, data scientists, data engineers, and IT operations can collaborate on the steps involved in data preparation, model training, model validation, model deployment, and model testing. Machine learning pipelines address two main problems of traditional machine learning model development: long cycle time between training models and deploying them to production, which often includes manually converting the model to production-ready code; and using production models that had been trained with stale data. There are many common steps in ML pipelines that should be automated and tracked. For example, model validation steps must be tracked to help data scientists evaluate model accuracy and pick the optimal hyperparameters. Agile Stacks provides reusable machine learning pipelines that can be used as a template for your machine learning scenarios.
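To make the tracking idea concrete, here is a minimal, framework-agnostic sketch (pure Python, not Agile Stacks code; a real pipeline would use the Kubeflow metadata store) of recording validation metrics per run so hyperparameter choices can be compared later:

```python
# Minimal sketch of tracking validation metrics across pipeline runs.
# A real pipeline would record these in the Kubeflow metadata store instead.
import json
import time

def record_run(log_path, hyperparams, metrics):
    """Append one training run's hyperparameters and validation metrics."""
    entry = {"timestamp": time.time(),
             "hyperparams": hyperparams,
             "metrics": metrics}
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

def best_run(log_path, metric="val_accuracy"):
    """Pick the run with the highest value of the given validation metric."""
    with open(log_path) as f:
        runs = [json.loads(line) for line in f]
    return max(runs, key=lambda r: r["metrics"][metric])

record_run("runs.jsonl", {"lr": 0.01}, {"val_accuracy": 0.81})
record_run("runs.jsonl", {"lr": 0.001}, {"val_accuracy": 0.88})
print(best_run("runs.jsonl")["hyperparams"])  # {'lr': 0.001}
```

Because every run is recorded with its hyperparameters, picking the best configuration is a query rather than guesswork.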
The following diagram shows a typical ML Pipeline. It provides a platform for running machine learning experiments: making it easy for you to try numerous ideas and techniques and manage your various trials/experiments. The arrows indicate that machine learning projects are highly iterative. As you progress through pipeline steps, you will find yourself iterating on a step until reaching desired model accuracy, then proceeding to the next step. Here is an excellent blog by Jeremy Jordan that discusses machine learning workflow in more detail. The workflow is not complete even after you deploy the model into production: you get feedback from real-world interactions and repeat data preparation and training steps.
In this tutorial we will cover how to leverage Kubeflow Pipeline templates to get your ML experiments from the lab into the real world as quickly as possible. Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Kubernetes. We use Kubernetes for automating deployment, scaling, and management of containerized applications. Kubernetes has evolved into the de facto industry standard for container management and for machine learning at scale. There are many reasons why Kubernetes provides the best platform for machine learning at scale: repeatable experiments, portable and reproducible environments, efficient utilization of CPUs/GPUs, tracking and monitoring of metrics in production, and proven scalability.
Based on the Agile Stacks Kubeflow Pipeline template, we will implement a machine learning pipeline for training, monitoring, and deployment of deep learning models. We will use popular open source frameworks such as Kubeflow, Keras, and Seldon to implement end-to-end ML pipelines. The Kubeflow project is designed to simplify the deployment of machine learning frameworks such as Keras and TensorFlow on Kubernetes. These frameworks can leverage multiple GPUs in the Kubernetes cluster for machine learning tasks. We will also cover how you can use Kubeflow pipelines to continuously deploy models to production and retrain models on real-life data.
You can use the Agile Stacks Machine Learning Stack to create ML pipelines from several reusable templates. When building pipelines with Agile Stacks ML Pipeline Templates, you can focus on machine learning rather than on infrastructure. In this step you will create an environment template for a Kubernetes cluster and use it to deploy a new Kubernetes cluster in your own cloud account. Creating the cluster takes about 15 minutes. Agile Stacks demo environments typically contain a pre-deployed cluster so you don't have to wait. View the list of existing clusters by clicking Stacks > Instances.
If you are working on this tutorial during a workshop, please follow the instructions to log in to the development environments provided at the start of the workshop. If you are going through this tutorial outside of a workshop, you need to provision a development environment in your own AWS or GCP cloud account. You can register for an Agile Stacks trial account or subscribe via AWS Marketplace.
Once you receive an activation email, log in to the Agile Stacks Console and create a cloud account by clicking Cloud > Cloud Accounts > Create. Then create an environment: Cloud > Environments > Create. For example, name your environment DEV and associate it with the cloud account you just added. Now we can create a stack template: Templates > Create. Give your stack template a unique name, and select the following stack options from the catalog: Kubernetes, Dashboard, Harbor, Minio, Prometheus, ACM (or Let's Encrypt), Kubeflow.
Click Build Now: this saves the stack template in Git as a set of Terraform and CloudFormation scripts and deploys it into your cloud account. On the next page, name your Kubernetes cluster, for example dev.demo05.superhub.io. For this tutorial, you can use the default settings for the count and types of Kubernetes nodes. Note that if the "On Demand" check box is left unchecked, Kubernetes nodes are deployed on AWS spot instances, which offer a cost-effective way to run Kubernetes. Later in this workshop, you can add a few GPU nodes to this cluster in order to train the model on a large number of training samples. Refer to the following screenshot for recommended settings. Now wait 10-15 minutes for the Kubernetes cluster to be deployed.
As you work through this tutorial, you will use billable components of AWS or GCP, in the range of $1 to $2 per hour. To minimize costs, follow the cleanup instructions in step 7 when you've finished with the tutorial.
Note that in order to provide sufficient CPU and RAM for Jupyter notebooks, you need to select at least one Worker node of size r4.xlarge or larger. Later on you can add a GPU node to train the model faster.
After your Kubernetes cluster is deployed, create a new ML pipeline from a template by clicking Stacks > Applications > Install. Select the "Machine Learning" template, and then Kubeflow (Keras and Seldon). Compose your machine learning application: select the previously created environment (DEV) and use the following options:
Source code repository: Github-demos or your own Git token
Kubeflow engine: Kubeflow
Artifact object storage: Minio
Docker registry for distributed training: Harbor
Jupyter Kernel: reuse kernel from AgileStacks
Application Name: "kfp1"
Bucket Name: "kfp1"
Click the Install button. After the application is created, you will receive 3 URLs:
Source code: Git repository URL where ML Pipeline source code is stored
Notebook: URL to the pipeline editor
Bucket: URL to the object storage bucket where all training data and model files are stored
Click the Notebook button to open the Kubeflow Pipeline Workbench in Jupyter. Alternatively, navigate to your stack instance by clicking Stacks > Instances > List, select your stack instance, and click the Kubeflow button.
Click Activity (top left icon), and then click Notebooks. You should see a list of Jupyter notebooks for each pipeline template that was created. Select your Notebook server name and click Connect.
Jupyter notebook integrates code and its output into a single document that combines visualizations, narrative text, mathematical equations, and other rich media. The intuitive workflow promotes iterative and rapid development, making notebooks a preferred tool for many data scientists. You can review all steps of the machine learning pipeline by browsing Python files in workspace > src folder.
A well-organized machine learning codebase should modularize data processing, model definition, model training, validation, and inference tasks. A step is a computational unit in the pipeline; using distinct steps makes it possible to rerun only the steps you need as you tweak and test your workflow. A well-defined, standard project structure helps all team members understand how a model was created.
Data sources and intermediate data are reused across the pipeline, which saves compute time and resources.
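The modular structure can be sketched as distinct functions, one per pipeline step, so each can be rerun independently. This is a pure-Python illustration with stand-in logic, not the template's actual training code:

```python
# Illustrative skeleton of a modular ML codebase: each pipeline step is a
# separate, independently rerunnable unit with explicit inputs and outputs.

def prepare_data(raw_records):
    """Data processing: clean raw records and split into train/validation sets."""
    cleaned = [r.strip().lower() for r in raw_records if r.strip()]
    split = int(len(cleaned) * 0.8)
    return cleaned[:split], cleaned[split:]

def train(train_set):
    """Model training: a stand-in 'model' that memorizes the training vocabulary."""
    return {"vocab": set(" ".join(train_set).split())}

def validate(model, val_set):
    """Validation: fraction of validation tokens the model has seen before."""
    tokens = " ".join(val_set).split()
    known = sum(1 for t in tokens if t in model["vocab"])
    return known / max(len(tokens), 1)

def predict(model, text):
    """Inference: flag tokens the model has never seen."""
    return [t for t in text.lower().split() if t not in model["vocab"]]

# Each step's output feeds the next explicitly, so you can rerun just one
# step (e.g. validate) without repeating the others.
train_set, val_set = prepare_data(
    ["Fix the build", "Update docs", "", "Fix tests", "Add CI"])
model = train(train_set)
score = validate(model, val_set)
```

The same shape carries over to the real pipeline, where each function becomes a containerized step reading and writing the shared S3 bucket.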
workspace
├── training-latest                 <- Data volume (S3 bucket) where training data and model are stored
├── components                      <- Project components used for pipeline steps such as training
├── nbextentions                    <- Notebook extension libraries that implement common tasks
├── 101-training.ipynb              <- Wrapper notebook for initial experimentation and fine-tuning the model
├── 201-distributed-training.ipynb  <- Wrapper notebook for training at scale using containers and a Kubeflow pipeline
├── 201-data-pipeline.ipynb         <- Example notebook that implements pipeline steps in Go
└── README.md                       <- The top-level README for developers using this project
After you train your model, you can deploy it to get predictions on test data or real-world data that the model has never seen before. We are going to package the model as a Docker image and deploy it as a REST API. In addition, we are going to create a simple web application to send requests to the model for predictions.
Seldon is a great framework for deploying and managing models in Kubernetes. Seldon makes models available as REST APIs or gRPC endpoints and helps you deploy multiple versions of the same model for A/B testing or multi-armed bandits. Seldon takes care of scaling your models and keeping them all running behind a standard API.
Open the file "serving.py" to look at the code we use to deploy the trained model files and publish the model via an API that generates predictions. We are going to package this code in a Docker container and deploy it as a Seldon API. In one notebook step we create the Dockerfile, and in a later step we deploy the model to the Kubernetes cluster by running a kubectl command from the Jupyter notebook. Note that "templates/seldon.yaml" provides a template for the Kubernetes deployment file. All environment-specific details are injected automatically via parameters: model name, model version, path to model files, and Seldon authentication secrets. The Keras model is referenced from the Seldon container via the MODEL_FILE parameter with value "/mnt/s3/latest/training/training1.h5". You can examine the seldon.yaml file, but you don't need to change any parameters.
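Seldon's Python wrapper convention is a class exposing a `predict(X, feature_names)` method. The sketch below follows that shape, but replaces the Keras-specific model loading from MODEL_FILE with a stand-in callable, so it is an illustration of the interface rather than the actual serving.py:

```python
# Sketch of a Seldon-style model wrapper. Seldon Core's Python convention is a
# class with a predict(X, feature_names) method. Here the Keras model loading
# (load_model on the MODEL_FILE path) is replaced by a stand-in callable so
# the shape of the code is clear without the trained model files.

class IssueSummarizer:
    def __init__(self, model=None):
        # Real version (sketch): load the trained Keras model from the
        # MODEL_FILE environment variable, e.g.
        #   self.model = keras.models.load_model(os.environ["MODEL_FILE"])
        self.model = model or (lambda texts: [t[:40] for t in texts])

    def predict(self, X, feature_names=None):
        """X is a batch of inputs; Seldon calls this for each REST/gRPC request."""
        return self.model(X)

summarizer = IssueSummarizer()
titles = summarizer.predict(
    ["The build fails on Windows when paths contain spaces"])
```

Packaged in a Docker container, a class like this is what Seldon wires to the REST and gRPC endpoints.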
When prompted for SELDON_OAUTH_KEY while running the Jupyter notebook, you can enter any random string; it is used to initialize the auth key for the REST API. The key secures internal access from web applications to the model inference REST API. When you enter the key, it is stored in a Kubernetes secret.
The following screen shows the output from model deployment step 3.2. You can test the model by sending it some test data for predictions using seldon.prediction API as shown in step 3.3.
It takes a few minutes to deploy the model via Seldon - if you run validation too soon, you may get an error. Just wait 2-3 minutes and try sending a test request to Seldon again. Step 3.3 provides a code sample you can use to test a Keras model deployed via Seldon. You can edit the value of test_payload to send several test payloads to your model.
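A request to the Seldon REST endpoint sends JSON in Seldon's standard message format, which wraps the input batch in `{"data": {"ndarray": [...]}}`. The helper below builds such a test payload; the endpoint URL and OAuth token handling are environment-specific and not shown:

```python
# Build a Seldon-format prediction payload. Seldon's standard REST message
# wraps the input batch in {"data": {"ndarray": [...]}}. The endpoint URL
# and OAuth token exchange are environment-specific and omitted here.
import json

def build_prediction_payload(issue_texts):
    """Wrap a batch of issue texts in Seldon's prediction message format."""
    return json.dumps({"data": {"ndarray": [[t] for t in issue_texts]}})

test_payload = build_prediction_payload(
    ["App crashes when I click the save button twice"])
# POST this body to the Seldon prediction endpoint, e.g. with the requests
# library: requests.post(prediction_url, data=test_payload, headers=...)
```

Editing the text passed to `build_prediction_payload` is equivalent to editing test_payload in step 3.3.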
Now that the model is deployed as a REST API, you will build a simple Python Flask application to provide a web UI for end users to interact with the model. The test application pulls a random issue from GitHub when a user clicks the 'Populate Random Issue' button. When a user clicks the 'Generate Title' button, the application sends the issue text to the model via the REST API to generate the issue summary. In step 4.1 you will build an application container. The source code for the web application is available in a Python file that you can view or edit from the Jupyter notebook: workspace / components / flask / app / src / app.py.
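A minimal version of such a Flask app might look like the sketch below. The route name and `predict_title` helper are illustrative placeholders; in the real app.py, the prediction call is an HTTP request to the Seldon REST API:

```python
# Minimal sketch of a Flask front end for the model. predict_title() is a
# placeholder for the HTTP request the real app sends to the Seldon REST
# API; the route and function names here are illustrative only.
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_title(issue_text):
    # Placeholder: the real app POSTs issue_text to the Seldon endpoint
    # and returns the generated title from the response.
    return issue_text.split(".")[0][:60]

@app.route("/generate-title", methods=["POST"])
def generate_title():
    issue_text = request.get_json().get("body", "")
    return jsonify({"title": predict_title(issue_text)})

# To run locally: app.run(host="0.0.0.0", port=5000)
```

The web UI then only needs a form that POSTs the issue body to /generate-title and displays the returned title.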
Once the application is deployed you can access it using the following URL:
We have completed an end-to-end ML pipeline that supports production ML lifecycle management.
For more information about Kubeflow, visit https://www.kubeflow.org.
The code for this tutorial is on GitHub (Kubeflow Extensions), and it’s available for one-click deployment as ML pipeline template “Kubeflow Pipeline” on Agile Stacks. While you can run this tutorial on any Kubernetes cluster, the easiest way to create a lab environment is with Agile Stacks.
In this tutorial we implemented a complete machine learning pipeline from data preparation to model training, and serving. Feel free to experiment with it and adapt it to your needs. The objective is to make your machine learning projects more agile, make iterations faster, and models better.