Running Distributed TensorFlow on Compute Engine
This lab shows you how to use a distributed configuration of TensorFlow 1.x on multiple Compute Engine instances to train a convolutional neural network model using the MNIST dataset. The MNIST dataset is a collection of labeled images of handwritten digits, and is widely used in machine learning as a training set for image recognition.
TensorFlow is Google's open source library for machine learning, developed by researchers and engineers in Google's Machine Intelligence organization, which is part of Research at Google. TensorFlow is designed to run on multiple computers to distribute training workloads. For this lab you will run TensorFlow 1.x on multiple Compute Engine virtual machine instances to train the model. Alternatively, you can use Cloud Machine Learning Engine (Cloud ML Engine), which manages resource allocation tasks for you and can host your trained models. We recommend that you use Cloud ML Engine unless you have a specific reason not to. You can learn more in the lab that uses Cloud ML Engine and Cloud Datalab.
The following diagram describes the architecture for running a distributed configuration of TensorFlow 1.x on Compute Engine, and using Cloud ML Engine with Cloud Datalab to execute predictions with your trained model.
This Qwiklab shows you how to set up and use this architecture, and explains some of the concepts along the way.
Set up Compute Engine to create a cluster of virtual machines (VMs) to run TensorFlow 1.x.
Learn how to run the distributed TensorFlow 1.x sample code on your Compute Engine cluster to train a model.
Deploy the trained model to Cloud ML Engine to create a custom API for predictions and then execute predictions using a Cloud Datalab notebook.
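To make the idea of a cluster of VMs more concrete, here is a minimal sketch of how each node can be told its role. It assumes the training code uses the TensorFlow 1.x Estimator API, which reads the cluster layout from the `TF_CONFIG` environment variable; the lab's own sample code may configure the cluster differently. The helper `build_tf_config`, the instance names, and port 2222 are hypothetical and chosen only for illustration.

```python
import json


def build_tf_config(master, workers, ps, task_type, task_index):
    """Return a TF_CONFIG JSON string describing one node of the cluster.

    Hypothetical helper: the addresses passed in are placeholders, not
    values taken from the lab.
    """
    cluster = {"master": [master], "worker": list(workers), "ps": list(ps)}
    if task_type not in cluster:
        raise ValueError("unknown task type: %s" % task_type)
    if not 0 <= task_index < len(cluster[task_type]):
        raise ValueError("task index out of range for %s" % task_type)
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": task_index},
    })


# Assumed layout: one master, two workers, one parameter server.
config = build_tf_config(
    master="master-0:2222",
    workers=["worker-0:2222", "worker-1:2222"],
    ps=["ps-0:2222"],
    task_type="worker",
    task_index=1,
)
print(config)
```

Under this assumption, every VM receives the same `cluster` section and differs only in its `task` entry, so the string would be exported as `TF_CONFIG` on each instance before the training script starts.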