Train a TensorFlow model in the cloud

In this tutorial, we will train a TensorFlow model on the MNIST dataset using an Azure Deep Learning Virtual Machine.

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples.

Prerequisites

Before you begin, ensure you have the following installed and configured:

Download sample code

Download the GitHub repository containing samples for getting started with deep learning across TensorFlow, CNTK, Theano, and more.

Set up an Azure Deep Learning Virtual Machine

Follow the instructions for setting up a Deep Learning Virtual Machine.

Note

Set Location to US West 2 (or another region that offers the Deep Learning VM) and OS type to Linux.

Update .bashrc to Enable Remote Job Submission via Non-interactive Bash Session

Log in to your Deep Learning VM using an SSH client such as PuTTY. Run the command below to modify your .bashrc file so that remote deep learning job submission works; it makes non-interactive sessions behave as if you had logged in to the VM.

printf '. /etc/profile\n%s\n' "$(cat ~/.bashrc)" > ~/.bashrc
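
The command above prepends the line every time it runs, so running it twice leaves two copies in the file. As a minimal sketch, assuming a POSIX shell on the VM, the helper below adds the line only when it is missing and keeps a backup first. The function name ensure_profile_sourced and the .bak suffix are illustrative choices, not part of the Deep Learning VM or the Visual Studio tooling.

```shell
# Sketch: make ". /etc/profile" the first line of a bashrc file, once.
# ensure_profile_sourced is a hypothetical helper name used for illustration.
ensure_profile_sourced() {
  rc_file="${1:-$HOME/.bashrc}"
  line='. /etc/profile'

  # Create the file if it does not exist yet.
  [ -f "$rc_file" ] || : > "$rc_file"

  # Nothing to do when the first line already sources /etc/profile.
  if [ "$(head -n 1 "$rc_file")" = "$line" ]; then
    return 0
  fi

  # Keep a backup, then write the new first line followed by the old content.
  # cat copies the old content byte for byte, so backslashes are left alone.
  cp "$rc_file" "$rc_file.bak"
  { printf '%s\n' "$line"; cat "$rc_file.bak"; } > "$rc_file"
}

# Example: with no argument the helper edits ~/.bashrc.
# ensure_profile_sourced
```

Because the helper checks the first line before writing, it is safe to run again after the VM is reprovisioned or the file is edited.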

Open project

  • Launch Visual Studio and select File > Open > Project/Solution.

  • Select the examples\tensorflow folder from the downloaded samples repository.

  • Open the TensorflowExamples.sln file.

Add Azure Remote VM

In Server Explorer, right-click the Remote Machines node under the AI Tools node and select "Add…". Enter the remote machine's display name, IP address or host name, SSH port, user name, and password or key file.
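
Before adding the machine, it can help to confirm from a terminal that the same connection details work. The sketch below assumes a POSIX shell with an OpenSSH client on your workstation (for example WSL or Git Bash on Windows). The names check_remote_vm and DRY_RUN are hypothetical and exist only for this illustration. It opens a non-interactive session and prints the PATH that session sees, which should match your login PATH if the .bashrc change took effect.

```shell
# Sketch: test the SSH details you are about to enter in the dialog.
# check_remote_vm and DRY_RUN are hypothetical names, not part of any tool.
check_remote_vm() {
  host="$1"; user="$2"; port="${3:-22}"; key_file="${4:-}"

  if [ -z "$host" ] || [ -z "$user" ]; then
    echo "usage: check_remote_vm HOST USER [PORT] [KEY_FILE]" >&2
    return 2
  fi

  # Pass -i only when a key file was given; otherwise ssh asks for a password.
  if [ -n "$key_file" ]; then
    set -- -i "$key_file"
  else
    set --
  fi

  # With DRY_RUN set, the ssh command is printed instead of executed.
  # The remote command prints the PATH seen by a non-interactive session.
  ${DRY_RUN:+echo} ssh -p "$port" "$@" "$user@$host" 'echo "$PATH"'
}

# Example (placeholder address and user; prints the command without connecting):
# DRY_RUN=1 check_remote_vm 10.0.0.4 azureuser 22
```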

Submit job to Azure VM

Right-click the MNIST project in Solution Explorer and select Submit Job.

Job submission to a remote machine

In the submission window:

  • In the "Cluster to use" list, select the remote machine (with the "rm:" prefix) to submit the job to.

  • Enter a Job name.

  • Click Submit.

Job submission to Docker container

You can also submit jobs to run in a Docker container:

  • Select the "Run In Docker" check box.

  • Select the command used to run the Docker container: "Docker" or "NvidiaDocker".

  • Select a Docker image in the "Docker image" combo box, or enter your own image hosted on Docker Hub. The default Docker registry is Docker Hub.

  • Choose the identity used to run the Docker container on the remote machine. The default user is "root".
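
For comparison, the options above correspond roughly to the runner, image, and user of a manual container run. The sketch below is an assumption about what a comparable command line looks like, not the command the tool issues. The names run_in_docker and DRY_RUN are hypothetical, and the image and script in the example are placeholders.

```shell
# Sketch of a manual container run that mirrors the options above.
# Assumption: this is NOT taken from the tool; it only shows comparable CLI flags.
# run_in_docker and DRY_RUN are hypothetical names used for illustration.
run_in_docker() {
  if [ "$#" -lt 3 ]; then
    echo "usage: run_in_docker RUNNER IMAGE USER [COMMAND...]" >&2
    return 2
  fi
  runner="$1"; image="$2"; run_user="$3"
  shift 3

  # Accept only the two runners offered in the submission window.
  case "$runner" in
    docker|nvidia-docker) ;;
    *) echo "unknown runner: $runner" >&2; return 2 ;;
  esac

  # --rm removes the container when the command exits;
  # --user sets the identity used inside the container.
  # With DRY_RUN set, the command is printed instead of executed.
  ${DRY_RUN:+echo} "$runner" run --rm --user "$run_user" "$image" "$@"
}

# Example (prints the command only; image and script names are placeholders):
# DRY_RUN=1 run_in_docker nvidia-docker tensorflow/tensorflow:latest-gpu root python mnist.py
```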

Check status of job

To see the status and details of jobs, expand the virtual machine you submitted the job to in Server Explorer, then double-click Jobs.

Clean up resources (optional)

If you plan to use the VM again in the near future, stop it instead of deleting it. If you are finished with this tutorial, run the following command to delete the resource group and all resources in it:

az group delete --name myResourceGroup
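
Both cleanup options can be wrapped in one small helper. This is a minimal sketch, assuming the Azure CLI is installed and you are signed in. The names cleanup_resources and DRY_RUN are hypothetical, and the VM name is a placeholder you must supply. The pause action uses az vm deallocate, which stops the VM and releases its compute resources, whereas shutting the VM down from inside the operating system leaves it allocated.

```shell
# Sketch: pause the VM or delete the whole resource group.
# cleanup_resources and DRY_RUN are hypothetical names used for illustration.
cleanup_resources() {
  action="$1"; group="${2:-myResourceGroup}"; vm_name="${3:-}"

  case "$action" in
    pause)
      # Deallocating needs the VM name; there is no default for it.
      [ -n "$vm_name" ] || { echo "pause needs a VM name" >&2; return 2; }
      ${DRY_RUN:+echo} az vm deallocate --resource-group "$group" --name "$vm_name"
      ;;
    delete)
      # Deletes the group and everything in it; az asks for confirmation.
      ${DRY_RUN:+echo} az group delete --name "$group"
      ;;
    *)
      echo "usage: cleanup_resources pause|delete [GROUP] [VM_NAME]" >&2
      return 2
      ;;
  esac
}

# Examples (DRY_RUN=1 prints the command instead of running it; myVM is a placeholder):
# DRY_RUN=1 cleanup_resources pause myResourceGroup myVM
# DRY_RUN=1 cleanup_resources delete
```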