Modefy Quickstart pages of Orca (#3020)
* Modefied the quickstart page of Orca

* Add 'zoo' to 'analytics'

* Fix some small typoes

* Alignment
leonardozcm authored Nov 3, 2020
1 parent e92070b commit a5e0c40
Showing 3 changed files with 91 additions and 67 deletions.
68 changes: 41 additions & 27 deletions docs/docs/Orca/orca-pytorch-quickstart.md
@@ -1,22 +1,26 @@
## **Orca PyTorch Quickstart**

**In this guide we’ll show you how to organize your PyTorch code into Orca in 3 steps**
**In this guide we’ll show you how to organize your PyTorch code into Orca in 3 steps.**

Organizing your code with Orca makes your code:
* Keep all the flexibility
Scaling your Pytorch applications with Orca makes your code:

* Well-organized and flexible
* Easier to reproduce
* Utilize distributed training without changing your model
* Able to perform distributed training without changing your model

### **Step 0: Prepare environment**
We recommend you to use [Anaconda](https://www.anaconda.com/distribution/#linux) to prepare the environments, especially if you want to run on a yarn cluster(yarn-client mode only).
**Note:** You can install the latest analytics whl by following instructions ([here](https://analytics-zoo.github.io/master/#PythonUserGuide/install/#install-the-latest-nightly-build-wheels-for-pip)).
```
conda create -n zoo python=3.7 #zoo is conda environment name, you can set another name you like.
### **Step 0: Prepare Environment**
We recommend you to use [Anaconda](https://www.anaconda.com/distribution/#linux) to prepare the environments, especially if you want to run on a yarn cluster (yarn-client mode only).

Download and install latest analytics-zoo whl by the following instructions [here](../PythonUserGuide/install/#install-the-latest-nightly-build-wheels-for-pip).

**Note:** Conda environment is required to run on Yarn, but not strictly necessary for running on local.

```bash
conda create -n zoo python=3.7 # zoo is conda environment name, you can set another name you like.
conda activate zoo
pip install analytics-zoo==0.9.0.dev0 # or above
pip install analytics_zoo-${VERSION}-${TIMESTAMP}-py2.py3-none-${OS}_x86_64.whl
pip install jep==3.9.0
conda install pytorch torchvision cpuonly -c pytorch #command for linux
conda install pytorch torchvision -c pytorch #command for macOS
conda install pytorch torchvision cpuonly -c pytorch # command for linux
conda install pytorch torchvision -c pytorch # command for macOS
```
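
After running the installation commands above, it can help to confirm from Python that the packages are importable. The following is a minimal sketch using only the standard library. The import names `zoo`, `jep`, `torch` and `torchvision` are assumed to correspond to the packages installed in Step 0, and `check_modules` is a hypothetical helper, not part of Analytics Zoo.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed import names for the packages installed in Step 0.
status = check_modules(["zoo", "jep", "torch", "torchvision"])
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any entry reported as `MISSING` suggests that the conda environment is not activated or that the corresponding install step did not complete.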

### **Step 1: Init Orca Context**
@@ -34,8 +38,9 @@ sc = init_orca_context(
"spark.task.maxFailures": "1",
"spark.driver.extraJavaOptions": "-Dbigdl.failure.retryTimes=1"})
```
**Note:** you should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir`
* Reference: [Orca Context](https://analytics-zoo.github.io/master/#Orca/context/)
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir`.

View [Orca Context](./context) for more details.
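
Spark on YARN reads the cluster configuration from `HADOOP_CONF_DIR`, so a script could check the variable before choosing a cluster mode. The sketch below is one possible way to do that with the standard library only. `pick_cluster_mode` is a hypothetical helper, and falling back to `"local"` is an assumption made for illustration rather than documented Orca behavior.

```python
import os


def pick_cluster_mode(env=None):
    """Return "yarn-client" only if HADOOP_CONF_DIR points to an existing directory."""
    env = os.environ if env is None else env
    conf_dir = env.get("HADOOP_CONF_DIR", "")
    if conf_dir and os.path.isdir(conf_dir):
        return "yarn-client"
    return "local"


# The result could then be passed on as init_orca_context(cluster_mode=...).
print(pick_cluster_mode())
```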

### **Step 2: Define PyTorch Model, Loss function and Optimizer**
```python
@@ -64,42 +69,51 @@ class LeNet(nn.Module):
model = LeNet()
model.train()
criterion = nn.NLLLoss()
adam = Adam(args.lr)
adam = Adam(1e-4)
```
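
`nn.NLLLoss` expects log-probabilities as input, so it is normally paired with a model whose last layer is a log-softmax. The body of `LeNet` is collapsed in this diff, so that pairing is an assumption here. The plain-Python sketch below shows what the two pieces compute together. `log_softmax` and `nll_loss` are illustrative re-implementations, not the PyTorch functions themselves.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def nll_loss(log_probs_batch, targets):
    """Mean negative log-likelihood, matching the default 'mean' reduction."""
    losses = [-row[t] for row, t in zip(log_probs_batch, targets)]
    return sum(losses) / len(losses)


batch = [log_softmax([0.0] * 10), log_softmax([5.0] + [0.0] * 9)]
print(nll_loss(batch, [3, 0]))
```

For uniform logits over 10 classes the per-sample loss equals `log(10)`, which is a handy reference value when checking that an untrained classifier starts from a sensible loss.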

### **Step 3: Fit with Orca PyTorch Estimator**
1. Define the data in whatever way you want. Orca just needs a dataloader, a callable datacreator or an Orca SparkXShards
### **Step 3: Fit with Orca PyTorch Estimator**

1) Define the data in whatever way you want. Orca just needs a [Pytorch DataLoader](https://pytorch.org/docs/stable/data.html), a data creator function or [Orca SparkXShards](./data).
```python
torch.manual_seed(args.seed)
import torch
from torchvision import datasets, transforms

torch.manual_seed(0)
dir='./dataset'
batch_size=64
test_batch_size=64
train_loader = torch.utils.data.DataLoader(
datasets.MNIST(args.dir, train=True, download=True,
datasets.MNIST(dir, train=True, download=True,
transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))
])),
batch_size=args.batch_size, shuffle=True)
batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(
datasets.MNIST(args.dir, train=False,
datasets.MNIST(dir, train=False,
transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))
])),
batch_size=args.test_batch_size, shuffle=False)
batch_size=test_batch_size, shuffle=False)
```

2. Create an estimator
2) Create an Estimator
```python
from zoo.orca.learn.pytorch import Estimator

zoo_estimator = Estimator.from_torch(model=model, optimizer=adam, loss=criterion, backend="bigdl")
```

3. Fit with estimator
3) Fit with Estimator

```python
from zoo.orca.learn.metrics import Accuracy
from zoo.orca.learn.trigger import EveryEpoch
zoo_estimator.fit(data=train_loader, epochs=args.epochs, validation_data=test_loader,

zoo_estimator.fit(data=train_loader, epochs=10, validation_data=test_loader,
validation_methods=[Accuracy()], checkpoint_trigger=EveryEpoch())
```
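
The `transforms.Normalize((0.1307,), (0.3081,))` step used for both loaders in item 1 applies `(x - mean) / std` to each pixel after `ToTensor()` has scaled 8-bit values to the range [0, 1]. A minimal plain-Python sketch of that arithmetic follows. The helper names are hypothetical, and the constants are simply the values that appear in the code above.

```python
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081


def to_unit_range(pixel):
    """Scale an 8-bit pixel value (0..255) to [0, 1]."""
    return pixel / 255.0


def normalize(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Shift and scale a value so the dataset has roughly zero mean and unit variance."""
    return (value - mean) / std


row = [0, 128, 255]
print([round(normalize(to_unit_range(p)), 4) for p in row])
```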

**Note:** you should call `stop_orca_context()` when your application finishes.
**Note:** You should call `stop_orca_context()` when your application finishes.
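
One way to make sure the context is stopped even when training raises an exception is a `try`/`finally` wrapper. The sketch below uses stand-in callables so that it runs without Analytics Zoo installed. `run_with_cleanup` is a hypothetical helper, and in a real script `init` and `stop` would be `init_orca_context` and `stop_orca_context`.

```python
def run_with_cleanup(init, job, stop):
    """Call init(), then job(), and always call stop() afterwards."""
    init()
    try:
        return job()
    finally:
        stop()


# Stand-ins for init_orca_context / stop_orca_context so the sketch runs anywhere.
events = []
result = run_with_cleanup(
    init=lambda: events.append("init"),
    job=lambda: "trained",
    stop=lambda: events.append("stop"),
)
print(result, events)
```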
86 changes: 48 additions & 38 deletions docs/docs/Orca/orca-tf-quickstart.md
@@ -1,43 +1,47 @@
## **Orca TensorFlow Quickstart**

**In this guide we’ll show you how to organize your TensorFlow code into Orca in 3 steps**
**In this guide we’ll show you how to organize your TensorFlow code into Orca in 3 steps.**

Organizing your code with Orca makes your code:
* Keep all the flexibility
Scaling your TensorFlow applications with Orca makes your code:

* Well-organized and flexible
* Easier to reproduce
* Utilize distributed training without changing your model
* Able to perform distributed training without changing your model

### **Step 0: Prepare Environment**
We recommend you to use [Anaconda](https://www.anaconda.com/distribution/#linux) to prepare the environments, especially if you want to run on a yarn cluster (yarn-client mode only).

Download and install latest analytics-zoo whl by the following instructions [here](../PythonUserGuide/install/#install-the-latest-nightly-build-wheels-for-pip).

### **Step 0: Prepare environment**
Download and install latest analytics whl by following instructions ([here](https://analytics-zoo.github.io/master/#PythonUserGuide/install/#install-the-latest-nightly-build-wheels-for-pip)).
**Note:** Conda environment is required to run on Yarn, but not strictly necessary for running on local.

```bash
conda create -y -n analytics-zoo python==3.7.7
conda activate analytics-zoo
conda create -n zoo python=3.7 # zoo is conda environment name, you can set another name you like.
conda activate zoo
pip install analytics_zoo-${VERSION}-${TIMESTAMP}-py2.py3-none-${OS}_x86_64.whl
pip install tensorflow==1.15.0
pip install psutil
```

Note: conda environment is required to run on Yarn, but not strictly necessary for running on local.

### **Step 1: Init Orca Context**
```python
import tensorflow as tf
from zoo.orca import init_orca_context, stop_orca_context
from zoo.orca.learn.tf.estimator import Estimator

# run in local mode
init_orca_context(cluster_mode="local", cores=4)

# run in yarn client mode
init_orca_context(cluster_mode="yarn-client", num_nodes=2, cores=2, driver_memory="6g")
```
* Reference: [Orca Context](https://analytics-zoo.github.io/master/#Orca/context/)
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir`.

View [Orca Context](./context) for more details.
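
The two `init_orca_context` calls above are alternatives, and a script would normally run only one of them. As a rough sketch, the arguments could be assembled once from a mode setting. `orca_context_kwargs` is a hypothetical helper. It only uses the parameter names that appear in the example above and does not call Analytics Zoo itself.

```python
def orca_context_kwargs(mode, cores, num_nodes=1, driver_memory=None):
    """Build keyword arguments in the shape used by the two calls above."""
    if mode not in ("local", "yarn-client"):
        raise ValueError(f"unsupported cluster mode: {mode}")
    kwargs = {"cluster_mode": mode, "cores": cores}
    if mode == "yarn-client":
        # Cluster-only settings are added for YARN runs.
        kwargs["num_nodes"] = num_nodes
        if driver_memory is not None:
            kwargs["driver_memory"] = driver_memory
    return kwargs


local_kwargs = orca_context_kwargs("local", cores=4)
yarn_kwargs = orca_context_kwargs("yarn-client", cores=2, num_nodes=2, driver_memory="6g")
print(local_kwargs)
print(yarn_kwargs)
# Usage (requires analytics-zoo): init_orca_context(**yarn_kwargs)
```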

### **Step 2: Define Model, Loss Function and Metrics**

#### **For Keras Users**
* For Keras Users
```python
import tensorflow as tf

model = tf.keras.Sequential(
[tf.keras.layers.Conv2D(20, kernel_size=(5, 5), strides=(1, 1), activation='tanh',
input_shape=(28, 28, 1), padding='valid'),
@@ -56,8 +60,10 @@ model.compile(optimizer=tf.keras.optimizers.RMSprop(),
metrics=['accuracy'])
```

#### **For Graph Users**
* For Graph Users
```python
import tensorflow as tf

def accuracy(logits, labels):
predictions = tf.argmax(logits, axis=1, output_type=labels.dtype)
is_correct = tf.cast(tf.equal(predictions, labels), dtype=tf.float32)
@@ -80,14 +86,12 @@ images = tf.placeholder(dtype=tf.float32, shape=(None, 28, 28, 1))
labels = tf.placeholder(dtype=tf.int32, shape=(None,))

logits = lenet(images)

loss = tf.reduce_mean(tf.losses.sparse_softmax_cross_entropy(logits=logits, labels=labels))

acc = accuracy(logits, labels)
```
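
The visible part of the `accuracy` helper for graph users takes the arg-max of the logits and compares it with the labels. The rest of the function is collapsed in this diff, so the final averaging step in the sketch below is an assumption. The plain-Python version is an illustration of the metric, not the TensorFlow code.

```python
def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(logits_batch, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = [1.0 if argmax(row) == label else 0.0
               for row, label in zip(logits_batch, labels)]
    return sum(correct) / len(correct)


print(accuracy([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], [1, 1, 1]))
```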

### **Step 3: Fit with Orca TensorFlow Estimator**
1. Define the dataset in whatever way you want. Orca just needs tf.data.Dataset, Spark DataFrame or Orca SparkXShards.
1) Define the dataset in whatever way you want. Orca supports [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset), [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca SparkXShards](./data).
```python
def preprocess(x, y):
return tf.to_float(tf.reshape(x, (-1, 28, 28, 1))) / 255.0, y
@@ -103,43 +107,49 @@ val_dataset = tf.data.Dataset.from_tensor_slices((val_feature, val_label))
val_dataset = val_dataset.map(preprocess)
```

2. Create an estimator
2) Create an Estimator

* For Keras Users
```python
est = Estimator.from_keras(keras_model=model)
from zoo.orca.learn.tf.estimator import Estimator

zoo_estimator = Estimator.from_keras(keras_model=model)
```
* For Graph Users
```python
est = Estimator.from_graph(inputs=images,
outputs=logits,
labels=labels,
loss=loss,
optimizer=tf.train.AdamOptimizer(),
metrics={"acc": acc})
from zoo.orca.learn.tf.estimator import Estimator

zoo_estimator = Estimator.from_graph(inputs=images,
outputs=logits,
labels=labels,
loss=loss,
optimizer=tf.train.AdamOptimizer(),
metrics={"acc": acc})
```

3. Fit with estimator
3) Fit with Estimator
```python
est.fit(data=train_dataset,
batch_size=320,
epochs=max_epoch,
validation_data=val_dataset)
zoo_estimator.fit(data=train_dataset,
batch_size=320,
epochs=100,
validation_data=val_dataset)
```

4. Evaluate with estimator
4) Evaluate with Estimator
```python
result = est.evaluate(val_dataset)
result = zoo_estimator.evaluate(val_dataset)
print(result)
```

5. Save Model
5) Save Model

* For Keras Users
```python
est.save_keras_model("/tmp/mnist_keras.h5")
zoo_estimator.save_keras_model("/tmp/mnist_keras.h5")
```
* For Graph Users
```python
est.save_tf_checkpoint("/tmp/lenet/model")
zoo_estimator.save_tf_checkpoint("/tmp/lenet/model")
```
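
The `preprocess` function in item 1 reshapes each image to 28 x 28 x 1 and divides by 255.0 so that pixel values fall in [0, 1]. The sketch below shows the same transformation for a single flat image in plain Python. `preprocess_image` is a hypothetical stand-in for illustration and does not use TensorFlow.

```python
def preprocess_image(flat_pixels, height=28, width=28):
    """Reshape a flat list of 8-bit pixels to height x width x 1 and scale to [0, 1]."""
    if len(flat_pixels) != height * width:
        raise ValueError("unexpected number of pixels")
    return [
        [[flat_pixels[r * width + c] / 255.0] for c in range(width)]
        for r in range(height)
    ]


image = preprocess_image([255] * 784)
print(len(image), len(image[0]), len(image[0][0]), image[0][0][0])
```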

**Note:** you should call `stop_orca_context()` when your application finishes.
**Note:** You should call `stop_orca_context()` when your application finishes.
4 changes: 2 additions & 2 deletions docs/mkdocs.yml
@@ -164,10 +164,10 @@ pages:
- Overview: Orca/overview.md
- OrcaContext: Orca/context.md
- Data: Orca/data.md
- TensorFlow Estimator: Orca/orca-tf-estimator.md
- TensorFlow Quickstart: Orca/orca-tf-quickstart.md
- PyTorch Estimator: Orca/orca-pytorch-estimator.md
- TensorFlow Estimator: Orca/orca-tf-estimator.md
- PyTorch Quickstart: Orca/orca-pytorch-quickstart.md
- PyTorch Estimator: Orca/orca-pytorch-estimator.md
- Powered by: powered-by.md
- Presentations: presentations.md
- Meetup & Webinar: meetup.md