Skip to content

Latest commit

 

History

History
257 lines (204 loc) · 7.73 KB

03_hello_keras.md

File metadata and controls

257 lines (204 loc) · 7.73 KB

Hello Keras - A Simple Artificial Neural Network using Keras

Task

Copy relevant code from below to R/02_hello_keras.R and create a working model

Note how this is now a 3-class classifier and we are evaluating the predictive performance of the model on left out data.

Introduction

The aim of this exercise is to let you build your first artificial neural network using Keras. Naturally, we will use the iris data set.

Load libraries

library("tidyverse")
library("keras")

Data

The famous Iris flower data set contains data to quantify the morphologic variation of Iris flowers of three related species. In other words - A total of 150 observations of 4 input features Sepal.Length, Sepal.Width, Petal.Length and Petal.Width and 3 output classes setosa versicolor and virginica, with 50 observations in each class:

iris %>% head
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Aim

Our aim is to create a model, which connect the 4 input features (Sepal.Length, Sepal.Width, Petal.Length and Petal.Width) to the correct output class (setosa versicolor and virginica) using an artificial neural network. For this task, we have chosen the following simple architecture with one input layer with 4 neurons (one for each feature), one hidden layer with 4 neurons and one output layer with 3 neurons (one for each class), all fully connected:

Our artificial neural network will have a total of 35 parameters: 4 for each input neuron connected to the hidden layer, plus an additional 4 for the associated first bias neuron and 3 for each of the hidden neurons connected to the output layer, plus an additional 3 for the associated second bias neuron. I.e. 4 x 4 + 4 + 4 x 3 + 3 = 35

1. Prepare data

We start with slightly wrangling the iris data set by renaming the input features and converting character labels to numeric:

nn_dat <- iris %>%
  as_tibble %>%
  rename(sepal_l_feat = Sepal.Length,
         sepal_w_feat = Sepal.Width,
         petal_l_feat = Petal.Length,
         petal_w_feat = Petal.Width) %>%
  mutate(class_num = as.numeric(Species) - 1, # factor, so = 0, 1, 2
         class_label = Species)
nn_dat %>%
  head(3)
## # A tibble: 3 x 7
##   sepal_l_feat sepal_w_feat petal_l_feat petal_w_feat Species class_num
##          <dbl>        <dbl>        <dbl>        <dbl> <fct>       <dbl>
## 1          5.1          3.5          1.4          0.2 setosa          0
## 2          4.9          3            1.4          0.2 setosa          0
## 3          4.7          3.2          1.3          0.2 setosa          0
## # … with 1 more variable: class_label <fct>

Then, we split the iris data into a training and a test data set, setting aside 20% of the data for left out data partition, to be used for final performance evaluation:

test_f <- 0.20
nn_dat <- nn_dat %>%
  mutate(partition = sample(x = c("train","test"),
                            size = nrow(.),
                            replace = TRUE,
                            prob = c(1 - test_f, test_f)))
nn_dat %>%
  count(partition)
## # A tibble: 2 x 2
##   partition     n
##   <chr>     <int>
## 1 test         43
## 2 train       107

Based on the partition, we can now create training and test data

# Training data
x_train <- nn_dat %>%
  filter(partition == "train") %>%
  select(contains("feat")) %>%
  as.matrix
y_train <- nn_dat %>%
  filter(partition == "train") %>%
  pull(class_num) %>%
  to_categorical(3)

# Test data
x_test <- nn_dat %>%
  filter(partition == "test") %>%
  select(contains("feat")) %>%
  as.matrix
y_test <- nn_dat %>%
  filter(partition == "test") %>%
  pull(class_num) %>%
  to_categorical(3)

2. Define model

Set architecture (See the green ANN visualisation)

model = keras_model_sequential() %>% 
  layer_dense(units = 4, activation = "relu", input_shape = 4) %>% 
  layer_dense(units = 3, activation = "softmax")

Compile model

model %>%
  compile(loss = "categorical_crossentropy",
          optimizer = optimizer_adam(),
          metrics = c("accuracy")
)

We can get a summary of the model like so:

model %>%
  summary
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## dense (Dense)                       (None, 4)                       20          
## ________________________________________________________________________________
## dense_1 (Dense)                     (None, 3)                       15          
## ================================================================================
## Total params: 35
## Trainable params: 35
## Non-trainable params: 0
## ________________________________________________________________________________

As expected we see 35 trainable parameters.

Train the Artificial Neural Network

Lastly we fit the model and save the training progres in the history object:

history <- model %>%
  fit(x = x_train,
      y = y_train,
      epochs = 200,
      batch_size = 16,
      validation_split = 0
)

Once the model is trained, we can inspect the training process

plot(history)
## `geom_smooth()` using formula 'y ~ x'

Evaluate Network Performance

The final performance can be obtained like so:

perf <- model %>%
  evaluate(x_test, y_test)
perf
##      loss  accuracy 
## 0.2472026 0.9767442

Then we can augment the nn_dat for plotting:

plot_dat <- nn_dat %>%
  filter(partition == "test") %>%
  mutate(y_pred = predict_classes(model, x_test),
         Correct = ifelse(class_num == y_pred, "Yes", "No"))
plot_dat %>%
  select(-contains("feat")) %>%
  head(3)
## # A tibble: 3 x 6
##   Species class_num class_label partition y_pred Correct
##   <fct>       <dbl> <fct>       <chr>      <dbl> <chr>  
## 1 setosa          0 setosa      test           0 Yes    
## 2 setosa          0 setosa      test           0 Yes    
## 3 setosa          0 setosa      test           0 Yes

and lastly, we can visualise the confusion matrix like so:

plot_dat %>%
  ggplot(aes(x = factor(y_pred),
             y = factor(class_num),
             colour = Correct)) +
  geom_jitter() +
  scale_x_discrete(labels = levels(nn_dat$class_label)) +
  scale_y_discrete(labels = levels(nn_dat$class_label)) +
  theme_bw() +
  labs(title = "Classification Performance of Artificial Neural Network",
       subtitle = str_c("Accuracy = ", round(perf["accuracy"], 3) * 100, "%"),
       x = "Predicted iris class",
       y = "True iris class")

Conclusion

Here, we created a 3-class predictor with an accuracy of 97.7% on a left out data partition. I hope, this illustrates how relatively simple it is to get started with Keras.