
Mlxtend to support categorical data in plot_decision_regions #607

Closed
christianversloot opened this issue Oct 15, 2019 · 7 comments · Fixed by #608

christianversloot commented Oct 15, 2019

Hi there,

Thanks for your work! I'm happily using plot_decision_regions to visualize the decision boundary for my Keras models.

I'm currently experimenting with loss functions to get a feel for how they work. Currently, my (very simple) setup is as follows for testing how the Keras implementation of categorical_hinge (multiclass hinge loss) works:

  1. I generate a test dataset with scikit-learn's make_blobs containing three separable clusters, like this:

[image "3clust": scatter plot of the three generated clusters]

  2. I create a simple MLP architecture in Keras that computes loss by means of categorical_hinge. The model successfully learns to classify the test data into the correct clusters.

However, when plotting the decision boundaries with Mlxtend's plot_decision_regions, I run into this error:

Traceback (most recent call last):
  File "multiclass-hinge.py", line 56, in <module>
    plot_decision_regions(X_testing, Targets_testing_nonc, clf=model, legend=3)
  File "C:\Users\chris\Anaconda3\envs\tensorflow_gpu\lib\site-packages\mlxtend\plotting\decision_regions.py", line 231, in plot_decision_regions
    Z = Z.reshape(xx.shape)
ValueError: cannot reshape array of size 921600 into shape (480,640)

I believe it originates from the fact that my target data has to be one-hot encoded for Keras to apply categorical hinge loss. The arithmetic supports this: 921600 / 640 = 1440, and 1440 / 3 (the number of clusters, and hence the number of values per one-hot encoded target, e.g. [1 0 0]) = 480, exactly the requested grid height.

The problem emerges because the model clf used by Mlxtend itself produces one-hot encoded outputs, which Mlxtend apparently does not handle.
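The shape arithmetic above can be reproduced with plain NumPy. This is a minimal sketch (not code from the issue) assuming a 480x640 plotting grid and 3 classes, as in the traceback: a one-hot-style model output carries 3 scores per grid point, so it cannot be reshaped to the grid directly, while collapsing the class axis with argmax first can.

```python
import numpy as np

grid_shape = (480, 640)
num_classes = 3

# Simulated one-hot-style model output: one row of 3 class scores per grid point.
Z = np.random.rand(grid_shape[0] * grid_shape[1], num_classes)
print(Z.size)  # 921600, i.e. 480 * 640 * 3

try:
    Z.reshape(grid_shape)  # fails exactly as in the traceback
except ValueError as e:
    print(e)

# Collapsing the class axis first yields one integer label per grid point:
Z_grid = np.argmax(Z, axis=1).reshape(grid_shape)
print(Z_grid.shape)  # (480, 640)
```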

With this issue, I'd like to request support for categorical (one-hot encoded) data in the plot_decision_regions function. If my interpretation of this error is wrong, I'd appreciate finding out how I can make this visualization run.

Thanks very much!

rasbt (Owner) commented Oct 16, 2019

Glad to hear you found the function useful overall. Regarding the one-hot encoded outputs, I am wondering if the following workaround works:

class Onehot2Int(object):

    def __init__(self, model):
        self.model = model

    def predict(self, X):
        y_pred = self.model(X)
        return np.argmax(y_pred, axis=1)


# fit keras_model
keras_model_no_ohe = Onehot2Int(keras_model)
plot_decision_regions(X, y, keras_model_no_ohe)

We could maybe add an additional parameter to plot_decision_regions, but since it's already becoming a pretty complicated function, maybe adding this as an example to the docs would suffice (assuming the workaround actually works)? Let me know!

christianversloot (Author) commented Oct 16, 2019

This would work if Mlxtend didn't check the array shape, I guess :-)

Traceback (most recent call last):
  File "multiclass-hinge.py", line 67, in <module>
    plot_decision_regions(X_testing, Targets_testing, clf=keras_model_no_ohe, legend=3)
  File "C:\Users\chris\Anaconda3\envs\tensorflow_gpu\lib\site-packages\mlxtend\plotting\decision_regions.py", line 132, in plot_decision_regions
    check_Xy(X, y, y_int=True)  # Validate X and y arrays
  File "C:\Users\chris\Anaconda3\envs\tensorflow_gpu\lib\site-packages\mlxtend\utils\checking.py", line 33, in check_Xy
    raise ValueError('y must be a 1D array. Found %s' % str(y.shape))
ValueError: y must be a 1D array. Found (1000, 3)

Full code so you can test it yourself:

'''
  Keras model discussing Categorical (multiclass) Hinge loss.
'''
import keras
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_blobs
from mlxtend.plotting import plot_decision_regions
from keras.utils import to_categorical

# Configuration options
num_samples_total = 3000
training_split = 1000
num_classes = 3

# Generate data
X, targets = make_blobs(n_samples=num_samples_total, centers=[(0, 0), (15, 15), (0, 15)], n_features=2, center_box=(0, 1), cluster_std=1.5)
categorical_targets = to_categorical(targets)
X_training = X[training_split:, :]
X_testing = X[:training_split, :]
Targets_training = categorical_targets[training_split:]
Targets_testing = categorical_targets[:training_split].astype(int)

# Generate scatter plot for training data
plt.scatter(X_training[:,0], X_training[:,1])
plt.title('Three clusters ')
plt.xlabel('X1')
plt.ylabel('X2')
plt.show()

# Set the input shape
feature_vector_shape = len(X_training[0])
input_shape = (feature_vector_shape,)
loss_function_used = 'categorical_hinge'
print(f'Feature shape: {input_shape}')

# Create the model
model = Sequential()
model.add(Dense(4, input_shape=input_shape, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(2, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(num_classes, activation='tanh'))

# Configure the model and start training
model.compile(loss=loss_function_used, optimizer=keras.optimizers.Adam(lr=0.03), metrics=['accuracy'])
history = model.fit(X_training, Targets_training, epochs=1, batch_size=5, verbose=1, validation_split=0.2)

# Test the model after training
test_results = model.evaluate(X_testing, Targets_testing, verbose=1)
print(f'Test results - Loss: {test_results[0]} - Accuracy: {test_results[1]*100}%')

# Wrapper converting one-hot model outputs to integer class predictions
class Onehot2Int(object):

    def __init__(self, model):
        self.model = model

    def predict(self, X):
        y_pred = self.model(X)
        return np.argmax(y_pred, axis=1)

# Wrap the trained Keras model
keras_model_no_ohe = Onehot2Int(model)

# Plot decision boundary
plot_decision_regions(X_testing, Targets_testing, clf=keras_model_no_ohe, legend=3)
plt.show()

# Visualize training process
plt.plot(history.history['loss'], label='Categorical Hinge loss (training data)')
plt.plot(history.history['val_loss'], label='Categorical Hinge loss (validation data)')
plt.title('Categorical Hinge loss for three clusters')
plt.ylabel('Categorical Hinge loss value')
plt.yscale('log')
plt.xlabel('No. epoch')
plt.legend(loc="upper left")
plt.show()

rasbt (Owner) commented Oct 16, 2019

The class labels aren't used for model fitting, though; they are only used to assign the class labels in the plot. So you can simply pass the class label array in non-one-hot encoded form. For example:

plot_decision_regions(X_testing,  np.argmax(Targets_testing, axis=1), clf=keras_model_no_ohe, legend=3)
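For reference, here is a quick sketch (assuming one-hot targets of shape (1000, 3), matching the error message above) of what that np.argmax call does to the target array:

```python
import numpy as np

# Hypothetical one-hot targets with the shape from the ValueError above.
Targets_testing = np.eye(3)[np.random.randint(0, 3, size=1000)]
print(Targets_testing.shape)  # (1000, 3)

# Collapsing the class axis gives the 1D integer array that check_Xy expects.
y_int = np.argmax(Targets_testing, axis=1)
print(y_int.shape)            # (1000,)
```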

I think you also need to change

class Onehot2Int(object):

    def __init__(self, model):
        self.model = model

    def predict(self, X):
        y_pred = self.model(X)
        return np.argmax(y_pred, axis=1)

to

class Onehot2Int(object):

    def __init__(self, model):
        self.model = model

    def predict(self, X):
        y_pred = self.model.predict(X)
        return np.argmax(y_pred, axis=1)
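As a quick sanity check of the corrected wrapper (using a stub class standing in for the trained Keras model; StubModel is an illustration, not code from this thread), the predict method now returns a 1D integer label array:

```python
import numpy as np

class Onehot2Int(object):
    """Corrected wrapper: calls the wrapped model's predict method."""
    def __init__(self, model):
        self.model = model

    def predict(self, X):
        y_pred = self.model.predict(X)
        return np.argmax(y_pred, axis=1)

class StubModel:
    """Stands in for a trained Keras model: emits one-hot-style rows."""
    def predict(self, X):
        return np.eye(3)[np.arange(len(X)) % 3]

wrapped = Onehot2Int(StubModel())
labels = wrapped.predict(np.zeros((6, 2)))
print(labels)  # [0 1 2 0 1 2] -- 1D integer labels, as plot_decision_regions needs
```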

christianversloot (Author) commented Oct 16, 2019

This works beautifully! Quite a shame that I didn't spot the missing .predict call in the Onehot2Int class 🙈

[image "mh_boundary": plot_decision_regions output showing the learned decision boundary over the three clusters]

Thanks a lot!

Slight question: I publish my code (e.g. the code from my previous comment) on my website, where I dissect it into small pieces and explain what happens, so other folks interested in machine learning can learn from it. I also publish the code on my GitHub profile under a maximally permissive license (CC0). When publishing the code for categorical hinge, I'd like to include your solution for the Mlxtend plots, so that people can run the code once and get the results. Would you mind if I included your code in my GitHub repo and on my website? Obviously, I'll reference Mlxtend and this issue to credit your help.

Hope to hear from you! Thanks again 😎👍

christianversloot (Author):

Taking a look at how you licensed Mlxtend, I assumed you wouldn't mind as long as I referenced you properly.
See https://www.machinecurve.com/index.php/2019/10/17/how-to-use-categorical-multiclass-hinge-with-keras/#visualizing-the-decision-boundary for how I included your code.
Additionally, I included in-text references to this issue and Mlxtend, as well as references to your paper and the Mlxtend repository in the References list near the bottom.

Hope this is ok. If not, please let me know and I'll remove it.

rasbt (Owner) commented Oct 17, 2019

I am glad that it works! I will add an example to the documentation as well for future reference.

Also, thanks for asking about the code reuse. As you said, that'd be totally fine with me :). Nice post, btw!

christianversloot (Author):

Thanks twice! 😄
