Survey Analysis - Powered by Unsupervised Learning

Introduction

Analysing survey results can be challenging. Customers often have similar experiences, which leads groups of individuals to select the same survey responses. However, they won't all respond identically, as people care about different things. For example, some go to a restaurant for the atmosphere whilst others go primarily for the food. Clustering can help us gain a general consensus of our customers.

Clustering is an unsupervised machine learning method. Unsupervised techniques don't require labelled data (we humans don't need to teach the model directly). This notebook focuses on clustering customer survey responses from an airline. The same analysis could be applied to any survey that uses a numerical scale.

Methodology

Create a real-life scenario to make the randomly generated data more meaningful and easier to follow.

An airline company would like to analyze survey results at scale. This could help the company identify loyal customers and improve the retention of non-loyal customers.

1. Design the Survey

Please note that it's important for each response to be on the same scale. An airline has been collecting feedback from its customers through its app. The executive team at the airline needs to understand the responses in order to make improvements.

0 - Strongly Disagree
1 - Disagree
2 - Neutral
3 - Agree
4 - Strongly Agree

Lastly, you must include the fundamental question: "I would return to "ABC" in the future"
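As a minimal sketch of putting every response on the same scale, the labels above can be mapped to their numeric codes with pandas. The column names and sample answers here are hypothetical, since the actual survey questions are not listed in this README.

```python
import pandas as pd

# Numeric codes taken from the scale defined above.
LIKERT = {
    "Strongly Disagree": 0,
    "Disagree": 1,
    "Neutral": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}

# Hypothetical raw answers; the column names are illustrative only.
raw = pd.DataFrame({
    "staff_friendliness": ["Agree", "Neutral", "Strongly Agree"],
    "would_return": ["Strongly Agree", "Disagree", "Agree"],
})

# Map every column through the same dictionary so all questions share one scale.
encoded = raw.apply(lambda column: column.map(LIKERT))
```

Any label missing from the dictionary would become NaN, which makes unexpected answers easy to spot before clustering.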

2. Import the data

I randomly generated the data because I wanted to test whether this analysis would yield insights before carrying out an actual survey on a large number of people.
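One way such data might be generated is sketched below. It is not necessarily how the notebook does it: the question names, the number of respondents, and the uniform distribution are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical question names; the real survey questions are not listed here.
QUESTIONS = [
    "seat_comfort",
    "staff_friendliness",
    "on_time_departure",
    "value_for_money",
    "would_return",
]

def make_random_survey(n_respondents: int, seed: int = 0) -> pd.DataFrame:
    """Return one row per respondent with integer scores from 0 to 4."""
    rng = np.random.default_rng(seed)
    # integers() excludes the upper bound, so high=5 yields scores 0..4.
    scores = rng.integers(low=0, high=5, size=(n_respondents, len(QUESTIONS)))
    return pd.DataFrame(scores, columns=QUESTIONS)

survey = make_random_survey(n_respondents=500)
```

Uniformly random answers contain no real group structure, so a sketch like this is only useful for checking that the pipeline runs end to end.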

3. Apply the Elbow Technique to determine the appropriate number of clusters

To make this step dynamic in the future, we should write some code to automatically select a suitable number of clusters. Currently we plot a graph and manually select the best number of clusters. Instead, we would still plot the graph but write code to approximate the correct value of K, removing the manual intervention.
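A possible way to automate the choice of K is sketched below. It uses one simple heuristic (pick the point on the inertia curve farthest from the straight line joining its two ends), which is an assumption on my part rather than the method used in the notebook. The helper names `inertia_curve` and `pick_elbow` are hypothetical, and the data is a synthetic stand-in with three obvious groups rather than survey responses.

```python
import numpy as np
from sklearn.cluster import KMeans

def inertia_curve(X, k_values, seed=0):
    """Fit KMeans once per candidate k and collect the within-cluster sum of squares."""
    return np.array([
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ])

def pick_elbow(k_values, inertias):
    """Return the k whose point lies farthest from the chord joining the curve's endpoints."""
    k = np.asarray(k_values, dtype=float)
    y = np.asarray(inertias, dtype=float)
    # Rescale both axes to [0, 1] so neither dominates the distance.
    x_n = (k - k[0]) / (k[-1] - k[0])
    y_n = (y - y.min()) / (y.max() - y.min())
    # The chord runs from (0, y_n[0]) to (1, y_n[-1]).
    slope = y_n[-1] - y_n[0]
    distance = np.abs(slope * x_n - y_n + y_n[0]) / np.sqrt(slope**2 + 1)
    return int(k_values[int(np.argmax(distance))])

# Synthetic stand-in data: three well-separated groups of 100 points each.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centres])

k_values = list(range(1, 9))
inertias = inertia_curve(X, k_values)
best_k = pick_elbow(k_values, inertias)
```

On real survey data the elbow is often less sharp than in this toy example, so it is probably worth still plotting the curve to sanity-check the suggested value.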

4. Predict and attach the cluster for each person's survey
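A minimal sketch of this step is shown below, assuming the responses sit in a pandas DataFrame with one row per respondent. The column names, the random stand-in data, and the choice of three clusters are illustrative assumptions, not values taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Stand-in responses on the 0..4 scale; column names are hypothetical.
rng = np.random.default_rng(0)
survey = pd.DataFrame(
    rng.integers(0, 5, size=(200, 4)),
    columns=["q1", "q2", "q3", "would_return"],
)

# K=3 is an arbitrary choice here; in practice it would come from the elbow step.
model = KMeans(n_clusters=3, n_init=10, random_state=0)

# fit_predict learns the centroids and returns one cluster label per row.
survey_with_clusters = survey.assign(cluster=model.fit_predict(survey))

# Mean score per question within each cluster, useful for describing the groups.
profile = survey_with_clusters.groupby("cluster").mean()
```

The `profile` table gives one row per cluster, which is a convenient shape to feed into a dashboard.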

5. Create an Interactive Dashboard

An example dashboard I have created can be seen below:

Contents of the notebook

Import relevant packages
Creating the raw data (dataframe)
Unsupervised Learning (KMeans) - Elbow Method to determine K
Unsupervised Learning (KMeans) - Predict Cluster
Output clusters into dataframe
Visualise results
Export to PowerBI and Visualise results
Discussion
Conclusion

Getting Started

To view the notebook in your browser follow the link below: https://github.com/VirajVaitha123/Survey-Analysis---Powered-by-Unsupervised-Learning/blob/master/Customer%20Feedback%20Analytics%20-%20Unsupervised%20Learning.ipynb

Alternatively, to interact with the code please follow the steps below:

Step 1: Download required files

 git clone https://github.com/VirajVaitha123/Survey-Analysis---Powered-by-Unsupervised-Learning.git

Step 2: Create the virtual environment

Run the following command from the cloned repository directory to create the environment with the relevant dependencies:

conda env create -f DataScience.yml

Step 3: Access notebook in Jupyter Notebook

jupyter notebook

Open and edit the notebook

Alternatively, if the notebook does not render on GitHub, view it through nbviewer:

https://nbviewer.jupyter.org/

Copy and paste the notebook link below into the nbviewer website: https://github.com/VirajVaitha123/Survey-Analysis---Powered-by-Unsupervised-Learning/blob/master/Customer%20Feedback%20Analytics%20-%20Unsupervised%20Learning.ipynb

TO DO

The data is randomly generated and not representative of a real situation; this should be adjusted.
The question "Were there any delays?" is the one question where a positive score reflects negatively. Each question should be on a 0 = negative, 4 = positive scale.
Plotly box plot would look more attractive
Discussion and conclusion: there are no comments on my analysis, so readers would not be able to see the outcomes of the algorithm. It's important to add this!
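For the delays item in the list above, one way to align a negatively worded question with the others is to reverse-score it before clustering. The sketch below assumes the 0 to 4 scale from the survey design section; the column names and the helper `reverse_score` are hypothetical.

```python
import pandas as pd

# Top of the scale defined in the survey design section (0..4).
SCALE_MAX = 4

def reverse_score(scores: pd.Series, scale_max: int = SCALE_MAX) -> pd.Series:
    """Flip a score so that 0 becomes scale_max, 1 becomes scale_max - 1, and so on."""
    return scale_max - scores

# Hypothetical responses: a high "delays" score means the customer agreed there were delays.
responses = pd.DataFrame({"delays": [0, 4, 1], "comfort": [3, 2, 4]})

# After flipping, a high score means a good experience for every column.
responses["delays"] = reverse_score(responses["delays"])
```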
