Introduction

This is an MVP for the AAD Kin Mentorship Matching Algorithm

Resources

This setup guide requires you to have the following installed on your machine:

python3 -m venv venv

source venv/bin/activate # Linux/Ubuntu

pip install -r "requirements.txt"

data/ - This directory will hold the data for the matching algorithm.
processing/ - This directory will hold the code for processing the data.
matching_script/ - This directory will hold the code for the matching algorithm.

In the output of the matching algorithm script (pairs.csv), 'Mentor Preference Number' and 'Mentee Preference Number' have N/A values for anytime a preference is not in the top 15. Is there a way we can fix this so that AAD can get a better measure of how successful the algorithm was for each person?
It was noted earlier that less industries yielded higher accuracy (which makes sense). On the Microsoft form we can constrain the number of industries each person is allowed to select. This will 'artificially' increase the accuracy of the algorithm, but I think that it would be beneficial since it elminates outliers who might select a large number of industries
What else can be added to the pairs.csv output to give a measure of the algorithms success? How can we make that table look more presentable for AAD?
If not all 200 mentees and mentors answer the form, how can we handle that?

Name		Name	Last commit message	Last commit date
Latest commit History 82 Commits
config		config
data		data
matching_script		matching_script
src		src
tests		tests
.gitignore		.gitignore
DOCS.md		DOCS.md
README.md		README.md
REFACTOR.md		REFACTOR.md
main.py		main.py
requirements.txt		requirements.txt