Skip to content

Code of Fair Lloyd algorithm introduced in "Socially Fair π‘˜-Means Clustering" paper

Notifications You must be signed in to change notification settings

samirasamadi/SociallyFairKMeans

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

8 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

SociallyFairKMeans

Matlab implementation of the Fair Lloyd algorithm introduced in "Socially Fair π‘˜-Means Clustering" paper. Read the full paper at https://arxiv.org/pdf/2006.10085.pdf

Description of the folders and files:

  • Folder "Fair Lloyd": this folder contains all the main functions.

-- bSearch : computing fair centers using a binary search (section 2.2)

-- compCost : computing fair or vanilla k-means cost of a clustering (based on the value of isFair variable)

-- fairPCA: the fair PCA projection of the data to featureNum dimensions. For details of the fairPCA code, please look at https://github.com/samirasamadi/Fair-PCA

-- findCenters: given a clustering of the data, computers vanilla or fair centers (based on the value of isFair variable)

-- findClustering: given k centers, find the vanilla or fair clustering (based on the value of isFair variable)

-- giveRandCenters: initializes k random centers

-- kmeansCost_S_C: given a set of clusters and a set of arbitrary centers (centers might not be the average of clusters), calculate the kmeans cost

-- Lloyd: Lloyd algorithm. Choose the best among bestOutOf different center initializations. numIters is the number of iteartions for each center initialization.

-- LoadData: oads the datasets that we are using in our experiments. For description of the datasets please refer to section 4 (compasWB was ultimately not used, ignore it)

-- mainK: Lloyd's versus Fair Lloyd for number of clusters = k.

-- mw: multiplicative weight update algorithm. Used in fair PCA.

-- normalizeData: normalize the mean and standard deviation of the data

-- optApprox: the optimal rank d PCA approximation of matrix M. Used in fair PCA.

-- preProcessEductaionVector: converts the education vector for the credit dataset to high-educated and low-educated.

-- projectData: projects the high dimensional data into numFeatures dimension either using PCA or fair PCA. Referred to as pre-processing in section 4.

-- re: Calculate the reconstruction error of matrix Y with respect to matrix Z. Used in fair PCA.

  • Folder "plotsCode": this folder contains the specific codes for plotting the results.

About

Code of Fair Lloyd algorithm introduced in "Socially Fair π‘˜-Means Clustering" paper

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published