Document clustering algorithm based on TF-IDF
Dataset: the Reuters R52 dataset, available from https://www.cs.umb.edu/~smimarog/textmining/datasets/.
First, run createDocument_in_folders.java to create the documents in the AllDocumnets folder.
Then run kMeans.java to perform clustering on AllDocumnets.
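
For readers unfamiliar with the technique, the following is a minimal sketch of TF-IDF weighting and a k-means assignment step in Java. It is not the code in kMeans.java: the class name TfIdfSketch, its method names, the choice of log(N / df) for the IDF term, and the use of cosine similarity are all assumptions made for illustration, and the two tiny documents in main are made-up tokens, not Reuters data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of this repository.
class TfIdfSketch {

    // Term frequency: occurrences of each token divided by document length.
    static Map<String, Double> termFrequency(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1.0, Double::sum);
        }
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            e.setValue(e.getValue() / tokens.size());
        }
        return tf;
    }

    // Inverse document frequency, assumed here to be log(N / df).
    static Map<String, Double> inverseDocumentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String t : new HashSet<>(doc)) { // count each term once per document
                df.merge(t, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            idf.put(e.getKey(), Math.log((double) docs.size() / e.getValue()));
        }
        return idf;
    }

    // TF-IDF weight of every term in one document.
    static Map<String, Double> tfIdf(List<String> doc, Map<String, Double> idf) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Double> e : termFrequency(doc).entrySet()) {
            weights.put(e.getKey(), e.getValue() * idf.getOrDefault(e.getKey(), 0.0));
        }
        return weights;
    }

    // Cosine similarity between two sparse vectors; 0 if either is all zeros.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // k-means assignment step: index of the most similar centroid.
    static int nearestCentroid(Map<String, Double> doc, List<Map<String, Double>> centroids) {
        int best = 0;
        double bestSim = -1.0;
        for (int i = 0; i < centroids.size(); i++) {
            double sim = cosine(doc, centroids.get(i));
            if (sim > bestSim) {
                bestSim = sim;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up token lists standing in for two tokenized documents.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("oil", "price", "oil"),
                Arrays.asList("wheat", "price"));
        Map<String, Double> idf = inverseDocumentFrequency(docs);
        for (List<String> doc : docs) {
            System.out.println(tfIdf(doc, idf));
        }
    }
}
```

In this sketch a term that appears in every document gets an IDF of zero, so it contributes nothing to the similarity between documents. A full k-means run would alternate the assignment step with recomputing each centroid as the mean of its assigned vectors until the assignments stop changing.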