HOME

Welcome to the Team Andromeda wiki!

Goal

Our goal was to design a large-scale document classifier in Apache Spark that maximizes classification accuracy against a testing dataset. The training dataset consists of vsmall, small, and large sets, which range from 1 KB all the way to 1 GB. Using this dataset from Reuters, we train a Bayesian classifier to distinguish between the labels.
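To make the idea concrete, here is a minimal sketch of a Bayesian text classifier in plain Python. It makes several assumptions: it uses multinomial naive Bayes with add-one (Laplace) smoothing, which this page does not confirm as the exact variant we used; it runs on a single machine rather than in Spark; and the toy documents and the labels `finance` and `sports` are made up for illustration and are not taken from the Reuters data.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per label and words per label.

    docs is a list of (label, text) pairs. Whitespace tokenization is an
    assumption made for this sketch.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: fraction of training documents carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (label, word) pairs above zero.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not drawn from the Reuters sets.
TOY_DOCS = [
    ("finance", "stocks fell as markets closed"),
    ("finance", "bank raises interest rates"),
    ("sports", "team wins the final match"),
    ("sports", "striker scores in the match"),
]

model = train(TOY_DOCS)
print(predict(model, "markets and interest rates"))
```

In a Spark version, the per-label document and word counts would presumably be computed with distributed aggregations over the corpus, while the scoring logic in `predict` would stay essentially the same.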
