This is a collection of Scala projects to study text clustering and classification. The projects are:
- batch-cluster: A Scala executable that submits a job to a Spark installation.
- infop-expo: A Play Framework web application that resolves multiple data sets with the batch cluster.
To set up the development environment, install the following first:
- VirtualBox
- Vagrant
- Chef Development Kit (DK)
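Before starting the VM, it can help to confirm that the tools are on your PATH. The following is a minimal sketch, not part of the project: `check_tools` is a hypothetical helper, and the command names `VBoxManage`, `vagrant` and `chef` are assumed to be the ones installed by VirtualBox, Vagrant and Chef DK on your system.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed command names: VBoxManage (VirtualBox), vagrant (Vagrant), chef (Chef DK).
check_tools VBoxManage vagrant chef || echo "Install the missing tools before continuing."
```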
Create the development VM with:
vagrant up dev
This installs all the dependencies and data files. It might take a couple of hours, so get some coffee and something to read on the side.
Once the VM is up and running, restart it:
vagrant reload dev
To run the demo on AWS, you will need the private key of an AWS key pair. Either place the key file in the project root and rename it to infop.pem, or replace these lines in the Vagrantfile:
aws.keypair_name = "your_key_name"
override.ssh.private_key_path = "path/to/your/key.pem"
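If you take the first option, copying the key into place might look like the sketch below. `install_key` is a hypothetical helper, and the source path in the example is a placeholder. The `chmod 600` step is an addition based on the assumption that your SSH client, like OpenSSH, refuses private keys that other users can read.

```shell
# Copy a private key into the project root as infop.pem and restrict its permissions.
# $1: path to the existing key file (placeholder), $2: project root directory.
install_key() {
  src=$1
  project_root=$2
  [ -f "$src" ] || { echo "key not found: $src" >&2; return 1; }
  cp "$src" "$project_root/infop.pem" || return 1
  # Make the key readable and writable by the owner only.
  chmod 600 "$project_root/infop.pem"
}

# Example, run from the project root (the source path is a placeholder):
# install_key "$HOME/.ssh/your_key_name.pem" .
```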
Next, create the AWS instance with:
vagrant up awsdemo --provider aws
Once it is running, restart it:
vagrant reload awsdemo