Example application showing how to use the mongo-hadoop connector with Apache Spark.
Read more details at http://codeforhire.com/2014/02/18/using-spark-with-mongodb/
Prerequisites:
- MongoDB installed and running on localhost
- Scala 2.10 and SBT installed
To run the example, import the data into the database, run either JavaWordCount
or ScalaWordCount (both are listed below; one is enough),
and print the results:
mongoimport -d beowulf -c input beowulf.json
sbt 'run-main JavaWordCount'
sbt 'run-main ScalaWordCount'
mongo beowulf --eval 'printjson(db.output.find().toArray())' | less
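As a rough illustration of what a word count computes, here is a minimal plain-Java sketch. It is not the code of JavaWordCount or ScalaWordCount: it uses neither Spark nor the mongo-hadoop connector, and the class name WordCountSketch, the helper countWords, and the sample lines are all hypothetical. It assumes the jobs perform a conventional word count over whitespace-separated words, which may differ from how the real jobs tokenize the text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the word count logic; no Spark or MongoDB involved.
class WordCountSketch {

    // Counts how often each whitespace-separated word occurs across all lines.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // an empty line splits into a single empty token
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input, standing in for documents in the input collection.
        List<String> lines = Arrays.asList("the whale road", "the mead hall");
        countWords(lines).forEach((word, count) ->
                System.out.println(word + ": " + count));
    }
}
```

In the Spark versions, the same per-word tally would presumably be spread across a cluster, with the input read from the beowulf.input collection and the result written to beowulf.output, as the commands above suggest.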
The code itself is released into the public domain under Creative Commons CC0.
The example files are based on Beowulf from Project Gutenberg and are under its corresponding license.