mongo-spark

Example application showing how to use the mongo-hadoop connector with Apache Spark.

Read more details at http://codeforhire.com/2014/02/18/using-spark-with-mongodb/

Prerequisites

MongoDB installed and running on localhost
Scala 2.10 and SBT installed

Running

Import the data into the database, run either JavaWordCount or ScalaWordCount, then print the results.

mongoimport -d beowulf -c input beowulf.json
sbt 'run-main JavaWordCount'
sbt 'run-main ScalaWordCount'
mongo beowulf --eval 'printjson(db.output.find().toArray())' | less
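
The word-count step that both jobs perform can be sketched in plain Java, without Spark or MongoDB, to show what ends up in the output collection. This is only an illustration under assumptions: the class name WordCountSketch, the sample lines, and the tokenization rule (lowercase, strip basic punctuation, split on whitespace) are hypothetical and are not taken from the repository's sources.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the counting logic; the real jobs run on Spark RDDs.
class WordCountSketch {

    // Counts words across all lines, ignoring case and basic punctuation.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumed cleanup rule: lowercase, punctuation replaced by spaces.
            String cleaned = line.toLowerCase().replaceAll("[.,!?;:]", " ");
            for (String word : cleaned.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // an empty line yields a single empty token
                }
                // Equivalent of mapping each word to (word, 1) and summing by key.
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input, not text from beowulf.json.
        List<String> lines = Arrays.asList("The hall was bright.", "The hall was loud!");
        System.out.println(countWords(lines));
        // prints {bright=1, hall=2, loud=1, the=2, was=2}
    }
}
```

In the Spark jobs the same idea would be expressed as a flat map over the input documents followed by a reduce by key, with each (word, count) pair written back to MongoDB.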

License

The code itself is released into the public domain under Creative Commons CC0.

The example files are based on Beowulf from Project Gutenberg and are under its corresponding license.
