Study googlenet using PSGD with LMDB and hadoop parquet #146

Open · wants to merge 2 commits into base: `master`
14 changes: 12 additions & 2 deletions README.md
@@ -3,6 +3,16 @@ Distributed Neural Networks for Spark.
Details are available in the [paper](http://arxiv.org/abs/1511.06051).
Ask questions on the [sparknet-users mailing list](https://groups.google.com/forum/#!forum/sparknet-users)!

This study trains GoogLeNet Inception v1 with parallel SGD (PSGD), reading LMDB input split across any number of data partitions.

For further work on batched input, use the JavaCPP Caffe preset, version rc3-1.2.

Instructions for preparing the data in Parquet format are included in the code as commented-out sections.
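As a rough illustration of what such a preparation step can look like, here is a minimal sketch of writing (image bytes, label) records to Parquet with the Spark 1.6 API. The column names, paths, and the `labelFromPath` helper are assumptions for illustration, not the commented-out code from this PR.

```scala
// Hypothetical sketch, not the PR's actual code: pack raw image files
// plus an integer class label into a Parquet dataset with Spark 1.6.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PrepareParquet {
  // Placeholder: derive the class label from the file path in your own layout.
  def labelFromPath(path: String): Int = 0

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PrepareParquet"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // binaryFiles yields (path, PortableDataStream) pairs for each file.
    val records = sc.binaryFiles("hdfs:///imagenet/raw").map {
      case (path, stream) => (stream.toArray, labelFromPath(path))
    }

    // One row per image: raw bytes plus its label (column names assumed).
    records.toDF("data", "label")
      .write.parquet("hdfs://bdalab12:8020/imagenet/train.parquet")
  }
}
```

The same DataFrame can then be read back with `sqlContext.read.parquet(...)` and repartitioned to match the number of workers.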


You need a Redis server to handle assigning workers to the correct GPU devices. Just start a Redis server in Docker on the standard port.
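For example, one way to start it (the container name is arbitrary; any recent Redis image on the standard port 6379 should do):

```shell
# Run Redis in the background on the standard port 6379.
docker run -d --name sparknet-redis -p 6379:6379 redis
```

Then point the job at that host, e.g. via the `spark.yarn.appMasterEnv.Redis` setting used in `imnet.sh`.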

## Quick Start
**Start a Spark cluster using our AMI**

@@ -101,7 +111,7 @@ The specific instructions might depend on your cluster configurations, if you ru
```
cd ~/SparkNet
git pull
sbt assembly
```

4. Now you can for example run the CIFAR App as shown above.
@@ -160,7 +170,7 @@ The specific instructions might depend on your cluster configurations, if you ru
```
cd ~/SparkNet
git pull
sbt assembly
```
17. Create the file `~/.bash_profile` and add the following:

53 changes: 33 additions & 20 deletions build.sbt
@@ -4,23 +4,21 @@ assemblySettings

classpathTypes += "maven-plugin"

// resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
//resolvers += "javacpp" at "http://www.eecs.berkeley.edu/~rkn/snapshot-2016-03-05/"

resolvers += "javacpp" at "http://www.eecs.berkeley.edu/~rkn/snapshot-2016-03-05/"
libraryDependencies += "org.bytedeco" % "javacpp" % "1.2.1"

libraryDependencies += "org.bytedeco" % "javacpp" % "1.2-SPARKNET"
libraryDependencies += "org.bytedeco.javacpp-presets" % "caffe" % "master-1.2"

libraryDependencies += "org.bytedeco.javacpp-presets" % "caffe" % "master-1.2-SPARKNET"
libraryDependencies += "org.bytedeco.javacpp-presets" % "caffe" % "master-1.2" classifier "linux-x86_64"

libraryDependencies += "org.bytedeco.javacpp-presets" % "caffe" % "master-1.2-SPARKNET" classifier "linux-x86_64"
libraryDependencies += "org.bytedeco.javacpp-presets" % "opencv" % "3.1.0-1.2"

libraryDependencies += "org.bytedeco.javacpp-presets" % "opencv" % "3.1.0-1.2-SPARKNET"
libraryDependencies += "org.bytedeco.javacpp-presets" % "opencv" % "3.1.0-1.2" classifier "linux-x86_64"

libraryDependencies += "org.bytedeco.javacpp-presets" % "opencv" % "3.1.0-1.2-SPARKNET" classifier "linux-x86_64"
//libraryDependencies += "org.bytedeco.javacpp-presets" % "tensorflow" % "master-1.2-SPARKNET"

libraryDependencies += "org.bytedeco.javacpp-presets" % "tensorflow" % "master-1.2-SPARKNET"

libraryDependencies += "org.bytedeco.javacpp-presets" % "tensorflow" % "master-1.2-SPARKNET" classifier "linux-x86_64"
//libraryDependencies += "org.bytedeco.javacpp-presets" % "tensorflow" % "master-1.2-SPARKNET" classifier "linux-x86_64"

// libraryDependencies += "org.bytedeco" % "javacpp" % "1.1"

@@ -40,33 +38,48 @@ libraryDependencies += "org.bytedeco.javacpp-presets" % "tensorflow" % "master-1

libraryDependencies += "com.google.protobuf" % "protobuf-java" % "2.5.0"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided"

libraryDependencies += "com.databricks" % "spark-csv_2.11" % "1.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.1" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1" % "provided"

libraryDependencies += "net.java.dev.jna" % "jna" % "4.2.1"

libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.0" % "test"

libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.10.21"

libraryDependencies += "net.coobird" % "thumbnailator" % "0.4.2"

libraryDependencies ++= Seq("com.twelvemonkeys.imageio" % "imageio" % "3.1.2",
"com.twelvemonkeys.imageio" % "imageio-jpeg" % "3.1.2")
libraryDependencies += "com.twelvemonkeys.imageio" % "imageio-jpeg" % "3.1.2"

libraryDependencies += "com.twelvemonkeys.imageio" % "imageio-metadata" % "3.1.2"

libraryDependencies += "com.twelvemonkeys.imageio" % "imageio-core" % "3.1.2"

libraryDependencies += "com.twelvemonkeys.common" % "common-lang" % "3.1.2"

libraryDependencies += "com.netflix.curator" % "curator-framework" % "1.3.3"

libraryDependencies += "net.debasishg" %% "redisclient" % "3.0"

// the following is needed to make spark more compatible with amazon's aws package
dependencyOverrides ++= Set(
"com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
"com.fasterxml.jackson.core" % "jackson-databind" % "2.6.5",
"com.fasterxml.jackson.core" % "jackson-annotations" % "2.6.5"
)

// test in assembly := {}
test in assembly := {}

parallelExecution in test := false
// fork in test := true

/*
mergeStrategy in assembly := {
case x if x.startsWith("META-INF") => MergeStrategy.discard // Bumf
case x if x.endsWith(".html") => MergeStrategy.discard // More bumf
case x if x.contains("slf4j-api") => MergeStrategy.last
case x if x.contains("org/cyberneko/html") => MergeStrategy.first
case PathList("com", "esotericsoftware", xs@_ *) => MergeStrategy.last // For Log$Logger.class
case x =>
val oldStrategy = (mergeStrategy in assembly).value
oldStrategy(x)
}
*/
Binary file added imagenet.mean
Binary file not shown.
18 changes: 18 additions & 0 deletions imnet.sh
@@ -0,0 +1,18 @@
export SPARK_WORKER_INSTANCES=4
export SPARKNET_HOME=/data02/nhe/SparkNet
export DEVICES=1
spark-submit --master yarn --deploy-mode cluster \
--conf spark.yarn.appMasterEnv.SPARKNET_HOME=/data02/nhe/SparkNet \
--conf spark.yarn.appMasterEnv.Redis=bdalab12 \
--conf spark.yarn.appMasterEnv.GPU_HOSTS=bdalab12,bdalab13 \
--conf spark.yarn.max.executor.failures=100 \
--conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
--conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
--driver-memory 5g \
--executor-memory 100g \
--executor-cores 5 \
--driver-cores 4 \
--num-executors ${SPARK_WORKER_INSTANCES} \
--class apps.ImageNetApp \
sparknet-assembly-0.1-SNAPSHOT.jar \
4 hdfs://bdalab12:8020/imagenet 10 00 12000 128 false 125 false 40 2
Binary file added imnet11800.caffemodel
Binary file not shown.
Binary file added imnet11850-b.caffemodel
Binary file not shown.
Binary file added imnet350.caffemodel
Binary file not shown.