Running on Amazon Web Services
Create an Amazon Web Services account.
Clone the repository with submodules:
git clone --recursive https://github.com/saalfeldlab/stitching-spark.git
Build the package:
python build.py
This will generate a binary file target/stitching-spark-<version>-SNAPSHOT.jar. Upload this file to an S3 bucket.
In the EMR console select Create cluster. Set launch mode to Step execution.
Each step of the pipeline needs to be added as an execution step in the configuration. For each step of the pipeline set Step type to Spark application and click Configure.
In the pop-up dialog:
- Set spark-submit options to --class <class>. Each step of the stitching pipeline has an associated class (see below).
- In the Application location field, select the uploaded stitching-spark-<version>-SNAPSHOT.jar file.
- Specify the arguments for your application and click Add.
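The same three dialog fields (class, jar location, arguments) can also be written down as data. The following is a minimal sketch, not part of this repository: `build_spark_step` is a hypothetical helper, and the `--deploy-mode cluster` option and the `CANCEL_AND_WAIT` failure action are assumptions you may want to change. It only builds a dictionary and does not contact AWS.

```python
def build_spark_step(name, main_class, jar_location, app_args):
    """Describe one EMR step that runs spark-submit through command-runner.jar.

    Hypothetical helper: it mirrors the console dialog fields
    (spark-submit options, application location, arguments).
    """
    spark_submit = [
        "spark-submit",
        "--deploy-mode", "cluster",   # assumption: run the driver on the cluster
        "--class", main_class,        # the class associated with the pipeline step
        jar_location,                 # the uploaded jar in S3
        *app_args,                    # application arguments, e.g. -i <config>
    ]
    return {
        "Name": name,
        "ActionOnFailure": "CANCEL_AND_WAIT",  # assumption: stop later steps on failure
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": spark_submit},
    }


step = build_spark_step(
    name="stitch",
    main_class="org.janelia.stitching.StitchingSpark",
    jar_location="s3://<your-bucket>/<path>/stitching-spark-<version>-SNAPSHOT.jar",
    app_args=["--stitch", "-i", "s3://<your-bucket>/ch0-converted-n5.json"],
)
print(step["HadoopJarStep"]["Args"])
```

A dictionary of this shape is what boto3's EMR client accepts in the `Steps` list of `add_job_flow_steps`, should you prefer scripting over the console.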
The application requires an input file containing the registered tile configuration for each channel. It should be a JSON file formatted as follows:
[
  {
    "index" : 0,
    "file" : "FCF_CSMH__54383_20121206_35_C3_zb15_zt01_63X_0-0-0_R1_L086_20130108192758780.lsm.tif",
    "position" : [0.0, 0.0, 0.0],
    "size" : [991, 992, 880],
    "pixelResolution" : [0.097, 0.097, 0.18],
    "type" : "GRAY16"
  },
  {
    "index" : 1,
    "file" : "FCF_CSMH__54383_20121206_35_C3_zb15_zt01_63X_0-0-0_R1_L087_20130108192825183.lsm.tif",
    "position" : [716.932762003862, -694.0887500300357, -77.41783189603937],
    "size" : [991, 992, 953],
    "pixelResolution" : [0.097, 0.097, 0.18],
    "type" : "GRAY16"
  }
]
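Before uploading, it can help to sanity-check a configuration file locally. The sketch below is not the application's own validation: `check_tiles` is a hypothetical helper, the required keys are simply those appearing in the example above, and the shortened file names are placeholders.

```python
import json

# Keys taken from the example configuration; treating all of them as required is an assumption.
REQUIRED_KEYS = {"index", "file", "position", "size", "pixelResolution", "type"}
VECTOR_KEYS = ("position", "size", "pixelResolution")


def check_tiles(tiles):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    seen_indices = set()
    for n, tile in enumerate(tiles):
        missing = REQUIRED_KEYS - set(tile)
        if missing:
            problems.append(f"entry {n}: missing keys {sorted(missing)}")
            continue
        if tile["index"] in seen_indices:
            problems.append(f"entry {n}: duplicate index {tile['index']}")
        seen_indices.add(tile["index"])
        # position, size and pixelResolution should share one dimensionality
        if len({len(tile[key]) for key in VECTOR_KEYS}) != 1:
            problems.append(f"entry {n}: position, size and pixelResolution differ in length")
        if any(extent <= 0 for extent in tile["size"]):
            problems.append(f"entry {n}: size must be positive in every dimension")
    return problems


sample = json.loads("""
[
  {"index": 0, "file": "tile0.tif", "position": [0.0, 0.0, 0.0],
   "size": [991, 992, 880], "pixelResolution": [0.097, 0.097, 0.18], "type": "GRAY16"},
  {"index": 1, "file": "tile1.tif", "position": [716.9, -694.1, -77.4],
   "size": [991, 992, 953], "pixelResolution": [0.097, 0.097, 0.18], "type": "GRAY16"}
]
""")
print(check_tiles(sample))  # prints []
```

To check a real file, replace the inline sample with `json.load(open("ch0.json"))`.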
The tile images need to be uploaded to AWS S3 to be accessible from the AWS EMR cluster.
Run the provided script that uploads the tiles and corresponding configuration files to your bucket:
python startup-scripts/cloud/upload-tiles-n5.py -i ch0.json -i ch1.json -o s3://target-bucket/
To estimate the flatfields, submit a job with the following parameters:
- Main class: org.janelia.flatfield.FlatfieldCorrection
- Jar file: s3://<your-bucket>/<path>/stitching-spark-<version>-SNAPSHOT.jar
- Arguments: -i s3://<your-bucket>/ch0-converted-n5.json
This will create a folder named ch0-converted-n5-flatfield/ in the same bucket. After the application is finished, it will store two files there, S.tif and T.tif (the brightfield and the offset respectively).
The next steps will detect the flatfield folder and will automatically use the estimated flatfields for on-the-fly correction.
The full list of available parameters for the flatfield script is available here.
To stitch the tiles, submit a job with the following parameters:
- Main class: org.janelia.stitching.StitchingSpark
- Jar file: s3://<your-bucket>/<path>/stitching-spark-<version>-SNAPSHOT.jar
- Arguments: --stitch -i s3://<your-bucket>/ch0-converted-n5.json -i s3://<your-bucket>/ch1-converted-n5.json
This will run the stitching iteratively until the solution can no longer be improved. The multichannel data will be averaged on the fly before the pairwise shifts are computed, because the denser signal yields higher correlations.
As a result, it will create the files ch0-converted-n5-final.json and ch1-converted-n5-final.json next to the input tile configuration files.
It will also store a file named optimizer.txt containing statistics on the average and maximum errors, the number of tiles and edges retained in the final graph, and the cross-correlation and variance threshold values used to obtain the final solution.
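The output names mentioned in this walkthrough follow a simple suffix pattern on the input configuration name. The sketch below captures that pattern as inferred from the examples given here (ch0-converted-n5.json, ch0-converted-n5-flatfield/, ch0-converted-n5-final.json); the helper names are hypothetical and the pattern is an assumption rather than a documented contract.

```python
def _stem(config_path):
    """Strip the .json extension from a tile configuration path."""
    if not config_path.endswith(".json"):
        raise ValueError(f"expected a .json tile configuration, got {config_path!r}")
    return config_path[: -len(".json")]


def flatfield_folder(config_path):
    """Folder where the flatfield step is described as storing S.tif and T.tif."""
    return _stem(config_path) + "-flatfield/"


def final_config(config_path):
    """Configuration file the stitching step is described as writing next to its input."""
    return _stem(config_path) + "-final.json"


config = "s3://<your-bucket>/ch0-converted-n5.json"
print(flatfield_folder(config))  # s3://<your-bucket>/ch0-converted-n5-flatfield/
print(final_config(config))      # s3://<your-bucket>/ch0-converted-n5-final.json
```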
The current stitching method is iterative translation-based (improving the solution by building the prediction model).
The pipeline incorporating a higher-order model is currently under development in the split-tiles branch.
The full list of available parameters for the stitch script is available here.
To fuse and export the stitched tiles, submit a job with the following parameters:
- Main class: org.janelia.stitching.StitchingSpark
- Jar file: s3://<your-bucket>/<path>/stitching-spark-<version>-SNAPSHOT.jar
- Arguments: --fuse -i s3://<your-bucket>/ch0-converted-n5-final.json -i s3://<your-bucket>/ch1-converted-n5-final.json
This will generate an N5 export under the export.n5/ folder. The export is fully compatible with N5 Viewer for browsing.
The full list of available parameters for the export script is available here.
To convert the N5 export to slice TIFFs, submit a job with the following parameters:
- Main class: org.janelia.stitching.N5ToSliceTiffSpark
- Jar file: s3://<your-bucket>/<path>/stitching-spark-<version>-SNAPSHOT.jar
- Arguments: -i s3://<your-bucket>/export.n5
This will output a set of XY slices as TIFF images for each channel of the N5 export.
The full list of available parameters for this step is available here.
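Taken together, the jobs described above form an ordered plan: flatfield estimation, stitching, fusing, and slice export. The following is a rough sketch of that plan as data, using only the classes and arguments shown on this page. `plan_pipeline` is a hypothetical helper. Running flatfield estimation once per channel is an assumption, since only ch0 is shown above, and so is the export landing at the bucket root.

```python
FLATFIELD = "org.janelia.flatfield.FlatfieldCorrection"
STITCHING = "org.janelia.stitching.StitchingSpark"
SLICE_TIFF = "org.janelia.stitching.N5ToSliceTiffSpark"


def input_flags(paths):
    """Turn a list of paths into repeated '-i <path>' arguments."""
    flags = []
    for path in paths:
        flags += ["-i", path]
    return flags


def plan_pipeline(bucket, channels):
    """Return (main class, arguments) pairs in the order the steps should run."""
    configs = [f"{bucket}/{ch}-converted-n5.json" for ch in channels]
    finals = [f"{bucket}/{ch}-converted-n5-final.json" for ch in channels]
    plan = [(FLATFIELD, ["-i", config]) for config in configs]  # assumption: one run per channel
    plan.append((STITCHING, ["--stitch"] + input_flags(configs)))
    plan.append((STITCHING, ["--fuse"] + input_flags(finals)))
    plan.append((SLICE_TIFF, ["-i", f"{bucket}/export.n5"]))
    return plan


for main_class, args in plan_pipeline("s3://<your-bucket>", ["ch0", "ch1"]):
    print(main_class, " ".join(args))
```

Each pair maps onto one EMR step: the class goes into the spark-submit options as --class, and the arguments go into the step's argument field.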