diff --git a/CHANGELOG.md b/CHANGELOG.md index 58055fce18c..7075cbff5bd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,7 +8,7 @@ - PR #210 Expose degree calculation kernel via python API - PR #220 Added bindings for Nvgraph triangle counting - PR #234 Added bindings for renumbering, modify renumbering to use RMM - +- PR #250 Add local build script to mimic gpuCI ## Improvements - PR #157 Removed cudatoolkit dependency in setup.py @@ -21,8 +21,11 @@ - PR #215 Simplified get_rapids_dataset_root_dir(), set a default value for the root dir - PR #233 Added csv datasets and edited test to use cudf for reading graphs - PR #247 Added some documentation for renumbering +- PR #252 cpp test upgrades for more convenient testing on large input ## Bug Fixes +- PR #256 Add pip to the install, clean up conda instructions +- PR #253 Add rmm to conda configuration - PR #226 Bump cudf dependencies to 0.7 - PR #169 Disable terminal output in sssp - PR #191 Fix double upload bug diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index eea8dd3f19a..ca9b8301c56 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -49,5 +49,10 @@ contributing to. Start with _Step 3_ from above, commenting on the issue to let others know you are working on it. If you have any questions related to the implementation of the issue, ask them in the issue instead of the PR. +### Building and Testing on a gpuCI image locally + +Before submitting a pull request, you can do a local build and test on your machine that mimics our gpuCI environment using the `ci/local/build.sh` script. +For detailed information on usage of this script, see [here](ci/local/README.md). + ## Attribution Portions adopted from https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md diff --git a/README.md b/README.md index c5e389d85c8..ad996e7e89c 100644 --- a/README.md +++ b/README.md @@ -126,7 +126,6 @@ To install cuGraph from source, ensure the dependencies are met and follow the s 2) Create the conda development environment -​ A) Building the `master` branch uses the `cugraph_dev` environment ```bash # create the conda environment (assuming in base `cugraph` directory) @@ -145,22 +144,6 @@ conda deactivate -​ B) Create the conda development environment `cugraph_nightly` - -If you are on the latest development branch then you must use the `cugraph_nightly` environment. The latest cuGraph code uses the latest cuDF features that might not yet be in the master branch. To work off of the latest development branch, which could be unstable, use the nightly build environment. - -```bash -# create the conda environment (assuming in base `cugraph` directory) -conda env create --name cugraph_nightly --file conda/environments/cugraph_nightly.yml - -# activate the environment -conda activate cugraph_nightly - -``` - - - - - The environment can be updated as development includes/changes the dependencies. To do so, run: @@ -218,26 +201,48 @@ python setup.py install # install cugraph python bindings #### Run tests -6. Run either the standalone tests or the Python tests with datasets - - **C++ stand alone tests** +6. 
Run either the C++ or the Python tests with datasets
-    From the build directory :
+  - **Python tests with datasets**
    ```bash
-    # Run the cugraph tests
    cd $CUGRAPH_HOME
-    cd cpp/build
-    gtests/GDFGRAPH_TEST        # this is an executable file
+    cd python
+    pytest
    ```
+  - **C++ stand alone tests**
-  - **Python tests with datasets**
+    From the build directory:
    ```bash
+    # Run the cugraph tests
    cd $CUGRAPH_HOME
-    cd python
-    pytest
+    cd cpp/build
+    gtests/GDFGRAPH_TEST        # this is an executable file
    ```
-
+  - **C++ tests with larger datasets**
+
+    If you already have the datasets:
+
+    ```bash
+    export RAPIDS_DATASET_ROOT_DIR=<path_to_your_dir>
+    ```
+    If you do not have the datasets:
+
+    ```bash
+    cd $CUGRAPH_HOME/datasets
+    source get_test_data.sh # This takes about 10 minutes and downloads about 1 GB of data (>5 GB uncompressed)
+    ```
+
+    Run the C++ tests on large input:
+
+    ```bash
+    cd $CUGRAPH_HOME/cpp/build
+    # test one particular analytic (e.g. pagerank)
+    gtests/PAGERANK_TEST
+    # test everything
+    make test
+    ```
 Note: This conda installation only applies to Linux and Python versions 3.6/3.7.
@@ -322,4 +327,5 @@ The RAPIDS suite of open source software libraries aim to enable execution of en
 ### Apache Arrow on GPU
-The GPU version of [Apache Arrow](https://arrow.apache.org/) is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. As the name implies, cuDF uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Apache Arrow are supported.
+The GPU version of [Apache Arrow](https://arrow.apache.org/) is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. As the name implies, cuDF uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Apache Arrow are supported.
+
diff --git a/ci/local/README.md b/ci/local/README.md
new file mode 100644
index 00000000000..1a426ceea6d
--- /dev/null
+++ b/ci/local/README.md
@@ -0,0 +1,57 @@
+## Purpose
+
+This script is designed for developer and contributor use. It mimics the actions of gpuCI on your local machine, allowing you to test and even debug your code inside a gpuCI base container before pushing it as a GitHub commit.
+The script can be helpful in locally triaging and debugging RAPIDS continuous integration failures.
+
+## Requirements
+
+```
+nvidia-docker
+```
+
+## Usage
+
+```
+bash build.sh [-h] [-H] [-s] [-r <repo_dir>] [-i <image_name>]
+Build and test your local repository using a base gpuCI Docker image
+
+where:
+    -H   Show this help text
+    -r   Path to repository (defaults to working directory)
+    -i   Use Docker image (default is gpuci/rapidsai-base:cuda10.0-ubuntu16.04-gcc5-py3.6)
+    -s   Skip building and testing and start an interactive shell in a container of the Docker image
+```
+
+Example Usage: `bash build.sh -r ~/rapids/cugraph -i gpuci/cuda9.2-ubuntu16.04-gcc5-py3.6`
+
+For a full list of available gpuCI Docker images, visit our [DockerHub](https://hub.docker.com/r/gpuci/rapidsai-base/tags) page.
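For illustration, here is a minimal sketch of the interactive workflow behind the `-s` flag described above. The `~/rapids/cugraph` path is only an example, and sourcing `ci/gpu/build.sh` by hand simply mirrors what the generated container build script runs.

```bash
# Start an interactive shell in the default gpuCI container (skips the build and tests)
bash ci/local/build.sh -r ~/rapids/cugraph -s

# Inside the container, the repository is volume mounted under /rapids/
cd /rapids/cugraph

# Optionally kick off the same build and test run that gpuCI would perform
source ci/gpu/build.sh
```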
+
+Style Check:
+```bash
+$ bash ci/local/build.sh -r ~/rapids/cugraph -s
+$ source activate gdf # Activate gpuCI conda environment
+$ cd rapids
+$ flake8 python
+```
+
+## Information
+
+There are some caveats to be aware of when using this script, especially if you plan on developing from within the container itself.
+
+
+### Docker Image Build Repository
+
+The Docker image will generate build artifacts in a folder on your machine located in the root directory of the repository you passed to the script. For the above example, the directory is named `~/rapids/cugraph/build_rapidsai-base_cuda9.2-ubuntu16.04-gcc5-py3.6/`. Feel free to remove this directory after the script is finished.
+
+*Note*: The script *will not* overwrite your local build repository. Your local environment stays intact.
+
+
+### Where The User is Dumped
+
+The script will build your repository and run all tests. If any tests fail, it dumps the user into the Docker container itself to allow you to debug from within the container. If all the tests pass as expected, the container exits and is automatically removed. Remember to exit the container if tests fail and you do not wish to debug within the container itself.
+
+
+### Container File Structure
+
+Your repository will be located in the `/rapids/` folder of the container. This folder is volume mounted from the local machine. Any changes to the code in this repository are replicated onto the local machine. The `cpp/build` and `python/build` directories within your repository are on separate mounts to avoid conflicting with your local build artifacts.
diff --git a/ci/local/build.sh b/ci/local/build.sh
new file mode 100644
index 00000000000..2a1c4bf9bf5
--- /dev/null
+++ b/ci/local/build.sh
@@ -0,0 +1,104 @@
+#!/bin/bash
+
+DOCKER_IMAGE="gpuci/rapidsai-base:cuda10.0-ubuntu16.04-gcc5-py3.6"
+REPO_PATH=${PWD}
+RAPIDS_DIR_IN_CONTAINER="/rapids"
+CPP_BUILD_DIR="cpp/build"
+PYTHON_BUILD_DIR="python/build"
+CONTAINER_SHELL_ONLY=0
+
+SHORTHELP="$(basename $0) [-h] [-H] [-s] [-r <repo_dir>] [-i <image_name>]"
+LONGHELP="${SHORTHELP}
+Build and test your local repository using a base gpuCI Docker image
+
+where:
+    -H   Show this help text
+    -r   Path to repository (defaults to working directory)
+    -i   Use Docker image (default is ${DOCKER_IMAGE})
+    -s   Skip building and testing and start an interactive shell in a container of the Docker image
+"
+
+# Limit GPUs available to container based on CUDA_VISIBLE_DEVICES
+if [[ -z "${CUDA_VISIBLE_DEVICES}" ]]; then
+    NVIDIA_VISIBLE_DEVICES="all"
+else
+    NVIDIA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
+fi
+
+while getopts ":hHr:i:s" option; do
+    case ${option} in
+        r)
+            REPO_PATH=${OPTARG}
+            ;;
+        i)
+            DOCKER_IMAGE=${OPTARG}
+            ;;
+        s)
+            CONTAINER_SHELL_ONLY=1
+            ;;
+        h)
+            echo "${SHORTHELP}"
+            exit 0
+            ;;
+        H)
+            echo "${LONGHELP}"
+            exit 0
+            ;;
+        *)
+            echo "ERROR: Invalid flag"
+            echo "${SHORTHELP}"
+            exit 1
+            ;;
+    esac
+done
+
+REPO_PATH_IN_CONTAINER="${RAPIDS_DIR_IN_CONTAINER}/$(basename ${REPO_PATH})"
+CPP_BUILD_DIR_IN_CONTAINER="${RAPIDS_DIR_IN_CONTAINER}/$(basename ${REPO_PATH})/${CPP_BUILD_DIR}"
+PYTHON_BUILD_DIR_IN_CONTAINER="${RAPIDS_DIR_IN_CONTAINER}/$(basename ${REPO_PATH})/${PYTHON_BUILD_DIR}"
+
+
+# BASE_CONTAINER_BUILD_DIR is named after the image name, allowing for
+# multiple image builds to coexist on the local filesystem. This will
+# be mapped to the typical BUILD_DIR inside of the container.
Builds +# running in the container generate build artifacts just as they would +# in a bare-metal environment, and the host filesystem is able to +# maintain the host build in BUILD_DIR as well. +BASE_CONTAINER_BUILD_DIR=${REPO_PATH}/build_$(echo $(basename ${DOCKER_IMAGE})|sed -e 's/:/_/g') +CPP_CONTAINER_BUILD_DIR=${BASE_CONTAINER_BUILD_DIR}/cpp +PYTHON_CONTAINER_BUILD_DIR=${BASE_CONTAINER_BUILD_DIR}/python + + +BUILD_SCRIPT="#!/bin/bash +set -e +WORKSPACE=${REPO_PATH_IN_CONTAINER} +PREBUILD_SCRIPT=${REPO_PATH_IN_CONTAINER}/ci/gpu/prebuild.sh +BUILD_SCRIPT=${REPO_PATH_IN_CONTAINER}/ci/gpu/build.sh +cd ${WORKSPACE} +if [ -f \${PREBUILD_SCRIPT} ]; then + source \${PREBUILD_SCRIPT} +fi +yes | source \${BUILD_SCRIPT} +" + +if (( ${CONTAINER_SHELL_ONLY} == 0 )); then + COMMAND="${CPP_BUILD_DIR_IN_CONTAINER}/build.sh || bash" +else + COMMAND="bash" +fi + +# Create the build dir for the container to mount, generate the build script inside of it +mkdir -p ${BASE_CONTAINER_BUILD_DIR} +mkdir -p ${CPP_CONTAINER_BUILD_DIR} +mkdir -p ${PYTHON_CONTAINER_BUILD_DIR} +echo "${BUILD_SCRIPT}" > ${CPP_CONTAINER_BUILD_DIR}/build.sh +chmod ugo+x ${CPP_CONTAINER_BUILD_DIR}/build.sh + +# Run the generated build script in a container +docker pull ${DOCKER_IMAGE} +docker run --runtime=nvidia --rm -it -e NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES} \ + --user $(id -u):$(id -g) \ + -v ${REPO_PATH}:${REPO_PATH_IN_CONTAINER} \ + -v ${CPP_CONTAINER_BUILD_DIR}:${CPP_BUILD_DIR_IN_CONTAINER} \ + -v ${PYTHON_CONTAINER_BUILD_DIR}:${PYTHON_BUILD_DIR_IN_CONTAINER} \ + --cap-add=SYS_PTRACE \ + ${DOCKER_IMAGE} bash -c "${COMMAND}" diff --git a/conda/environments/cugraph_dev.yml b/conda/environments/cugraph_dev.yml index a725d1b44c1..f419fe6dc7e 100644 --- a/conda/environments/cugraph_dev.yml +++ b/conda/environments/cugraph_dev.yml @@ -1,25 +1,26 @@ name: cugraph_dev channels: -- nvidia -- rapidsai -- numba +- rapidsai/label/cuda9.2 +- nvidia/label/cuda9.2 +- rapidsai-nightly/label/cuda9.2 - conda-forge -- defaults dependencies: -- cudf>=0.6 +- cudf=0.7.* +- nvstrings=0.7.* +- rmm=0.7.* - scipy - networkx - python-louvain +- nccl - cudatoolkit - cmake>=3.12 - python>=3.6,<3.8 -- numba>=0.40 +- numba>=0.41 - pandas>=0.23.4 - pyarrow=0.12.1 - notebook>=0.5.0 - boost -- nvstrings>=0.3,<0.4 -- cffi>=1.10.0 +- cffi>=1.10.0 - distributed>=1.23.0 - cython>=0.29,<0.30 - pytest @@ -30,7 +31,6 @@ dependencies: - numpydoc - ipython - recommonmark -- pandoc=<2.0.0 +- pip - pip: - sphinx-markdown-tables - diff --git a/conda/environments/cugraph_dev_cuda10.yml b/conda/environments/cugraph_dev_cuda10.yml index e235f6c7bb1..7827798d27b 100644 --- a/conda/environments/cugraph_dev_cuda10.yml +++ b/conda/environments/cugraph_dev_cuda10.yml @@ -1,12 +1,13 @@ name: cugraph_dev channels: -- nvidia/label/cuda10.0 - rapidsai/label/cuda10.0 -- numba +- nvidia/label/cuda10.0 +- rapidsai-nightly/label/cuda10.0 - conda-forge -- defaults dependencies: -- cudf>=0.6 +- cudf=0.7.* +- nvstrings=0.7.* +- rmm=0.7.* - scipy - networkx - python-louvain @@ -14,12 +15,11 @@ dependencies: - cudatoolkit - cmake>=3.12 - python>=3.6,<3.8 -- numba>=0.40 +- numba>=0.41 - pandas>=0.23.4 - pyarrow=0.12.1 - notebook>=0.5.0 - boost -- nvstrings>=0.3,<0.4 - cffi>=1.10.0 - distributed>=1.23.0 - cython>=0.29,<0.30 @@ -31,5 +31,6 @@ dependencies: - numpydoc - ipython - recommonmark +- pip - pip: - - sphinx-markdown-tables \ No newline at end of file + - sphinx-markdown-tables diff --git a/conda/environments/cugraph_nightly.yml b/conda/environments/cugraph_nightly.yml deleted 
file mode 100644 index e63ddd5ed65..00000000000 --- a/conda/environments/cugraph_nightly.yml +++ /dev/null @@ -1,36 +0,0 @@ -name: cugraph_0.6 -channels: -- nvidia -- rapidsai-nightly -- rapidsai -- numba -- conda-forge -- defaults -dependencies: -- cudf=0.7.* -- scipy -- networkx -- python-louvain -- cudatoolkit -- cmake>=3.12 -- python>=3.6,<3.8 -- numba>=0.40 -- pandas>=0.23.4 -- pyarrow=0.12.1 -- notebook>=0.5.0 -- boost -- nvstrings>=0.3,<0.4 -- cffi>=1.10.0 -- distributed>=1.23.0 -- cython>=0.29,<0.30 -- pytest -- sphinx -- sphinx_rtd_theme -- sphinxcontrib-websupport -- nbsphinx -- numpydoc -- ipython -- recommonmark -- pandoc=<2.0.0 -- pip: - - sphinx-markdown-tables diff --git a/conda/environments/cugraph_nightly_cuda10.yml b/conda/environments/cugraph_nightly_cuda10.yml deleted file mode 100644 index 5923f7ac7c1..00000000000 --- a/conda/environments/cugraph_nightly_cuda10.yml +++ /dev/null @@ -1,35 +0,0 @@ -name: cugraph_0.6 -channels: -- nvidia/label/cuda10.0 -- rapidsai-nightly/label/cuda10.0 -- rapidsai/label/cuda10.0 -- numba -- conda-forge -- defaults -dependencies: -- cudf=0.7.* -- scipy -- networkx -- python-louvain -- cudatoolkit -- cmake>=3.12 -- python>=3.6,<3.8 -- numba>=0.40 -- pandas>=0.23.4 -- pyarrow=0.12.1 -- notebook>=0.5.0 -- boost -- nvstrings>=0.3,<0.4 -- cffi>=1.10.0 -- distributed>=1.23.0 -- cython>=0.29,<0.30 -- pytest -- sphinx -- sphinx_rtd_theme -- sphinxcontrib-websupport -- nbsphinx -- numpydoc -- ipython -- recommonmark -- pip: - - sphinx-markdown-tables diff --git a/cpp/src/tests/nvgraph_plugin/nvgraph_gdf_sssp.cpp b/cpp/src/tests/nvgraph_plugin/nvgraph_gdf_sssp.cpp index e448ab86789..ff412416c32 100644 --- a/cpp/src/tests/nvgraph_plugin/nvgraph_gdf_sssp.cpp +++ b/cpp/src/tests/nvgraph_plugin/nvgraph_gdf_sssp.cpp @@ -225,11 +225,11 @@ TEST_P(Tests_Sssp2, CheckFP32) { // --gtest_filter=*golden_test* INSTANTIATE_TEST_CASE_P(golden_test, Tests_Sssp2, - ::testing::Values( Sssp2_Usecase("/datasets/networks/karate.mtx" , "", 1) - ,Sssp2_Usecase("/datasets/golden_data/graphs/dblp.mtx" , "/datasets/golden_data/results/sssp/dblp_T.sssp_100000.bin", 100000) - ,Sssp2_Usecase("/datasets/golden_data/graphs/dblp.mtx" , "/datasets/golden_data/results/sssp/dblp_T.sssp_100.bin", 100) - ,Sssp2_Usecase("/datasets/golden_data/graphs/wiki2003.mtx" , "/datasets/golden_data/results/sssp/wiki2003_T.sssp_100000.bin",100000 ) - ,Sssp2_Usecase("/datasets/golden_data/graphs/wiki2003.mtx" , "/datasets/golden_data/results/sssp/wiki2003_T.sssp_100.bin", 100) + ::testing::Values( Sssp2_Usecase("test/datasets/karate.mtx" , "", 1) + ,Sssp2_Usecase("test/datasets/dblp.mtx" , "test/ref/sssp/dblp_T.sssp_100000.bin", 100000) + ,Sssp2_Usecase("test/datasets/dblp.mtx" , "test/ref/sssp/dblp_T.sssp_100.bin", 100) + ,Sssp2_Usecase("test/datasets/wiki2003.mtx" , "test/ref/sssp/wiki2003_T.sssp_100000.bin",100000 ) + ,Sssp2_Usecase("test/datasets/wiki2003.mtx" , "test/ref/sssp/wiki2003_T.sssp_100.bin", 100) ) ); int main(int argc, char **argv) { diff --git a/cpp/src/tests/pagerank/pagerank_test.cu b/cpp/src/tests/pagerank/pagerank_test.cu index a5493e22726..46c1150f292 100644 --- a/cpp/src/tests/pagerank/pagerank_test.cu +++ b/cpp/src/tests/pagerank/pagerank_test.cu @@ -198,24 +198,22 @@ TEST_P(Tests_Pagerank, CheckFP64) { // --gtest_filter=*simple_test* INSTANTIATE_TEST_CASE_P(simple_test, Tests_Pagerank, - ::testing::Values( Pagerank_Usecase("networks/karate.mtx", "") - ,Pagerank_Usecase("golden_data/graphs/cit-Patents.mtx", "golden_data/results/pagerank/cit-Patents.pagerank_val_0.85.bin") - 
,Pagerank_Usecase("golden_data/graphs/ljournal-2008.mtx", "golden_data/results/pagerank/ljournal-2008.pagerank_val_0.85.bin") - ,Pagerank_Usecase("golden_data/graphs/webbase-1M.mtx", "golden_data/results/pagerank/webbase-1M.pagerank_val_0.85.bin") - ,Pagerank_Usecase("golden_data/graphs/web-BerkStan.mtx", "golden_data/results/pagerank/web-BerkStan.pagerank_val_0.85.bin") - ,Pagerank_Usecase("golden_data/graphs/web-Google.mtx", "golden_data/results/pagerank/web-Google.pagerank_val_0.85.bin") - ,Pagerank_Usecase("golden_data/graphs/wiki-Talk.mtx", "golden_data/results/pagerank/wiki-Talk.pagerank_val_0.85.bin") - //,Pagerank_Usecase("bb_lt250m_4.mtx", "") - //,Pagerank_Usecase("bb_lt250m_3.mtx", "") - //,Pagerank_Usecase("caidaRouterLevel.mtx", "") - //,Pagerank_Usecase("citationCiteseer.mtx", "") - //,Pagerank_Usecase("coPapersDBLP.mtx", "") - //,Pagerank_Usecase("coPapersCiteseer.mtx", "") - //,Pagerank_Usecase("as-Skitter.mtx", "") - //,Pagerank_Usecase("hollywood.mtx", "") - //,Pagerank_Usecase("europe_osm.mtx", "") - //,Pagerank_Usecase("soc-LiveJournal1.mtx", "") - //,Pagerank_Usecase("twitter.mtx", "") + ::testing::Values( Pagerank_Usecase("test/datasets/karate.mtx", "") + ,Pagerank_Usecase("test/datasets/web-BerkStan.mtx", "test/ref/pagerank/web-BerkStan.pagerank_val_0.85.bin") + ,Pagerank_Usecase("test/datasets/web-Google.mtx", "test/ref/pagerank/web-Google.pagerank_val_0.85.bin") + ,Pagerank_Usecase("test/datasets/wiki-Talk.mtx", "test/ref/pagerank/wiki-Talk.pagerank_val_0.85.bin") + ,Pagerank_Usecase("test/datasets/cit-Patents.mtx", "test/ref/pagerank/cit-Patents.pagerank_val_0.85.bin") + ,Pagerank_Usecase("test/datasets/ljournal-2008.mtx","test/ref/pagerank/ljournal-2008.pagerank_val_0.85.bin") + ,Pagerank_Usecase("test/datasets/webbase-1M.mtx", "test/ref/pagerank/webbase-1M.pagerank_val_0.85.bin") + //,Pagerank_Usecase("test/datasets/caidaRouterLevel.mtx", "") + //,Pagerank_Usecase("test/datasets/citationCiteseer.mtx", "") + //,Pagerank_Usecase("test/datasets/coPapersDBLP.mtx", "") + //,Pagerank_Usecase("test/datasets/coPapersCiteseer.mtx", "") + //,Pagerank_Usecase("test/datasets/as-Skitter.mtx", "") + //,Pagerank_Usecase("test/datasets/hollywood.mtx", "") + //,Pagerank_Usecase("test/datasets/europe_osm.mtx", "") + //,Pagerank_Usecase("test/datasets/soc-LiveJournal1.mtx", "") + //,Pagerank_Usecase("benchmark/twitter.mtx", "") ) ); diff --git a/cpp/src/tests/snmg_spmv/snmg_spmv_test.cu b/cpp/src/tests/snmg_spmv/snmg_spmv_test.cu index 9d928828829..31a6c6ba43c 100644 --- a/cpp/src/tests/snmg_spmv/snmg_spmv_test.cu +++ b/cpp/src/tests/snmg_spmv/snmg_spmv_test.cu @@ -226,14 +226,14 @@ TEST_P(Tests_MGSpmv, CheckFP64) { } INSTANTIATE_TEST_CASE_P(mtx_test, Tests_MGSpmv, - ::testing::Values( MGSpmv_Usecase("networks/karate.mtx") - ,MGSpmv_Usecase("golden_data/graphs/cit-Patents.mtx") - ,MGSpmv_Usecase("golden_data/graphs/ljournal-2008.mtx") - ,MGSpmv_Usecase("golden_data/graphs/webbase-1M.mtx") - ,MGSpmv_Usecase("networks/netscience.mtx") - ,MGSpmv_Usecase("golden_data/graphs/web-Google.mtx") - ,MGSpmv_Usecase("golden_data/graphs/wiki-Talk.mtx") - //,MGSpmv_Usecase("networks/twitter.mtx") + ::testing::Values( MGSpmv_Usecase("test/datasets/karate.mtx") + ,MGSpmv_Usecase("test/datasets/netscience.mtx") + ,MGSpmv_Usecase("test/datasets/cit-Patents.mtx") + ,MGSpmv_Usecase("test/datasets/webbase-1M.mtx") + ,MGSpmv_Usecase("test/datasets/web-Google.mtx") + ,MGSpmv_Usecase("test/datasets/wiki-Talk.mtx") + //,MGSpmv_Usecase("test/datasets/ljournal-2008.mtx") + 
//,MGSpmv_Usecase("test/datasets/twitter.mtx")
                         )
                       );
@@ -391,14 +391,15 @@ TEST_P(Tests_MGSpmv_hibench, CheckFP32_hibench) {
  run_current_test(GetParam());
 }
-INSTANTIATE_TEST_CASE_P(hibench_test, Tests_MGSpmv_hibench,
-                        ::testing::Values(  MGSpmv_Usecase("1/Input-small/edges/part-00000")
-                                            //,MGSpmv_Usecase("1/Input-large/edges/part-00000")
-                                            //,MGSpmv_Usecase("1/Input-huge/edges/part-00000")
-                                            //,MGSpmv_Usecase("1/Input-gigantic/edges/part-00000")
-                                            ,MGSpmv_Usecase("1/Input-bigdatax2/edges/part-00000")
-                                            ,MGSpmv_Usecase("1/Input-bigdatax4/edges/part-00000")
-                                            //,MGSpmv_Usecase("1/Input-bigdata/edges/part-00000")
+
+INSTANTIATE_TEST_CASE_P(hibench_test, Tests_MGSpmv_hibench,
+                        ::testing::Values(  MGSpmv_Usecase("benchmark/hibench/1/Input-small/edges/part-00000")
+                                            ,MGSpmv_Usecase("benchmark/hibench/1/Input-large/edges/part-00000")
+                                            ,MGSpmv_Usecase("benchmark/hibench/1/Input-huge/edges/part-00000")
+                                            //,MGSpmv_Usecase("benchmark/hibench/1/Input-gigantic/edges/part-00000")
+                                            //,MGSpmv_Usecase("benchmark/hibench/1/Input-bigdata/edges/part-00000")
+                                            //,MGSpmv_Usecase("benchmark/hibench/1/Input-bigdatax2/edges/part-00000")
+                                            //,MGSpmv_Usecase("benchmark/hibench/1/Input-bigdatax4/edges/part-00000")
                         )
                       );
diff --git a/datasets/README.md b/datasets/README.md
new file mode 100644
index 00000000000..cbad8a8a3e8
--- /dev/null
+++ b/datasets/README.md
@@ -0,0 +1,34 @@
+# cuGraph test data
+
+
+## Python
+
+This directory contains small public datasets in `mtx` and `csv` format used by cuGraph's Python tests. Graph details:
+
+| Graph      | V     | E     | Directed | Weighted |
+| ---------- | ----- | ----- | -------- | -------- |
+| karate     | 34    | 156   | No       | No       |
+| dolphin    | 62    | 318   | No       | No       |
+| netscience | 1,589 | 5,484 | No       | Yes      |
+
+
+**karate**: The graph "karate" contains the network of friendships between the 34 members of a karate club at a US university, as described by Wayne Zachary in 1977.
+
+**dolphin**: The graph "dolphin" contains an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand, as compiled by Lusseau et al. (2003).
+
+**netscience**: The graph "netscience" contains a coauthorship network of scientists working on network theory and experiment, as compiled by M. Newman in May 2006.
+
+
+## C++
+cuGraph's C++ analytics tests need larger datasets (>5GB uncompressed) and reference results (>125MB uncompressed). They can be downloaded using the provided script.
+```
+source get_test_data.sh
+```
+You may run this script from elsewhere and store the C++ test input in another location.
+
+Before running the tests, you should let cuGraph know where to find the test input by using:
+```
+export RAPIDS_DATASET_ROOT_DIR=<path_to_your_dir>
+```
+## Reference
+The SuiteSparse Matrix Collection (formerly the University of Florida Sparse Matrix Collection): https://sparse.tamu.edu/
diff --git a/datasets/get_test_data.sh b/datasets/get_test_data.sh
new file mode 100644
index 00000000000..a416baf7256
--- /dev/null
+++ b/datasets/get_test_data.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+echo Downloading ...
+mkdir tmp
+cd tmp
+wget https://s3.us-east-2.amazonaws.com/rapidsai-data/cugraph/test/datasets.tgz
+wget https://s3.us-east-2.amazonaws.com/rapidsai-data/cugraph/test/ref/pagerank.tgz
+wget https://s3.us-east-2.amazonaws.com/rapidsai-data/cugraph/test/ref/sssp.tgz
+cd ..
+
+mkdir test
+mkdir test/ref
+
+echo Decompressing ...
+
+tar xvzf tmp/datasets.tgz -C test
+tar xvzf tmp/pagerank.tgz -C test/ref
+tar xvzf tmp/sssp.tgz -C test/ref
+
+rm -rf tmp
+
+export RAPIDS_DATASET_ROOT_DIR=$PWD
+echo RAPIDS_DATASET_ROOT_DIR was set to $PWD
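Taken together, the pieces added in this change support the following end-to-end local workflow. This is a minimal sketch that assumes `$CUGRAPH_HOME` points at your cuGraph checkout (as in the README above) and that the C++ targets have already been built under `cpp/build`.

```bash
# Fetch the large C++ test datasets and reference results (about 10 minutes, >5 GB uncompressed)
cd $CUGRAPH_HOME/datasets
source get_test_data.sh        # sourcing also exports RAPIDS_DATASET_ROOT_DIR for this shell

# Run a single C++ analytic test against the downloaded data
cd $CUGRAPH_HOME/cpp/build
gtests/PAGERANK_TEST

# Or run the full C++ test suite
make test

# Before opening a PR, repeat the build and tests inside a gpuCI container
bash $CUGRAPH_HOME/ci/local/build.sh -r $CUGRAPH_HOME
```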