Merge branch 'branch-0.15' of https://github.com/rapidsai/cuml into fea-ext-clang-tidy
teju85 committed May 24, 2020
2 parents 6c88a6d + 867ac72 commit 440aa31
Showing 66 changed files with 2,154 additions and 554 deletions.
32 changes: 31 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,18 @@
# cuML 0.15.0 (Date TBD)

## New Features

## Improvements

- PR #2310: Pinning ucx-py to 0.14 to make 0.15 CI pass

## Bug Fixes

# cuML 0.14.0 (Date TBD)

## New Features
- PR #1994: Support for distributed OneHotEncoder
- PR #1892: One hot encoder implementation with cupy
- PR #1655: Adds python bindings for homogeneity score
- PR #1704: Adds python bindings for completeness score
- PR #1687: Adds python bindings for mutual info score
@@ -13,6 +25,7 @@
- PR #2074: SG and MNMG `make_classification`
- PR #2127: Added order to SG `make_blobs`, and switch from C++ to cupy based implementation
- PR #2057: Weighted k-means
- PR #2256: Add a `make_arima` generator
- PR #2245: ElasticNet, Lasso and Coordinate Descent MNMG
- PR #2242: Pandas input support with output as NumPy arrays by default

@@ -93,8 +106,17 @@
- PR #2225: input_to_cuml_array keep order option, test updates and cleanup
- PR #2244: Re-enable slow ARIMA tests as stress tests
- PR #2231: Using OPG structs from `cuml.common` in decomposition algorithms
- PR #2257: Update QN and LogisticRegression to use CumlArray
- PR #2259: Add CumlArray support to Naive Bayes
- PR #2252: Add benchmark for the Gram matrix prims
- PR #2264: Reduce build time for cuML by using make_blobs from libcuml++ interface
- PR #2269: Add docs targets to build.sh and fix python cuml.common docs
- PR #2271: Clarify doc for `_unique` default implementation in OneHotEncoder
- PR #2272: Add docs build.sh script to repository
- PR #2276: Ensure `CumlArray` provided `dtype` conforms
- PR #2281: Rely on cuDF's `Serializable` in `CumlArray`
- PR #2284: Reduce dataset size in SG RF notebook to reduce run time of sklearn
- PR #2285: Increase the threshold for elastic_net test in dask/test_coordinate_descent

## Bug Fixes
- PR #1939: Fix syntax error in cuml.common.array
@@ -132,13 +154,22 @@
- PR #2183: Fix RAFT in nightly package
- PR #2191: Fix placement of SVM parameter documentation and add examples
- PR #2212: Fix DBScan results (no propagation of labels through border points)
- PR #2215: Fix the printing of forest object
- PR #2217: Fix opg_utils naming to fix singlegpu build
- PR #2223: Fix bug in ARIMA C++ benchmark
- PR #2224: Temporary fix for CI until new Dask version is released
- PR #2228: Update to use __reduce_ex__ in CumlArray to override cudf.Buffer
- PR #2249: Fix bug in UMAP continuous target metrics
- PR #2258: Fix doxygen build break
- PR #2255: Set random_state for train_test_split function in dask RF
- PR #2275: Fix RF fit memory leak
- PR #2274: Fix parameter name verbose to verbosity in mnmg OneHotEncoder
- PR #2277: Updated cub repo path and branch name
- PR #2282: Fix memory leak in Dask RF concatenation
- PR #2301: Scaling KNN dask tests sample size with n GPUs
- PR #2293: Contiguity fixes for input_to_cuml_array and train_test_split
- PR #2295: Fix convert_to_dtype copy even with same dtype
- PR #2305: Fixed race condition in DBScan

# cuML 0.13.0 (Date TBD)

@@ -149,7 +180,6 @@
- PR #1766: Mean absolute error implementation with cupy
- PR #1766: Mean squared log error implementation with cupy
- PR #1635: cuML Array shim and configurable output added to cluster methods
- PR #1892: One hot encoder implementation with cupy
- PR #1586: Seasonal ARIMA
- PR #1683: cuml.dask make_regression
- PR #1689: Add framework for cuML Dask serializers
18 changes: 15 additions & 3 deletions build.sh
@@ -18,7 +18,7 @@ ARGS=$*
# script, and that this script resides in the repo dir!
REPODIR=$(cd $(dirname $0); pwd)

-VALIDARGS="clean libcuml cuml prims bench prims-bench -v -g -n --allgpuarch --singlegpu --nvtx --show_depr_warn -h --help"
+VALIDARGS="clean libcuml cuml prims bench prims-bench cppdocs pydocs -v -g -n --allgpuarch --singlegpu --nvtx --show_depr_warn -h --help"
HELP="$0 [<target> ...] [<flag> ...]
where <target> is:
clean - remove all existing build artifacts and configuration (start over)
@@ -28,6 +28,8 @@ HELP="$0 [<target> ...] [<flag> ...]
prims - build the ML prims tests
bench - build the cuml C++ benchmark
prims-bench - build the ml-prims C++ benchmark
+cppdocs - build the C++ API doxygen documentation
+pydocs - build the general and Python API documentation
and <flag> is:
-v - verbose build mode
-g - build for debug
@@ -129,7 +131,7 @@ fi

################################################################################
# Configure for building all C++ targets
-if (( ${NUMARGS} == 0 )) || hasArg libcuml || hasArg prims || hasArg bench || hasArg prims-bench; then
+if (( ${NUMARGS} == 0 )) || hasArg libcuml || hasArg prims || hasArg bench || hasArg prims-bench || hasArg cppdocs; then
if (( ${BUILD_ALL_GPU_ARCH} == 0 )); then
GPU_ARCH=""
echo "Building for the architecture of the GPU in the system..."
@@ -184,9 +186,14 @@ if (( ${NUMARGS} == 0 )) || hasArg libcuml || hasArg prims || hasArg bench; then

fi

+if hasArg cppdocs; then
+cd ${LIBCUML_BUILD_DIR}
+make doc
+fi
+

# Build and (optionally) install the cuml Python package
-if (( ${NUMARGS} == 0 )) || hasArg cuml; then
+if (( ${NUMARGS} == 0 )) || hasArg cuml || hasArg pydocs; then

cd ${REPODIR}/python
if [[ ${INSTALL_TARGET} != "" ]]; then
@@ -195,4 +202,9 @@ if (( ${NUMARGS} == 0 )) || hasArg cuml; then
else
python setup.py build_ext -j${PARALLEL_LEVEL:-1} --inplace --library-dir=${LIBCUML_BUILD_DIR} ${SINGLEGPU}
fi
+
+if hasArg pydocs; then
+cd ${REPODIR}/docs
+make html
+fi
fi
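
The build.sh hunks above gate the new `cppdocs` and `pydocs` targets behind `hasArg` checks. A minimal sketch of that dispatch pattern follows, under stated assumptions: `has_arg` and `plan_steps` are hypothetical helpers written for illustration (they are not the repository's `hasArg`), the step names are placeholders, and only a subset of the targets is modeled. The sketch prints the steps it would run instead of invoking `make` or `setup.py`.

```shell
#!/bin/sh
# Illustrative sketch only: helper names and step names are hypothetical.

# Succeed if the word given as $1 appears among the remaining arguments.
has_arg() {
    needle=$1
    shift
    for arg in "$@"; do
        if [ "$arg" = "$needle" ]; then
            return 0
        fi
    done
    return 1
}

# Print the steps that a given list of targets would trigger.
plan_steps() {
    steps=""
    # No arguments means "build the default targets".
    if [ "$#" -eq 0 ] || has_arg libcuml "$@" || has_arg cppdocs "$@"; then
        steps="$steps configure-cpp"
    fi
    # The doxygen docs need the C++ build directory to be configured first.
    if has_arg cppdocs "$@"; then
        steps="$steps make-doc"
    fi
    # The Sphinx docs import the Python package, so pydocs also builds it.
    if [ "$#" -eq 0 ] || has_arg cuml "$@" || has_arg pydocs "$@"; then
        steps="$steps build-python"
    fi
    if has_arg pydocs "$@"; then
        steps="$steps make-html"
    fi
    # Strip the leading space before printing.
    echo "${steps# }"
}

plan_steps cppdocs
plan_steps pydocs
```

The point of the design is that a docs target implies its prerequisite: asking for `cppdocs` alone still configures the C++ build, and asking for `pydocs` alone still builds the Python extension before running Sphinx.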
105 changes: 49 additions & 56 deletions ci/docs/build.sh
@@ -1,37 +1,22 @@
#!/bin/bash
# Copyright (c) 2018-2020, NVIDIA CORPORATION.
#########################################
# cuML GPU build and test script for CI #
#########################################
set -ex

# Logger function for build status output
function logger() {
echo -e "\n>>>> $@\n"
}

# Set path and build parallel level
# Copyright (c) 2020, NVIDIA CORPORATION.
#################################
# cuML Docs build script for CI #
#################################

if [ -z "$PROJECT_WORKSPACE" ]; then
echo ">>>> ERROR: Could not detect PROJECT_WORKSPACE in environment"
echo ">>>> WARNING: This script contains git commands meant for automated building, do not run locally"
exit 1
fi

export DOCS_WORKSPACE=$WORKSPACE/docs
export PATH=/conda/bin:/usr/local/cuda/bin:$PATH
export PARALLEL_LEVEL=4
export CUDA_REL=${CUDA_VERSION%.*}
export CUDF_VERSION=0.8.*
export RMM_VERSION=0.8.*

# Set home to the job's workspace
export HOME=$WORKSPACE
export DOCS_DIR=/data/docs/html

while getopts "d" option; do
case ${option} in
d)
DOCS_DIR=${OPTARG}
;;
esac
done

################################################################################
# SETUP - Check environment
################################################################################
export PROJECT_WORKSPACE=/rapids/cuml
export LIBCUDF_KERNEL_CACHE_PATH="$HOME/.jitify-cache"
export NIGHTLY_VERSION=$(echo $BRANCH_VERSION | awk -F. '{print $2}')
export PROJECTS=(cuml libcuml)

logger "Check environment..."
env
@@ -40,43 +25,51 @@ logger "Check GPU usage..."
nvidia-smi

logger "Activate conda env..."
source activate gdf
conda install -c nvidia -c rapidsai -c rapidsai-nightly -c conda-forge \
cudf=$CUDF_VERSION rmm=$RMM_VERSION cudatoolkit=$CUDA_REL
source activate rapids
# TODO: Move installs to docs-build-env meta package
conda install -c anaconda beautifulsoup4 jq
pip install sphinx-markdown-tables

pip install numpydoc sphinx sphinx-rtd-theme sphinxcontrib-websupport

logger "Check versions..."
python --version
$CC --version
$CXX --version
conda list

################################################################################
# BUILD - Build libcuml and cuML from source
################################################################################
# Build Doxygen docs
logger "Build Doxygen docs..."
cd $PROJECT_WORKSPACE/cpp/build
make doc

# Build Python docs
logger "Build Sphinx docs..."
cd $PROJECT_WORKSPACE/docs
make html

#Commit to Website
cd $DOCS_WORKSPACE

cd $WORKSPACE
git submodule update --init --recursive
for PROJECT in ${PROJECTS[@]}; do
if [ ! -d "api/$PROJECT/$BRANCH_VERSION" ]; then
mkdir -p api/$PROJECT/$BRANCH_VERSION
fi
rm -rf $DOCS_WORKSPACE/api/$PROJECT/$BRANCH_VERSION/*
done

logger "Build libcuml..."
$WORKSPACE/build.sh clean libcuml cuml

################################################################################
# BUILD - Build doxygen docs
################################################################################
mv $PROJECT_WORKSPACE/cpp/build/html/* $DOCS_WORKSPACE/api/libcuml/$BRANCH_VERSION
mv $PROJECT_WORKSPACE/docs/build/html/* $DOCS_WORKSPACE/api/cuml/$BRANCH_VERSION

cd $WORKSPACE/cpp/build
logger "Build doxygen docs..."
make doc
# Customize HTML documentation
./update_symlinks.sh $NIGHTLY_VERSION
./customization/lib_map.sh

################################################################################
# BUILD - Build docs
################################################################################

logger "Build docs..."
cd $WORKSPACE/docs
make html
for PROJECT in ${PROJECTS[@]}; do
echo ""
echo "Customizing: $PROJECT"
./customization/customize_docs_in_folder.sh api/$PROJECT/ $NIGHTLY_VERSION
git add $DOCS_WORKSPACE/api/$PROJECT/*
done

rm -rf ${DOCS_DIR}/*
mv build/html/* $DOCS_DIR
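
Two steps in the docs script above are easy to misread: deriving `NIGHTLY_VERSION` from `BRANCH_VERSION` with `awk`, and emptying each `api/$PROJECT/$BRANCH_VERSION` directory before new HTML is moved in. The following is a rough sketch of both, assuming a `BRANCH_VERSION` of the form `0.15`; `minor_version` and `prepare_api_dirs` are hypothetical names, and the sketch works in a temporary directory instead of a real docs checkout.

```shell
#!/bin/sh
# Illustrative sketch only: function names and sample values are hypothetical.

# Print the second dot-separated field of a version string.
minor_version() {
    echo "$1" | awk -F. '{print $2}'
}

# Create (or empty) api/<project>/<version> under a docs checkout.
prepare_api_dirs() {
    docs_root=$1
    version=$2
    shift 2
    for project in "$@"; do
        mkdir -p "$docs_root/api/$project/$version"
        # ${docs_root:?} aborts instead of expanding to "/api/..." if the root is empty.
        rm -rf "${docs_root:?}/api/$project/$version"/*
    done
}

# Example run against a throwaway directory.
demo_root=$(mktemp -d)
prepare_api_dirs "$demo_root" "0.15" cuml libcuml
echo "nightly version: $(minor_version 0.15)"
ls "$demo_root/api"
rm -rf "$demo_root"
```

Quoting the paths and guarding the root with `${docs_root:?}` are additions in this sketch; the script in the diff uses unquoted variables.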
5 changes: 3 additions & 2 deletions ci/gpu/build.sh
@@ -58,9 +58,10 @@ conda install -c conda-forge -c rapidsai -c rapidsai-nightly -c nvidia \
"distributed>=2.12.0" \
"dask-cudf=${MINOR_VERSION}" \
"dask-cuda=${MINOR_VERSION}" \
-"ucx-py=${MINOR_VERSION}" \
+"ucx-py=0.14*" \
"statsmodels" \
-"xgboost====1.0.2dev.rapidsai0.13" \
+"xgboost==1.0.2dev.rapidsai0.13" \
+"psutil" \
"lightgbm"


18 changes: 12 additions & 6 deletions conda/environments/cuml_dev_cuda10.0.yml
@@ -11,8 +11,8 @@ dependencies:
- cmake=3.14.5
- numba>=0.46
- cupy>=7,<8.0.0a0
- cudf=0.14*
- rmm=0.14*
- cudf=0.15*
- rmm=0.15*
- cython>=0.29,<0.30
- pytest>=4.6
- pytest-timeout
@@ -21,15 +21,21 @@ dependencies:
- scikit-learn>=0.21
- dask>=2.12.0
- distributed>=2.12.0
- dask-cuda=0.14*
- dask-cudf=0.14*
- ucx-py=0.14*
- dask-cuda=0.15*
- dask-cudf=0.15*
- ucx-py=0.15*
- nccl>=2.5
- libcumlprims=0.14*
- libcumlprims=0.15*
- statsmodels
- protobuf >=3.4.1,<4.0.0
- doxygen
- sphinx
- sphinx_rtd_theme
- numpydoc
- nbsphinx
- recommonmark
- pip
- pip:
- sphinx_markdown_tables
- git+https://github.com/dask/dask.git
- git+https://github.com/dask/distributed.git
18 changes: 12 additions & 6 deletions conda/environments/cuml_dev_cuda10.1.yml
@@ -11,8 +11,8 @@ dependencies:
- cmake=3.14.5
- numba>=0.46
- cupy>=7,<8.0.0a0
- cudf=0.14*
- rmm=0.14*
- cudf=0.15*
- rmm=0.15*
- cython>=0.29,<0.30
- pytest>=4.6
- pytest-timeout
@@ -21,15 +21,21 @@ dependencies:
- scikit-learn>=0.21
- dask>=2.12.0
- distributed>=2.12.0
- dask-cuda=0.14*
- dask-cudf=0.14*
- ucx-py=0.14*
- dask-cuda=0.15*
- dask-cudf=0.15*
- ucx-py=0.15*
- nccl>=2.5
- libcumlprims=0.14*
- libcumlprims=0.15*
- statsmodels
- protobuf >=3.4.1,<4.0.0
- doxygen
- sphinx
- sphinx_rtd_theme
- numpydoc
- nbsphinx
- recommonmark
- pip
- pip:
- sphinx_markdown_tables
- git+https://github.com/dask/dask.git
- git+https://github.com/dask/distributed.git
18 changes: 12 additions & 6 deletions conda/environments/cuml_dev_cuda10.2.yml
@@ -11,8 +11,8 @@ dependencies:
- cmake=3.14.5
- numba>=0.46
- cupy>=7,<8.0.0a0
- cudf=0.14*
- rmm=0.14*
- cudf=0.15*
- rmm=0.15*
- cython>=0.29,<0.30
- pytest>=4.6
- pytest-timeout
@@ -21,15 +21,21 @@ dependencies:
- scikit-learn>=0.21
- dask>=2.12.0
- distributed>=2.12.0
- dask-cuda=0.14*
- dask-cudf=0.14*
- ucx-py=0.14*
- dask-cuda=0.15*
- dask-cudf=0.15*
- ucx-py=0.15*
- nccl>=2.5
- libcumlprims=0.14*
- libcumlprims=0.15*
- statsmodels
- protobuf >=3.4.1,<4.0.0
- doxygen
- sphinx
- sphinx_rtd_theme
- numpydoc
- nbsphinx
- recommonmark
- pip
- pip:
- sphinx_markdown_tables
- git+https://github.com/dask/dask.git
- git+https://github.com/dask/distributed.git
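
The three environment files above receive the same `0.14*` to `0.15*` bump on their RAPIDS pins. As a hedged illustration of how such a bump could be applied uniformly, the sketch below rewrites trailing `=OLD*` pins with `sed` and prints the result. `bump_pins` is a hypothetical helper, not a script from the repository, and it assumes each pin sits at the end of its line.

```shell
#!/bin/sh
# Illustrative sketch only: bump_pins is a hypothetical helper.

# bump_pins OLD NEW FILE...: print FILE contents with "pkg=OLD*" pins
# rewritten to "pkg=NEW*". Lines without such a pin pass through unchanged.
bump_pins() {
    old=$1
    new=$2
    shift 2
    # Escape dots so the old version matches literally in the regex.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    sed "s/=${old_re}\*\$/=${new}*/" "$@"
}

# Example run on a small sample file.
sample=$(mktemp)
printf '%s\n' '- cudf=0.14*' '- rmm=0.14*' '- numba>=0.46' > "$sample"
bump_pins 0.14 0.15 "$sample"
rm -f "$sample"
```

Because the pattern is anchored on `=` and a trailing `*`, range constraints such as `numba>=0.46` are left alone. A blanket rewrite like this would also touch any pin that is meant to stay behind, so the output is worth reviewing before it replaces a file.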