This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
merge with 38f7c55

compiles on GPU
update check alloc: Checkpoint. Pass elem-sum gpu test
bug fix for copyfromto. sparse sgd test pass on gpu
inefficient implementation for csr copy
update submodule
fix lint

Simple bind with infer storage type (#32)
* Symbol binding for sparse tensor development. (#31)
* Initial checkin
* Add init functions for simple bind in graph_executor
* Add simple_bind c_api
* Add simple bind c-api
* Assign zeros to in_args, arg_grads, and aux_states
* Add simple_bind2 python interface
* Fix python interface bugs
* Interface changes
* Fix
* Fix core dump
* Add bind_ith_exec c_api
* Change simple_bind2
* Fix seg fault
* Finish simple_bind
* Change _bind_ith_exec
* Refactor simple_bind initialization flow for bind
* Consolidate bind and simple_bind graph init flow
* Fix bug
* Clean up
* Add comments
* Clean up
* Clean up
* Minor correction
* Rename APIs in graph executor
* Refactor
* Rebase
* Delete deprecated functions
* Move more front-end work to backend
* Bug fix
* Fix failed tests
* Minor fix
* Fix lint
* Fix lint
* Revert unnecessary changes
* Revert
* Revert
* Clean up
* Fix lint
  Conflicts:
    python/mxnet/symbol.py
    src/executor/graph_executor.cc
* Add inferstorage to graph executor
* re-enable tests for sparse embedding with simple_bind
* type switch fix in sparse embedding

change `default` to `default_storage` for cast storage op (#33)
* change default to default_storage
* disable cpp test build temporarily

attempt to fix windows build error, and fix lint (#34)

update nnvm submodule (#37)

Scipy build (#38)
* update nnvm submodule
* add scipy pip install for dockerfile

Python3 unit tests (#39)
* change xrange to range for python3 compatibility
* remove more xrange from tests

replace long with int for python3 (#40)

fix the rest of TShape constructor errors (#41)

fix lint (#42)

fix wrong usage of mshadow::Shape1 (#43)

implementation for Csr slice on cpu (#36)
* CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submodule fix lint update unit test for sparse nd
* add const for SliceCsrIndPtr kernel

Fix sparse dot according to the new RSP definition (#35)
* Fix csr dot dns
* Fix sparse dot
* Add fallback and test cases for dot(csr, dns)=dns
* Add int type switch
* Fix
* Fix
* Fix

update mshadow submodule (#44)

Fix dns to rsp (#46)

fix lint (#47)

add runtime storage fallback detection (#48)
* add runtime storage fallback detection
* replace cast storage ex with cast storage impl

Fm example (#45)
* update csr slice logic to avoid confusion. add more examples.
* add hint to module.update
* more testcases(fallback) for sparse_nd
* add to_csr() and to_rsp() method. More unit test (fallback now)
* add fm test. fix lint
* register sparse sgd under Optim.SGD
* update dmlc-core submodule
* change indptr to _indptr temporarily. add const ref to fname fix lint

fix lint; (#51)

Guard gpu cast storage (#50)
* Clean up
* Fix typo

Rearrange unit test files (#52)
fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3

fix travis test (#54)
* remove pyc files
* add verbose for travis nosetests

cleanup some testing code and enums (#57)
* update Makefile
* refactor test_sparse_operator
* change `default_storage` back to `default`
* remove unused cpp tests

port libsvm parser to mxnet as libsvm iter (#55)
* copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption rename variables add comments also read label from libsvm add test. update docs. update submodule
  Conflicts:
    python/mxnet/sparse_ndarray.py
* update submodule
* fix lint
* update test
* revert naming change

add benchmark script for dot (#59)
* add benchmark script for dot add gpu option for bench add get_data function for benchmark print t_sparse, too; add comment change nnz to density add backward
* add comment

update fm test (#62)

introduce CSRNDarray and rowsparseNDarray to python frontend api (#58)
* introduce CSRNDarray and rowsparseNDarray to python frontend api
* temporarily disable fm_module test

fix lint (#64)

fix typo. disable libsvm io test (#65)

Improve dot (#61)
* Init checkin
* Fix
* Adjust dot parallelization methods
* Set num_omp_threads for benchmark from command line
* Fix omp thread number
* Clean up
* Add scipy as dot baseline
* Fix format

sparse_retain op (#66)
* Initial checkin
* Fix bugs
* Add unit test for sparse_retain
* Add example and modify test

add storage cast for outputs that have non-default storage (#67)

fix gpu build (#69)

Fix test_sparse_retain python3 issue (#68)

revert nnvm version
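The commit message above mentions a CPU implementation of CSR slicing and a `SliceCsrIndPtr` kernel. The sketch below is not that kernel; it is a minimal pure-Python illustration of row slicing on the standard CSR layout, assuming the usual `data` / `indices` / `indptr` triple. The function name `csr_slice_rows` is hypothetical and does not come from the commit.

```python
def csr_slice_rows(data, indices, indptr, start, stop):
    """Return the CSR triple for rows [start, stop) of a CSR matrix.

    Hypothetical helper for illustration only; it assumes the standard CSR
    layout where indptr[r]:indptr[r + 1] delimits the non-zeros of row r.
    """
    lo, hi = indptr[start], indptr[stop]
    # Re-base the row pointers so the sliced matrix starts at offset 0.
    new_indptr = [p - lo for p in indptr[start:stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr


# 3x4 example matrix:
# [[1, 0, 2, 0],
#  [0, 0, 3, 0],
#  [4, 5, 0, 6]]
data = [1, 2, 3, 4, 5, 6]
indices = [0, 2, 2, 0, 1, 3]
indptr = [0, 2, 3, 6]

# Keep rows 1 and 2 only.
sliced_data, sliced_indices, sliced_indptr = csr_slice_rows(data, indices, indptr, 1, 3)
print(sliced_data, sliced_indices, sliced_indptr)
```

Only the row-pointer array needs arithmetic; the values and column indices are contiguous sub-ranges of the originals, which is why a slice can be cheap for CSR storage.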
1 parent 49b1513, commit ac54762
Showing 82 changed files with 5,909 additions and 374 deletions.
@@ -0,0 +1,191 @@
import ctypes

import mxnet as mx
import numpy as np
from mxnet.test_utils import *
import scipy.sparse as sp
import os
import time
import argparse

from mxnet.base import check_call, _LIB

parser = argparse.ArgumentParser(description="Benchmark sparse operators",
                                 formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('--num-omp-threads', type=int, default=1, help='number of omp threads to set in MXNet')
args = parser.parse_args()


def get_avazu(data_dir):
    """Download and decompress the avazu-app.t dataset into data_dir if missing."""
    if not os.path.isdir(data_dir):
        os.makedirs(data_dir)
    os.chdir(data_dir)
    if not os.path.exists('avazu-app.t'):
        # urlretrieve moved to urllib.request in Python 3
        try:
            from urllib.request import urlretrieve
        except ImportError:
            from urllib import urlretrieve
        zippath = os.path.join(data_dir, "avazu-app.t.bz2")
        url = "https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/avazu-app.t.bz2"
        urlretrieve(url, zippath)
        # decompress
        os.system("bzip2 -d avazu-app.t.bz2")
    os.chdir("..")


def test_dot_real():
    """Benchmark dot(csr, dns) on batches read from the avazu LibSVM file."""
    def get_iter(path, data_shape, batch_size):
        data_train = mx.io.LibSVMIter(data_libsvm=path,
                                      data_shape=data_shape,
                                      batch_size=batch_size)
        data_iter = iter(data_train)
        return data_iter
    data_dir = os.path.join(os.getcwd(), 'data')
    get_avazu(data_dir)
    path = os.path.join(data_dir, 'avazu-app.t')
    # TODO(haibin) get file size automatically
    size = 336490781 >> 20  # file size in MB

    # model
    batch_size = 512
    feature_dim = 1000000
    data_shape = (feature_dim, )
    train_iter = get_iter(path, data_shape, batch_size)

    k = 500
    weight = mx.nd.random_uniform(low=0, high=1, shape=(feature_dim, k))
    weight.wait_to_read()

    # start workload
    start = time.time()
    results = []
    num_batch = 0
    for batch in train_iter:
        data = train_iter.getdata()
        results.append(mx.nd.dot(data, weight))
        num_batch += 1
    # block until every queued dot has finished before stopping the clock
    for result in results:
        result.wait_to_read()

    end = time.time()
    cost = end - start
    # MB/s, total seconds, number of batches, batches/s
    print(size / cost, cost, num_batch, num_batch / cost)


def test_dot_synthetic():
    """benchmark mx.nd.dot(sparse_ndarray, dense_ndarray) with given density.
    `t_sparse` is the time cost of dot(csr, dns), while `t_dense` is the time cost
    of dot(dns, dns), with the same matrix except that it is in default storage type.
    """
    def measure_cost_forward_baseline(repeat, dot, lhs, rhs):
        start = time.time()
        for i in range(repeat):
            dot(lhs, rhs)
        end = time.time()
        diff = end - start
        return diff / repeat

    def measure_cost_backward_baseline(repeat, dot, transpose, lhs, rhs):
        start = time.time()
        for i in range(repeat):
            dot(transpose(lhs), rhs)
        end = time.time()
        diff = end - start
        return diff / repeat

    def measure_cost(repeat, f, *args, **kwargs):
        # start bench
        start = time.time()
        results = []
        for i in range(repeat):
            results.append(f(*args, **kwargs))
        # block until all results are computed before stopping the clock
        for result in results:
            result.wait_to_read()
        end = time.time()
        diff = end - start
        return diff / repeat

    def bench_dot_forward(m, k, n, density, ctx, repeat):
        set_default_context(ctx)
        dns = mx.nd.random_uniform(shape=(k, n)).copyto(ctx)
        data_shape = (m, k)
        csr_data = rand_ndarray(data_shape, 'csr', density)
        dns_data = csr_data.to_dense()
        rhs_dns_np = dns.asnumpy()
        lhs_csr_sp = sp.csr_matrix(dns_data.asnumpy())  # csr in scipy
        lhs_dns_np = lhs_csr_sp.todense()

        # costs[0] is the dense run, costs[1] is the csr run
        data = [dns_data, csr_data]
        costs = []
        for d in data:
            dns.wait_to_read()
            d.wait_to_read()
            cost = measure_cost(repeat, mx.nd.dot, d, dns)
            # measure_cost already returns the per-iteration average
            costs.append(cost)
        ratio = costs[1] / costs[0]

        costs_baseline = []
        cost = measure_cost_forward_baseline(repeat, np.dot, lhs_dns_np, rhs_dns_np)
        costs_baseline.append(cost)
        cost = measure_cost_forward_baseline(repeat, sp.spmatrix.dot, lhs_csr_sp, rhs_dns_np)
        costs_baseline.append(cost)
        ratio_baseline = costs_baseline[1] / costs_baseline[0]
        fmt = "%0.1f\t\t%s\t%d\t%d\t%d\t%0.6f\t%0.5f\t%0.2f\t\t\t%0.6f\t%0.5f\t\t%0.2f"
        print(fmt % (density * 100, str(ctx), n, m, k, costs[1], costs[0], ratio,
                     costs_baseline[1], costs_baseline[0], ratio_baseline))

    def bench_dot_backward(m, k, n, density, ctx, repeat):
        set_default_context(ctx)
        dns = mx.nd.random_uniform(shape=(m, n)).copyto(ctx)
        data_shape = (m, k)
        csr_data = rand_ndarray(data_shape, 'csr', density)
        dns_data = csr_data.to_dense()
        rhs_dns_np = dns.asnumpy()
        lhs_csr_sp = sp.csr_matrix(dns_data.asnumpy())
        lhs_dns_np = lhs_csr_sp.todense()

        # costs[0] is the dense run, costs[1] is the csr run
        data = [dns_data, csr_data]
        costs = []
        for d in data:
            dns.wait_to_read()
            d.wait_to_read()
            cost = measure_cost(repeat, mx.nd.dot, d, dns, transpose_a=True)
            costs.append(cost)
        ratio = costs[1] / costs[0]

        costs_baseline = []
        cost = measure_cost_backward_baseline(repeat, np.dot, np.transpose, lhs_dns_np, rhs_dns_np)
        costs_baseline.append(cost)
        cost = measure_cost_backward_baseline(repeat, sp.spmatrix.dot, sp.spmatrix.transpose, lhs_csr_sp, rhs_dns_np)
        costs_baseline.append(cost)
        ratio_baseline = costs_baseline[1] / costs_baseline[0]
        fmt = "%0.1f\t\t%s\t%d\t%d\t%d\t%0.6f\t%0.5f\t%0.2f\t\t\t%0.6f\t%0.5f\t\t%0.2f"
        print(fmt % (density * 100, str(ctx), n, m, k, costs[1], costs[0], ratio,
                     costs_baseline[1], costs_baseline[0], ratio_baseline))

    print("A = sparse NDArray of shape(m, k)")
    print("B = dense NDArray of shape(k, n)")
    print("dot_forward\tdot(csr, dns)")
    print('density(%)\tcontext\tn\tm\tk\tt_sparse\tt_dense\tt_sparse/t_dense'
          '\tt_scipy_sparse\tt_scipy_dense\tt_scipy_sparse/t_scipy_dense')

    check_call(_LIB.MXSetNumOMPThreads(ctypes.c_int(args.num_omp_threads)))
    # TODO(haibin) make these runtime options
    m = 512
    k = [50000, 100000]
    n = [50, 100]
    density = [0.05, 0.02, 0.01, 0.005, 0.001]
    num_repeat = 10
    # contexts = [mx.cpu(), mx.gpu(0)]
    contexts = [mx.cpu()]
    for i in range(2):
        for ctx in contexts:
            for den in density:
                bench_dot_forward(m, k[i], n[i], den, ctx, num_repeat)

    print("dot_backward\tdot(csr.T, dns)")
    print('density(%)\tcontext\tn\tm\tk\tt_sparse\tt_dense\tt_sparse/t_dense'
          '\tt_scipy_sparse\tt_scipy_dense\tt_scipy_sparse/t_scipy_dense')
    for i in range(2):
        for ctx in contexts:
            for den in density:
                bench_dot_backward(m, k[i], n[i], den, ctx, num_repeat)


if __name__ == "__main__":
    test_dot_real()
    test_dot_synthetic()
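The benchmark above times two products: the forward `dot(csr, dns)` and the backward `dot(csr.T, dns)` selected with `transpose_a=True`. To make clear what is being computed, here is a minimal pure-Python sketch of both products on the standard CSR layout, plus a timing helper in the same spirit as `measure_cost`. It is an illustration under stated assumptions, not MXNet's kernel: the names `csr_dot_dense` and `time_per_call` are hypothetical, and the sketch uses plain lists instead of NDArrays, so there is no asynchronous execution and no `wait_to_read()` step.

```python
import time


def csr_dot_dense(data, indices, indptr, num_cols, rhs, transpose_a=False):
    """Compute dot(A, rhs), or dot(A.T, rhs) when transpose_a is True.

    A is an (m, num_cols) matrix given as a CSR triple; rhs is a dense
    matrix stored as a list of rows. Hypothetical helper for illustration.
    """
    m = len(indptr) - 1
    n = len(rhs[0])
    out_rows = num_cols if transpose_a else m
    out = [[0.0] * n for _ in range(out_rows)]
    for row in range(m):
        for pos in range(indptr[row], indptr[row + 1]):
            col, val = indices[pos], data[pos]
            # Forward:    out[row] += A[row, col] * rhs[col]
            # Transposed: out[col] += A[row, col] * rhs[row]
            dst, src = (col, row) if transpose_a else (row, col)
            for j in range(n):
                out[dst][j] += val * rhs[src][j]
    return out


def time_per_call(repeat, f, *args, **kwargs):
    """Average wall-clock seconds per call of f over `repeat` calls."""
    start = time.perf_counter()
    for _ in range(repeat):
        f(*args, **kwargs)
    return (time.perf_counter() - start) / repeat


# A = [[1, 0, 2],
#      [0, 3, 0]]  with shape (m=2, k=3)
data = [1.0, 2.0, 3.0]
indices = [0, 2, 1]
indptr = [0, 2, 3]

B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape (k=3, n=2)
G = [[1.0, 2.0], [3.0, 4.0]]              # shape (m=2, n=2)

forward = csr_dot_dense(data, indices, indptr, 3, B)
backward = csr_dot_dense(data, indices, indptr, 3, G, transpose_a=True)
print(forward)   # shape (m, n)
print(backward)  # shape (k, n)
print(time_per_call(10, csr_dot_dense, data, indices, indptr, 3, B))
```

Both loops visit only the stored non-zeros, which is why the sparse timings in the benchmark are expected to scale with density rather than with `m * k`. The transposed product scatters into output rows indexed by column, so it needs no explicit transpose of the CSR matrix.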
Submodule dmlc-core updated (8 files):

| Lines changed | File |
|---|---|
| +18 −5 | include/dmlc/data.h |
| +17 −0 | include/dmlc/endian.h |
| +2 −2 | src/data/row_block.h |
| +46 −14 | src/io/s3_filesys.cc |
| +2 −0 | src/io/s3_filesys.h |
| +1 −0 | test/filesys_test.cc |
| +0 −4 | tracker/dmlc_tracker/ssh.py |
| +0 −2 | tracker/dmlc_tracker/yarn.py |