A fast parallel implementation of the RNN Transducer (Graves 2013 joint network), on both CPU and GPU.
A GPU implementation is now also available for the Graves 2012 add network.
Benchmarked on a GeForce GTX 1080 Ti GPU (N = batch size, T = input length, L = label length, A = alphabet size).
| T=150, L=40, A=28 | warp-transducer |
| --- | --- |
| N=1 | 8.51 ms |
| N=16 | 11.43 ms |
| N=32 | 12.65 ms |
| N=64 | 14.75 ms |
| N=128 | 19.48 ms |

| T=150, L=20, A=5000 | warp-transducer |
| --- | --- |
| N=1 | 4.79 ms |
| N=16 | 24.44 ms |
| N=32 | 41.38 ms |
| N=64 | 80.44 ms |
| N=128 | 51.46 ms |
The interface is in include/rnnt.h. It supports CPU or GPU execution, and you can specify OpenMP parallelism if running on the CPU, or the CUDA stream if running on the GPU. We took care to ensure that the library does not perform memory allocation internally, in order to avoid synchronizations and overheads caused by memory allocation.
Please be careful if you use the CPU version of RNNTLoss: log_softmax should be called manually before the loss function. (For the PyTorch binding, this is optionally handled based on the tensor device.)
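For orientation, here is a minimal sketch of the intended call pattern, assuming warp-ctc-style entry points (get_workspace_size, compute_rnnt_loss) and an rnntOptions struct; these names and parameter lists are assumptions on my part, so consult include/rnnt.h for the authoritative declarations.

```c
/* Sketch only: the function names and parameter lists below are assumed
 * from the warp-ctc style of this API; check include/rnnt.h for the
 * authoritative declarations. */
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include "rnnt.h"

/* acts: joint-network outputs of shape (minibatch, maxT, maxU, alphabet_size);
 * for the CPU version they are expected to already be log_softmax outputs.
 * costs: one loss value per utterance, length = minibatch. */
void rnnt_loss_example(const float* acts, float* grads,
                       const int* flat_labels, const int* label_lengths,
                       const int* input_lengths,
                       int alphabet_size, int minibatch,
                       int maxT, int maxU, float* costs)
{
    rnntOptions options;
    memset(&options, 0, sizeof(options));
    options.loc = RNNT_CPU;     /* or RNNT_GPU, with options.stream set */
    options.num_threads = 4;    /* OpenMP threads used by the CPU path */
    options.blank_label = 0;
    options.maxT = maxT;
    options.maxU = maxU;

    /* The library performs no internal allocation: query how much scratch
     * space it needs and hand it a caller-owned buffer. */
    size_t workspace_bytes = 0;
    get_workspace_size(maxT, maxU, minibatch, false /* gpu */, &workspace_bytes);
    void* workspace = malloc(workspace_bytes);

    compute_rnnt_loss(acts, grads, flat_labels, label_lengths, input_lengths,
                      alphabet_size, minibatch, costs, workspace, options);

    free(workspace);
}
```

The same pattern would apply on the GPU path: set options.loc to RNNT_GPU, pass a CUDA stream in options.stream, and allocate the workspace on the device (e.g. with cudaMalloc) rather than with malloc.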
warp-transducer has been tested on Ubuntu 16.04 and CentOS 7. Windows is not supported at this time.
First get the code:
git clone https://github.com/HawkAaron/warp-transducer
cd warp-transducer
Create a build directory:
mkdir build
cd build
If you have a non-standard CUDA install, add the -DCUDA_TOOLKIT_ROOT_DIR=/path/to/cuda option to cmake so that CMake detects CUDA.
Run cmake and build:
cmake -DCUDA_TOOLKIT_ROOT_DIR=$CUDA_HOME ..
make
If the build output contains
-- cuda found TRUE
-- Building shared library with no GPU support
please run rm CMakeCache.txt and then run cmake again.
The C library should now be built, along with test executables. If CUDA was detected, test_gpu will be built as well; test_cpu is always built.
To run the tests, make sure the CUDA libraries are in LD_LIBRARY_PATH
(DYLD_LIBRARY_PATH for OSX).
We welcome improvements from the community; please feel free to submit pull requests.