Orion34-lanbo/binary_ops

 
 


This module is a prototype that completes the implementation of the XNOR kernel on CUDA, with a TensorFlow interface.

It is heavily inspired by the original Theano implementation by Matthieu Courbariaux.

Major features:

  1. Supports matrices of arbitrary size.
  2. Comes with TensorFlow bindings.
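For readers new to the idea, here is a minimal pure-Python sketch of the XNOR-and-popcount trick that a binary matrix multiply relies on. It is not the CUDA kernel from this repo: the function names (`pack_bits`, `xnor_dot`, `xnor_matmul`) and the bit encoding (+1 stored as bit 1, -1 as bit 0) are assumptions made for illustration. The mask in `xnor_dot` shows one way to handle lengths that do not fill a whole machine word, which is what arbitrary matrix sizes require.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (assumed encoding: +1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1                  # keep only the n valid bits
    agree = ~(a_bits ^ b_bits) & mask    # XNOR: 1 wherever the signs match
    matches = bin(agree).count("1")      # popcount
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


def xnor_matmul(A, B):
    """Multiply an m x k matrix A by a k x n matrix B, both with +1/-1 entries."""
    k = len(B)
    rows = [pack_bits(row) for row in A]
    cols = [pack_bits([B[i][j] for i in range(k)]) for j in range(len(B[0]))]
    return [[xnor_dot(r, c, k) for c in cols] for r in rows]
```

A GPU kernel would typically pack 32 or 64 values per word and use a hardware popcount instruction instead of Python integers, but the arithmetic identity is the same.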

Speed up

Generated with the IPython notebook that is also in this repo. The benchmark was run with CUDA 7.5 and cuDNN v4 on a Titan Black GPU and an Intel Core i7-5820K CPU.

Figure: speed-up comparison with cuBLAS

Note: This code is probably not the most optimized, since it is my first CUDA program. Suggestions are welcome.
