This module is a prototype that aims to complete the implementation of the XNOR kernel in CUDA, with a TensorFlow interface.

It is heavily inspired by the original Theano implementation by Matthieu Courbariaux.

Major features:

  1. Supports matrices of arbitrary size.
  2. Comes with TensorFlow bindings.
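
For readers unfamiliar with the idea, the general XNOR-popcount technique behind binary matrix multiplication can be sketched in plain Python. This is not the CUDA kernel from this repo: the function names (`pack_bits`, `xnor_dot`, `xnor_matmul`) are hypothetical, and the packing scheme (+1 stored as bit 1, -1 as bit 0) is an assumption made for illustration. The mask in `xnor_dot` shows one way an arbitrary vector length can be handled when it does not fill a whole machine word.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one integer (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where both signs agree
    matches = bin(agree).count("1")     # popcount
    # Each agreement contributes +1, each disagreement -1.
    return 2 * matches - n


def xnor_matmul(A, B):
    """Multiply A (m x k) by B (k x n), both given as lists of +1/-1 rows."""
    k = len(B)
    rows = [pack_bits(r) for r in A]
    cols = [pack_bits(c) for c in zip(*B)]  # pack the columns of B
    return [[xnor_dot(r, c, k) for c in cols] for r in rows]


if __name__ == "__main__":
    A = [[1, -1, 1], [-1, -1, 1]]
    B = [[1, 1], [-1, 1], [-1, -1]]
    print(xnor_matmul(A, B))
```

A GPU kernel would typically pack 32 or 64 values per word and use a hardware popcount instruction, but the arithmetic identity (dot product = 2 × matches − length) is the same.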

Speedup

The results were generated with the IPython notebook that is also in this repo. The benchmark was run with CUDA 7.5 and cuDNN v4 on a Titan Black GPU and an Intel Core i7-5820K CPU.

Figure: Speedup comparison with cuBLAS

Note: This code is probably not the most optimized, since it is my first CUDA program. Suggestions are welcome.