Fast, generic implementations of Dynamic Continuous Indexing (DCI) and Prioritized Dynamic Continuous Indexing (PDCI).
Additionally, includes matrix hashes for several kinds of dissimilarity measures. These include:
- L2 distance (both E2LSH and P-stable LSHashing)
- L1 distance (P-stable)
- Total Variation Distance (a special case of P-stable L1)
- Lp distance, 1 < p < 2, using CMS (Chambers-Mallows-Stuck) sampling
- Jensen-Shannon Divergence
- S2JSD (Jensen-Shannon Metric, the square root of the JSD)
- Hellinger Distance
These can be used either in a table, such as dci::hash::LSHTable, or for DCI.
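
To make the idea of a p-stable hash feeding a table concrete, here is a minimal sketch. It is not this library's API: `PStableHasher`, `ToyLSHTable`, `Stable`, and all parameters (`dim`, `k`, `w`, `seed`) are hypothetical names chosen for illustration. It assumes the textbook p-stable scheme, h(x) = floor((a . x + b) / w), with Gaussian projections for L2 and Cauchy projections for L1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only; these types are NOT part of the dci API.
// Gaussian entries are 2-stable (L2); Cauchy entries are 1-stable (L1).
enum class Stable { Gaussian, Cauchy };

class PStableHasher {
    std::vector<std::vector<double>> a_; // k projection vectors of length dim
    std::vector<double> b_;              // k random offsets drawn from [0, w)
    double w_;                           // bucket width
public:
    PStableHasher(std::size_t dim, std::size_t k, double w, Stable kind,
                  std::uint64_t seed)
        : a_(k, std::vector<double>(dim)), b_(k), w_(w) {
        std::mt19937_64 rng(seed);
        std::normal_distribution<double> gauss(0.0, 1.0);
        std::cauchy_distribution<double> cauchy(0.0, 1.0);
        std::uniform_real_distribution<double> offset(0.0, w);
        for (std::size_t i = 0; i < k; ++i) {
            for (auto &v : a_[i])
                v = (kind == Stable::Gaussian) ? gauss(rng) : cauchy(rng);
            b_[i] = offset(rng);
        }
    }
    // Returns k integer bucket indices, one per projection.
    std::vector<std::int64_t> hash(const std::vector<double> &x) const {
        std::vector<std::int64_t> out(a_.size());
        for (std::size_t i = 0; i < a_.size(); ++i) {
            assert(x.size() == a_[i].size());
            double dot = 0.0;
            for (std::size_t j = 0; j < x.size(); ++j) dot += a_[i][j] * x[j];
            out[i] = static_cast<std::int64_t>(std::floor((dot + b_[i]) / w_));
        }
        return out;
    }
};

// Toy single table: points whose k bucket indices all match share a bucket.
class ToyLSHTable {
    PStableHasher hasher_;
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> buckets_;
    // FNV-1a-style mixing of the k indices into one 64-bit key.
    static std::uint64_t combine(const std::vector<std::int64_t> &h) {
        std::uint64_t key = 0xcbf29ce484222325ULL;
        for (auto v : h) {
            key ^= static_cast<std::uint64_t>(v);
            key *= 0x100000001b3ULL;
        }
        return key;
    }
public:
    explicit ToyLSHTable(PStableHasher hasher) : hasher_(std::move(hasher)) {}
    void insert(std::size_t id, const std::vector<double> &x) {
        buckets_[combine(hasher_.hash(x))].push_back(id);
    }
    // Returns the ids stored in the query's bucket (possibly empty).
    std::vector<std::size_t> query(const std::vector<double> &x) const {
        auto it = buckets_.find(combine(hasher_.hash(x)));
        return it == buckets_.end() ? std::vector<std::size_t>{} : it->second;
    }
};
```

In this sketch a larger `k` makes buckets more selective and a larger `w` makes them more permissive. A practical setup would typically use several such tables with independent seeds and merge their candidates.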
The Fast Randomized Projections project, in which this was originally developed, also has an FHTHasher, which computes the projections using the FHT and is compatible with this library.
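
For readers unfamiliar with the transform, the sketch below assumes "FHT" refers to the fast (Walsh-)Hadamard transform. It is a plain, unnormalized, in-place version written for illustration; the function name `fht` is hypothetical, and this is not the FHTHasher implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative in-place, unnormalized fast Walsh-Hadamard transform.
// Assumption: data.size() is a nonzero power of two.
// Runs in O(n log n) instead of the O(n^2) of a dense matrix product.
void fht(std::vector<double> &data) {
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    for (std::size_t len = 1; len < n; len <<= 1) {
        for (std::size_t i = 0; i < n; i += 2 * len) {
            for (std::size_t j = i; j < i + len; ++j) {
                // Butterfly: replace the pair with its sum and difference.
                const double u = data[j];
                const double v = data[j + len];
                data[j] = u + v;
                data[j + len] = u - v;
            }
        }
    }
}
```

Applying the transform twice returns the input scaled by its length, which is a convenient sanity check. Projection schemes built on it commonly randomize the input (for example with random sign flips) before transforming, but that step is omitted here.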