Prefix tree function macros for creating a prefix tree on a big data system and using an edit distance algorithm to query it

Charles-Kaminski/PrefixTree

PrefixTree

CFK 02/23/2016 [email protected]

This project is a code bundle for the HPCC Systems big data platform. It shows how to build and use prefix trees on a big data platform, and demonstrates how a prefix tree can significantly improve the performance of certain algorithms, such as the Levenshtein edit distance algorithm. The bundle consists of three function macros that do the heavy lifting for you:

  1. Create - Efficiently creates a prefix tree from a dataset
  2. QueryThorLevenshtein - Uses a dataset to query a prefix tree using Levenshtein
  3. QueryRoxieLevenshtein - Uses a string to query a prefix tree using Levenshtein
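
To illustrate why a prefix tree helps with Levenshtein queries, here is a minimal sketch in Python. It is not the bundle's ECL code and does not reflect the macros' actual signatures; the names `TrieNode`, `build_trie`, and `search` are hypothetical and chosen for this example only. The idea shown is a common one: words that share a prefix share the same edit-distance rows, so each row is computed once per tree node, and a whole subtree can be skipped as soon as no entry in the current row is within the allowed distance.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of the prefix tree (hypothetical name for this sketch)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set only where a stored word ends


def build_trie(words: List[str]) -> TrieNode:
    """Insert every word, sharing nodes between words with a common prefix."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def search(root: TrieNode, query: str, max_dist: int) -> List[Tuple[str, int]]:
    """Return (word, distance) pairs with Levenshtein distance <= max_dist."""
    results: List[Tuple[str, int]] = []
    # Row for the empty prefix: distance from "" to each prefix of the query.
    first_row = list(range(len(query) + 1))

    def walk(node: TrieNode, ch: str, prev_row: List[int]) -> None:
        # Extend the parent's row by one character of the tree path.
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            insert_cost = row[i - 1] + 1
            delete_cost = prev_row[i] + 1
            replace_cost = prev_row[i - 1] + (query[i - 1] != ch)
            row.append(min(insert_cost, delete_cost, replace_cost))
        if node.word is not None and row[-1] <= max_dist:
            results.append((node.word, row[-1]))
        # Prune: if every entry exceeds max_dist, no descendant can match.
        if min(row) <= max_dist:
            for next_ch, child in node.children.items():
                walk(child, next_ch, row)

    for ch, child in root.children.items():
        walk(child, ch, first_row)
    return sorted(results)


if __name__ == "__main__":
    tree = build_trie(["cat", "cart", "car", "dog", "coat"])
    print(search(tree, "cat", 1))
```

In this sketch the subtree under `d` is abandoned after a few nodes because its rows quickly exceed the distance limit, which is the kind of saving a prefix tree offers over comparing the query against every word individually.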

Usage examples are at the end of the file PrefixTree.ecl.

Read the original code walk-through on the HPCC Systems blog: https://hpccsystems.com/resources/blog?uid=225

Read how to install code bundles onto the HPCC platform: https://github.com/hpcc-systems/HPCC-Platform/blob/master/ecl/ecl-bundle/BUNDLES.rst#installing-a-bundle

Read about other bundles: https://github.com/hpcc-systems/ecl-bundles
