Skip to content

mjvh80/LevenshteinDFA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LevenshteinDFA

Port's Lucene's Levenshtein DFA to .NET.

This is a fairly straightforward port with most parts ported to C#/.NET.

Example usage:

// Construct:
var auto = new LevenshteinAutomata("foobar", true).toAutomaton(1 /* distance 1 */);
auto = Operations.determinize(auto);

// Optionally minimize (removes cruft etc):
auto = MinimizationOperations.minimize(auto);

// Optimize for matching:
var compiled = new CompiledAutomaton(auto);

// Run/match:
Assert.True(compiled.Matches("foebar"));

###Completely Unofficial Benchmarks###

The following is some (not very rigorous timing) on a reasonable laptop running Windows 8.1.

Running

var timer = Stopwatch.StartNew();

for (var j = 0; j < numIterations; j++)
{
    var auto2 = new LevenshteinAutomata(input, true).toAutomaton(1);
    auto2 = Operations.determinize(auto2);
    var comp = new CompiledAutomaton(auto2);

    Parallel.For(0, words.Length, i =>
    {
       if (comp.Matches(wordCps[i]))
          Interlocked.Increment(ref hitCount);
    });
}

Here wordCps is a list of words that have been pre-computed as "codepoints".

The above executes in roughly 560 microseconds for a word list of 36,000 entries (averaged with over numIterations = 10000). This will depend on input but should give some indication of performance.

About

Port's Lucene's Levenshtein DFA to .NET

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages