A library written in TypeScript for finding all common substrings in a set of strings, usable from JavaScript and Node.js. It is particularly fast on large string samples, works in both browser and Node.js environments, and has no dependencies.
The easiest way to start is:
import substrings from 'common-substrings';
const result = substrings(stringArray, {
  minOccurrence: 3,
  minLength: 5,
});
The result is an array of objects. Each element in the array includes:
source: the indices of the input strings (labels) that contain this fragment
name: the fragment itself
weight: the product of the fragment length and the number of occurrences of the fragment
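As a rough illustration of the fields described above, the shape of one result element can be sketched as a TypeScript interface. The names CommonSubstring and expectedWeight are hypothetical and chosen only for this example; the package may ship its own typings. The sketch also assumes that the occurrence count equals the number of entries in source.

```typescript
// Hypothetical type name; not necessarily what the package exports.
interface CommonSubstring {
  name: string;      // the fragment text
  source: number[];  // indices of the input strings containing the fragment
  weight: number;    // fragment length * number of occurrences
}

// Recompute the weight from the description above.
// Assumption: one occurrence is counted per index listed in `source`.
function expectedWeight(name: string, source: number[]): number {
  return name.length * source.length;
}

const sample: CommonSubstring = {
  name: 'java',
  source: [0, 1],
  weight: expectedWeight('java', [0, 1]), // 4 characters * 2 occurrences
};
```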
Given the array ['java', 'javascript', 'pythonscript'] and the default options, the result array is:
[
  {name : 'java', source : [0,1], weight : 8},
  {name : 'script', source : [1,2], weight : 12}
]
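The source indices in this example can be checked by hand with a small helper. This is only a brute-force sketch for verifying the example, not the trie-based algorithm the library uses; indicesContaining is a hypothetical name introduced here.

```typescript
const input: string[] = ['java', 'javascript', 'pythonscript'];

// Return the indices of all strings that contain `fragment`.
// Hypothetical helper for checking the example; not part of the package.
function indicesContaining(strings: string[], fragment: string): number[] {
  const indices: number[] = [];
  strings.forEach((s, i) => {
    if (s.indexOf(fragment) !== -1) {
      indices.push(i);
    }
  });
  return indices;
}

const javaSources = indicesContaining(input, 'java');     // [0, 1]
const scriptSources = indicesContaining(input, 'script'); // [1, 2]
```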
The default options are:
minLength: 3
minOccurrence: 2
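One common way to combine such defaults with caller-supplied options is an object spread. The sketch below only illustrates the idea; Options, DEFAULTS, and resolveOptions are hypothetical names, and the library's internal handling of options may differ.

```typescript
// Hypothetical option type mirroring the two documented settings.
interface Options {
  minLength: number;
  minOccurrence: number;
}

// Default values as documented above.
const DEFAULTS: Options = { minLength: 3, minOccurrence: 2 };

// Merge caller overrides over the defaults; later properties win.
function resolveOptions(overrides: Partial<Options> = {}): Options {
  return { ...DEFAULTS, ...overrides };
}

const onlyLength = resolveOptions({ minLength: 5 });
// onlyLength.minLength is 5, onlyLength.minOccurrence stays at the default 2
```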
The result is gathered by walking the trie from its leaves upward, so it is not sorted. Sorting it is easy with the lodash sortBy function, for example:
const resultSortByWeight = _.sortBy(result, ['weight']);
const resultSortByLength = _.sortBy(result, substring => substring.name.length);
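If lodash is not available, the same ordering can be sketched with the built-in Array.prototype.sort. Note that lodash sortBy sorts in ascending order; the example below also shows a descending sort by weight. The Fragment type and the sample data are made up for illustration and are not output from the library.

```typescript
// Hypothetical element type matching the documented result fields.
interface Fragment {
  name: string;
  source: number[];
  weight: number;
}

// Made-up sample data, used only to demonstrate sorting.
const sampleResult: Fragment[] = [
  { name: 'abcde', source: [0, 1, 2], weight: 15 },
  { name: 'abc', source: [0, 1], weight: 6 },
  { name: 'abcd', source: [0, 1, 2, 3, 4], weight: 20 },
];

// Copy with slice() first, because sort() mutates the array in place.
const byWeightAsc = sampleResult.slice().sort((a, b) => a.weight - b.weight);
const byWeightDesc = sampleResult.slice().sort((a, b) => b.weight - a.weight);
const byLengthAsc = sampleResult.slice().sort((a, b) => a.name.length - b.name.length);
```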
Explanation here
The algorithm code is released under the MIT License.