This project is a dictionary-based lemmatizer written in pure Go, without external dependencies.
A lemmatizer is a tool that finds the base form (the lemma) of a word, for example:
Language | Input | Output |
---|---|---|
English | aligning | align |
Swedish | sprungit | springa |
French | abattaient | abattre |
It's based on the dictionaries found on lexiconista.com, which are available under the Open Database License. This project would not be feasible without them.
At the moment I have added English, Swedish, French, Spanish and German. Adding another language should be no more trouble than getting the dictionary for that language, and some of those dictionaries are already available on lexiconista.com. Please let me know if there is a language you would like to see in here, or fork the project and create a pull request.
package main

import (
	"github.com/aaaton/golem"
)

func main() {
	// Both "en" and "english" give an English lemmatizer.
	lemmatizer, err := golem.New("english")
	if err != nil {
		panic(err)
	}
	// Lemma returns the base form of the given word.
	word := lemmatizer.Lemma("Abducting")
	if word != "abduct" {
		panic("The output is not what is expected!")
	}
}