link-generator defaults to Lithuanian #1499

Open
ryandesign opened this issue Apr 21, 2024 · 1 comment

Comments

@ryandesign
Contributor

I'm using link-grammar for the first time, without knowing anything about it or having read much of the documentation. I built version 5.12.4 from source on macOS 12.

I ran link-generator with no arguments and it said:

% link-generator
#
# Corpus for language: "lt"
# Sentence length: 6
# Requested number of linkages: 500
# Requested number to print: 20
link-grammar: Info: Dictionary found at /opt/local/share/link-grammar/lt/4.0.dict
link-grammar: Info: lt: Spell checker disabled.
# Dictionary version 5.11.0
# Number of categories: 431
# Linkages found: 141388
# Linkages generated: 389
# Number of unused disjuncts: 1438
#
LEFT-WALL au =ga : gyvename namo .
LEFT-WALL au =gtume ; einam namo !
LEFT-WALL au =gsiu : einame namo !
LEFT-WALL au =gai ; gyvename namo !
LEFT-WALL au =gdavo , einam namo .
LEFT-WALL au =gi ; einame namo !
LEFT-WALL au =game ; gyvename namo ?
LEFT-WALL au =gtų ; einame namo ?
LEFT-WALL au =gs : gyvenam namo !
LEFT-WALL au =gtume , einam namo ?
LEFT-WALL au =gs ; einam namo .
LEFT-WALL au =gam ; einam namo .
LEFT-WALL au =ga ; einam namo .
LEFT-WALL au =ga : gyvename namo ?
LEFT-WALL au =gate , einame namo .
LEFT-WALL au =gdavai , einame namo !
LEFT-WALL au =gtų , gyvename namo ?
LEFT-WALL au =gs : einam namo !
LEFT-WALL au =gi ; einame namo .
LEFT-WALL au =gdavome , einame namo !
# Bye.

Language "lt" is Lithuanian, yes? It surprises me to see the program default to Lithuanian when I am located in the United States with typical English locale settings:

% locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
% link-generator --version
Version: link-grammar-5.12.4
Compiled with: /usr/bin/clang __VERSION__="Apple LLVM 13.0.0 (clang-1300.0.29.30)"  
OS: darwin21.6.0 __APPLE__ __MACH__ 
Standards: __STDC_VERSION__=201112L 
Configuration (source code):
	CPPFLAGS=-I/opt/local/include -isysroot/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk
	CFLAGS=-D_DEFAULT_SOURCE -std=c11 -D_BSD_SOURCE -D_SVID_SOURCE -D_GNU_SOURCE -D_ISOC11_SOURCE -fvisibility=hidden -pipe -Os -isysroot/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -arch x86_64
Configuration (features):
	DICTIONARY_DIR=/opt/local/share/link-grammar
	-DPACKAGE_NAME="link-grammar" -DPACKAGE_TARNAME="link-grammar" -DPACKAGE_VERSION="5.12.4" -DPACKAGE_STRING="link-grammar 5.12.4" -DPACKAGE_BUGREPORT="https://github.com/opencog/link-grammar" -DPACKAGE_URL="https://opencog.github.io/link-grammar-website" -DPACKAGE="link-grammar" -DVERSION="5.12.4" -DHAVE_STDIO_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_STRINGS_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_UNISTD_H=1 -DSTDC_HEADERS=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=".libs/" -DYYTEXT_POINTER=1 -DHAVE_STRNDUP=1 -DHAVE_STRTOK_R=1 -DHAVE_SIGACTION=1 -DHAVE_ALIGNED_ALLOC=1 -DHAVE_POSIX_MEMALIGN=1 -DHAVE_ALLOCA_H=1 -DHAVE_ALLOCA=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -D__STDC_FORMAT_MACROS=1 -D__STDC_LIMIT_MACROS=1 -DTLS=_Thread_local -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_VISIBILITY=1 -DHAVE_LOCALE_T_IN_XLOCALE_H=1 -DHAVE_XLOCALE_H=1 -DHAVE_STDATOMIC_H=1 -DHAVE_MKLIT=1 -DUSE_SAT_SOLVER=1 -DUSE_WORDGRAPH_DISPLAY=1 -DHAVE_SQLITE3=1 -DHAVE_HUNSPELL=1 -DHUNSPELL_DICT_DIR="/Library/Spelling" -DHAVE_EDITLINE=1 -DHAVE_WIDECHAR_EDITLINE=1 -DHAVE_REGEX_H=1 -DHAVE_REGEXEC=1 -DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1
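
For illustration only, here is a minimal sketch of the behavior this report expects: deriving a default language code from the locale variables shown above. It is not how link-generator actually picks its default. The function name `language_from_locale`, the `available` set of installed dictionary languages, and the `en` fallback are all hypothetical.

```python
def language_from_locale(env, available, fallback="en"):
    """Pick a dictionary language code from POSIX-style locale variables.

    env       -- mapping of environment variables (e.g. os.environ)
    available -- set of language codes with an installed dictionary
                 (hypothetical; e.g. the subdirectories of DICTIONARY_DIR)
    fallback  -- code returned when the locale gives no usable answer
                 (an assumption of this sketch)
    """
    # POSIX precedence for messages: LC_ALL, then LC_MESSAGES, then LANG.
    # The first non-empty variable decides the locale.
    for var in ("LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(var, "")
        if value:
            # "en_US.UTF-8" -> "en": drop codeset, modifier, then territory.
            code = value.split(".")[0].split("@")[0].split("_")[0].lower()
            return code if code in available else fallback
    return fallback


# The locale from this report: LC_ALL is empty, LC_MESSAGES is en_US.UTF-8.
reported = {"LANG": "en_US.UTF-8", "LC_MESSAGES": "en_US.UTF-8", "LC_ALL": ""}
default_language = language_from_locale(reported, {"en", "lt"})
```

With the settings above, a lookup like this would select `en` rather than `lt`, provided an `en` dictionary is installed.
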
@ampli
Member

ampli commented Apr 21, 2024

You're right. However, in its current state, link-generator may be most useful with lt, since that language has a small dictionary. With en, generation is currently extremely slow for sentences of more than a few words. In the discussion section (or maybe in the issues), @linas suggested speeding it up with disjunct sampling. Efficiency fixes are needed too, and I still need to implement most of this. link-generator also lacks a useful API. Suggestions are welcome.
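
As a rough illustration of the general idea, and only one possible reading of "disjunct sampling", the sketch below keeps a random subset of at most `k` disjuncts per word so that fewer combinations have to be explored. The thread does not describe the actual proposal, and link-generator has no such function; the name `sample_disjuncts`, the data layout, and the toy disjunct strings are assumptions.

```python
import random


def sample_disjuncts(disjuncts_by_word, k, seed=None):
    """Keep at most k randomly chosen disjuncts for each word.

    disjuncts_by_word -- dict mapping a word to a list of its disjuncts
                         (hypothetical layout, not link-grammar's own)
    k                 -- maximum number of disjuncts to keep per word
    seed              -- optional seed for reproducible sampling
    """
    rng = random.Random(seed)
    sampled = {}
    for word, disjuncts in disjuncts_by_word.items():
        if len(disjuncts) <= k:
            # Small enough already: keep everything.
            sampled[word] = list(disjuncts)
        else:
            # Uniform sample without replacement.
            sampled[word] = rng.sample(disjuncts, k)
    return sampled


# Toy data; the disjunct strings are made up for illustration.
toy = {
    "einam": ["A+", "B- & C+", "D-", "E+ & F+"],
    "namo": ["G-"],
}
reduced = sample_disjuncts(toy, k=2, seed=0)
```

The trade-off is coverage: sampling shrinks the search space at the cost of missing sentences that need a discarded disjunct.
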
