hfst-ospell -v does not list correct metadata #32

Closed
albbas opened this issue Sep 5, 2017 · 7 comments

@albbas

albbas commented Sep 5, 2017

OS: KDE neon User Edition 5.10, based on Ubuntu 16.04

Installed packages:

hfst-ospell  0.4.5~r344-0ubuntu1~xenial1
giella-sme  0.0.20150917~r156539-1~sid1

For some reason the executable installed in the hfst-ospell package does not display the correct info about the loaded speller package.

Executable from the package:

/usr/bin/hfst-ospell -v /usr/share/voikko/3/se.zhfst
Following metadata was read from ZHFST archive:
locale: und
version:  [vcsrev: ]
date:
producer: [email: <>, website: <>]

From a self-built hfst-ospell:

Output from the shell script in the hfst-ospell directory:

hfst-ospell $ ./hfst-ospell -v /usr/share/voikko/3/se.zhfst
Following metadata was read from ZHFST archive:
locale: se
version: GT_VERSION [vcsrev: GT_REVISION]
date: DATE
producer: Giellatekno/Divvun/UiT contributors[email: <[email protected]>, website: <http://divvun.no>]
title [fi]: Pohjoissaamen oikoluku
title [nb]: Nordsamisk stavekontroll
title [se]: Davvisámi čállindárkisteaddji
title [sma]: Noerhtesaemien staeriedimmiedïrregh
title [smj]: Nuorttasáme duollatjállemdárkastus
title [sv]: Nordsamisk rättstavning
description [se]: This is an fst-based speller for Northern Sámi made by
    Divvun/Giellatekno/UiT. It is based
    on the normative subset of the morphological analyser for Northern Sámi.
    The source code can be found at:
    https://victorio.uit.no/langtech/trunk/langs/sme/
    License: GPL3+.
acceptor[default.] [id: acceptor.default.hfst, type: generaltrtype: ]
title [se]: Giellatekno/Divvun/UiT dictionary Northern Sámi
description[se]: Giellatekno/Divvun/UiT dictionary for
    Northern Sámi compiled for HFST.
errmodel[default.] [id: errmodel.default.hfst]
title [se]: Levenshtein edit distance transducer
description[se]: Correction model for keyboard misstrokes, at most 2 per
    word.
type: default
model: errormodel.default.hfst

Output from the executable found in .libs:

hfst-ospell $ .libs/hfst-ospell -v /usr/share/voikko/3/se.zhfst
Following metadata was read from ZHFST archive:
locale: und
version:  [vcsrev: ]
date:
producer: [email: <>, website: <>]
@albbas
Author

albbas commented Sep 5, 2017

After I reported this behaviour, I installed libtinyxml2 to check whether hfst-ospell would work better with that library than with the default libxml++:

sudo apt install libtinyxml-dev
sudo apt install libtinyxml2-dev
./configure --with-tinyxml2 --without-libxmlpp
make -j

and now .libs/hfst-ospell shows the correct metadata.

But now hfst-ospell from the package also shows the correct metadata:

hfst-ospell $ /usr/bin/hfst-ospell -v /usr/share/voikko/3/se.zhfst
Following metadata was read from ZHFST archive:
locale: se
version: GT_VERSION [vcsrev: GT_REVISION]
date: DATE
producer: Giellatekno/Divvun/UiT contributors[email: [email protected], website: http://divvun.no]
title [fi]: Pohjoissaamen oikoluku
title [nb]: Nordsamisk stavekontroll
title [se]: Davvisámi čállindárkisteaddji
title [sma]: Noerhtesaemien staeriedimmiedïrregh
title [smj]: Nuorttasáme duollatjállemdárkastus
title [sv]: Nordsamisk rättstavning
description [se]: This is an fst-based speller for Northern Sámi made by
Divvun/Giellatekno/UiT. It is based
on the normative subset of the morphological analyser for Northern Sámi.
The source code can be found at:
https://victorio.uit.no/langtech/trunk/langs/sme/
License: GPL3+.
acceptor[default.] [id: acceptor.default.hfst, type: generaltrtype: ]
title [se]: Giellatekno/Divvun/UiT dictionary Northern Sámi
description[se]: Giellatekno/Divvun/UiT dictionary for
Northern Sámi compiled for HFST.
errmodel[default.] [id: errmodel.default.hfst]
title [se]: Levenshtein edit distance transducer
description[se]: Correction model for keyboard misstrokes, at most 2 per
word.
type: default
model: errormodel.default.hfst

I tried uninstalling and purging hfst-ospell and then installing it again, and it still shows the correct metadata.

@albbas
Author

albbas commented Sep 5, 2017

And voikkospell also reports the correct info now:
voikkospell -l
Unknown file in archive ./._acceptor.default.hfst
Unknown file in archive ./._errmodel.default.hfst
Unknown file in archive ./._index.xml
se-x-standard: Giellatekno/Divvun/UiT fst-based speller for Northern Sami

voikkospell reported und before I began looking at the above issue.

@TinoDidriksen
Member

The Debian/Ubuntu packages are built entirely without XML support, and thus cannot display any such information. I disabled XML because of #21 and #22 - there is no supported XML library that'll work on all supported platforms.

And the XML didn't seem meaningful, or it would be a runtime requirement - the zhfst files work perfectly fine without the XML.

If the XML is meaningful, I guess I'll have to write a tiny fallback parser to extract at least the ISO 639 codes.
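
A minimal sketch of what such a fallback could look like, written in Python for brevity rather than in the project's own language. It is not the fix that was later committed. It assumes the .zhfst file is a zip archive, that the metadata sits in a member named index.xml (the name seen in the voikkospell output in this thread), and that the language code is wrapped in a <locale> element; the element name and the function name read_locale are assumptions made for illustration.

```python
import io
import re
import zipfile

# Assumption: the language code is stored as <locale>xx</locale> in index.xml.
LOCALE_RE = re.compile(
    r"<locale>\s*([A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*)\s*</locale>"
)


def read_locale(archive, default="und"):
    """Return the locale declared in a ZHFST archive, or `default`.

    `archive` is a path or a binary file-like object. No XML library is
    involved: the first <locale> element matched by the regex wins.
    """
    try:
        with zipfile.ZipFile(archive) as zf:
            # Compare base names so "./index.xml" matches, while resource
            # fork entries such as "./._index.xml" are ignored.
            names = [
                n for n in zf.namelist()
                if n.rsplit("/", 1)[-1] == "index.xml"
            ]
            if not names:
                return default
            text = zf.read(names[0]).decode("utf-8", errors="replace")
    except (OSError, zipfile.BadZipFile):
        # Unreadable or non-zip input: report the undetermined code.
        return default
    match = LOCALE_RE.search(text)
    return match.group(1) if match else default
```

Calling read_locale("/usr/share/voikko/3/se.zhfst") would then return either the declared code or und, the same placeholder the XML-less build prints. A regex is fragile compared with a real parser (it ignores comments, CDATA and namespaces), which is the trade-off a tiny fallback accepts.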

@albbas
Author

albbas commented Sep 5, 2017

Of course I had a locally built voikkospell in /usr/local/bin. Using voikkospell provided by the package, I get this report now:

/usr/bin/voikkospell -l
fi-x-standard: suomi (perussanasto)
se-x-standard: Davvisámi čállindárkisteaddji
sma-x-standard: Åarjelsaemien staeriedimmiedïrregh
smj-x-standard: Julevsáme duollatjállemdárkastus

I don't know if the voikkospell executable provided by the package behaved like the locally built one did.

@snomos
Member

snomos commented Sep 6, 2017

XML support is crucial when integrating with libvoikko. It is the means by which we communicate to libvoikko the languages available for spell checking. As demonstrated in this bug report, hfst-ospell is just broken without it.

I don't know about #21, but #22 should be easy to fix.

@TinoDidriksen
Member

I think this was fixed ages ago in commit dad7c5c - is this still a problem?

TinoDidriksen self-assigned this Sep 20, 2018
@TinoDidriksen
Member

Nobody said anything for a year - reopen if this wasn't actually fixed.
