LogonTop

StephanOepen edited this page Nov 30, 2008 · 26 revisions

Overview

The LOGON infrastructure (and source tree) is a collection of software, grammars, and other linguistic resources to facilitate experimentation with transfer-based machine translation (MT). To a large degree, the LOGON tree packages resources that exist independently, specifically the core of the open-source [http://www.delph-in.net DELPH-IN] toolchain and several of the DELPH-IN grammars. These include, among others, the [http://www.delph-in.net/lkb LKB], [http://www.delph-in.net/pet PET], and itsdb software systems, and the [http://www.delph-in.net/erg LinGO ERG], [http://www.delph-in.net/gg GG], [http://www.delph-in.net/jacy JaCY], and [http://www.delph-in.net/srg SRG] broad-coverage grammars for English, German, Japanese, and Spanish, respectively. Additionally, the tree includes pre-compiled versions of other packages, for example [http://chasen.aist-nara.ac.jp/chasen/distribution.html.en ChaSen] (for Japanese pre-processing), the CMU Language Modeling Toolkit ([http://www.speech.cs.cmu.edu/SLM_info.html SLM]), [http://garraf.epsevg.upc.es/freeling/ FreeLing] (Spanish pre-processing), [http://tadm.sourceforge.net/ TADM] (for MaxEnt experimentation), [http://www.coli.uni-saarland.de/~thorsten/tnt/ TnT] (for English PoS tagging), and [http://www.coli.uni-saarland.de/projects/chorus/utool/ UTool] (for MRS manipulation).

The LOGON tree was originally developed by the Norwegian [http://www.emmtee.net LOGON] and [http://www.emmtee.net/index.php?page=7 HandOn] research projects, working on quality-oriented translation from Norwegian to English. For Norwegian analysis, these projects employed (an extended version of) the [http://maximos.aksis.uib.no/Aksis-wiki/Oslo-Bergen_Tagger Oslo-Bergen Tagger] (OBT) and the [http://www.hf.uib.no/i/LiLi/SLF/Dyvik/norgram/ NorGram] LFG implementation. Both resources are released under open-source licenses as part of the LOGON tree. However, to actually run the Norwegian–English instantiation of the system (dubbed NoEn), the proprietary [http://www2.parc.com/isl/groups/nltt/xle/ XLE] LFG system and a commercially licensed bilingual dictionary (dubbed KF) are required; neither can be part of the freely available LOGON distribution. Please see the LogonExtras page for instructions on how to install proprietary add-ons to the LOGON tree, e.g. for sites holding valid XLE and KF licenses. To see the Norwegian–English LOGON system at work, there is an [http://noen.emmtee.net on-line interface] to the MT demonstrator.

In subsequent collaborations between the original LOGON developers and DELPH-IN researchers in Germany, Japan, and the USA, additional language pairs were added. As of late 2008, these include German–English and Japanese–English (and, albeit less developed, the inverse directions of translation), as well as a battery of 'baby' MT systems built from a collection of smaller grammars based on the LinGO [http://www.delph-in.net/matrix Grammar Matrix]. In a sense, the LOGON tree functions much like a Linux distribution: it combines a complex set of individual components, aiming to provide ease of installation and a certain degree of uniformity, inter-operability, and quality assurance. The system is available exclusively for Linux (on 32-bit or 64-bit x86 architectures). As of November 2008, all system development and distribution is through the [http://subversion.tigris.org/ Subversion] (SVN) revision management system. Please see the LogonInstallation page for details. Regrettably, only a very limited amount of documentation is available, a property that the LOGON tree shares with a number of the core DELPH-IN resources. The LogonReports page summarizes the documentation misery as of late 2008.

Table of Contents

The following is a list of topics for which at least some documentation exists. Feel free to add further materials, but please make sure to create adequate wiki names for new pages, typically prefixed with Logon where they pertain to specifics of the LOGON infrastructure.

  • LogonInstallation: System Requirements, Download and Installation Notes

  • LogonProcessing: Documentation of Various Batch Processing Facilities

  • LogonModeling: Information on Training and Applying Various Statistical Models

  • LogonOnline: Instructions on Creating On-Line, Web-Accessible Demonstrators

  • LogonWishlist: Feature Requests Contributed by LOGON Co-Developers and Users

Background Materials

Further information on the LOGON software and consortium can be found at the [http://www.emmtee.net/ project web site]; the following publications provide an overview of most of the core pieces:

  • Stephan Oepen, Erik Velldal, Jan Tore Lønning, Paul Meurer, Victoria Rosén, and Dan Flickinger (2007).

    [http://share.emmtee.net/pub/bscw.cgi/d64459/tmi07.pdf Towards hybrid quality-oriented machine translation. On linguistics and probabilities in MT]. In Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, pp. 144–153. Skövde, Sweden.

  • Stephan Oepen, Helge Dyvik, Jan Tore Lønning, Erik Velldal, Dorothee Beermann, John Carroll, Dan Flickinger, Lars Hellan, Janne Bondi Johannessen, Paul Meurer, Torbjørn Nordgård, and Victoria Rosén (2004).

    [http://share.emmtee.net/pub/bscw.cgi/d23044/tmi04.pdf Som å kapp-ete med trollet? Towards MRS-based Norwegian-English Machine Translation]. In Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, pp. 11–20. Baltimore, MD.

The first paper discussing the use of Minimal Recursion Semantics in machine translation is:

  • Ann Copestake, Dan Flickinger, Rob Malouf, Susanne Riehemann and Ivan Sag (1995).

    [http://www.cl.cam.ac.uk/~aac10/papers/tmi95.ps.gz Translation using Minimal Recursion Semantics]. In Proceedings of the Sixth International Conference on Theoretical and Methodological Issues in Machine Translation, pp. 15–32. Leuven, Belgium.

An example of the extension of the LOGON machinery to a new language pair can be seen in

For additional information, there is an archived [http://lists.emmtee.net/mailman/listinfo/logon mailing list] for the LOGON tree. For further questions, please feel free to contact Stephan Oepen (oe at ifi.uio.no), the technical manager for the original Norwegian LOGON consortium.
