Project Extended From http://www-i6.informatik.rwth-aachen.de/web/Software/g2p.html based on Bisani (2008)
KaranKamath/SequiturG2P
Fixes And Enhancements:
=======================

- http://www.voxforge.org/home/forums/message-boards/audio-discussions/sequitur-g2p--symbol-out-of-range-error/2
- A confusion matrix is now printed at the end to help analyse errors.
- For data with single-digit tone indicators on phonemes, --tones extracts
  and prints the confusion matrix for tones.

Original README:
================

Sequitur G2P
============

A trainable Grapheme-to-Phoneme converter.

Introduction
------------

Sequitur G2P is a data-driven grapheme-to-phoneme converter written at
RWTH Aachen University by Maximilian Bisani.

The method used in this software is described in

   M. Bisani and H. Ney: "Joint-Sequence Models for Grapheme-to-Phoneme
   Conversion". Speech Communication, Volume 50, Issue 5, May 2008,
   Pages 434-451

   (available online at http://dx.doi.org/10.1016/j.specom.2008.01.002)

This software is made available to you under the terms of the GNU General
Public License. It can be used for experimentation and as part of other
free software projects. For details see the licensing terms below.

If you publish about work that involves the use of this software,
please cite the above paper. (You should feel obliged to do so by
rules of good scientific conduct.)

You may contact the author with any questions or comments via e-mail:
[email protected]

License
-------

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License Version 2 (June
1991) as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, you will find it at
http://www.gnu.org/licenses/gpl.html, or write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110, USA.

Should a provision of no. 9 and 10 of the GNU General Public License be
invalid or become invalid, a valid provision is deemed to have been agreed
upon which comes closest to what the parties intended commercially. In any
case guarantee/warranty shall be limited to gross negligent actions or
intended actions or fraudulent concealment.

Installing
----------

To build and use this software you need to have the following parts
installed:
- Python (http://www.python.org), tested with 2.5
- SWIG (http://www.swig.org), tested with 1.3.31
- NumPy (http://numpy.scipy.org), tested with 1.0.4
- a C++ compiler that is recognized by Python's distutils,
  tested with GCC 4.1, 4.2 and 4.3

To install, change to the source directory and type:

    python setup.py install --prefix /usr/local

You may substitute /usr/local with some other directory. If you do so,
make sure that some-other-directory/lib/python2.5/site-packages/ is in
your PYTHONPATH, e.g. by typing

    export PYTHONPATH=some-other-directory/lib/python2.5/site-packages

Using
-----

Sequitur G2P is a data-driven grapheme-to-phoneme converter. Actually,
it can be applied to any monotone sequence translation problem, provided
the source and target alphabets are small (fewer than 255 symbols).
Data-driven means that you need to train it with example pronunciations.
It has no built-in linguistic knowledge whatsoever, which means that it
should work for any alphabetic language. Training takes a pronunciation
dictionary and creates a model file. The model file can then be used to
transcribe words that were not in the dictionary.

Here is a step-by-step guide to get you started:

1. Obtain a pronunciation dictionary for training.
   The format is one word per line.
   Each line contains the orthographic form of the word followed by the
   corresponding phonemic transcription. The word and all phonemes must be
   separated by white space, so word and phoneme symbols may not contain
   blanks. We'll assume your training lexicon is called train.lex, and
   that you set aside some portion for testing purposes as test.lex, which
   is disjoint from train.lex.

2. Train a model.
   To create a first model, type:

       g2p.py --train train.lex --devel 5% --write-model model-1

   This first model will be rather poor because it is only a unigram.
   To create higher order models you need to run g2p.py again:

       g2p.py --model model-1 --ramp-up --train train.lex --devel 5% --write-model model-2

   Repeat this a couple of times:

       g2p.py --model model-2 --ramp-up --train train.lex --devel 5% --write-model model-3
       g2p.py --model model-3 --ramp-up --train train.lex --devel 5% --write-model model-4
       ...

3. Evaluate the model.
   To find out how accurately your model can transcribe unseen words, type:

       g2p.py --model model-6 --test test.lex

4. Transcribe new words.
   Prepare a list of words you want to transcribe as a simple text file
   words.txt with one word per line (and no phonemic transcription),
   then type:

       g2p.py --model model-3 --apply words.txt

Random comments:
- Whenever a file name is required, you can specify "-" to mean
  standard input or standard output.
- If a file name ends in ".gz", it is assumed that the file is (or
  should be) compressed using gzip.
- For the time being you need to type g2p.py --help and/or read the
  source to find out the other things g2p.py can do. Sorry about that.
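
The lexicon format from step 1 and the requirement that test.lex be disjoint from train.lex can be checked with a short script. The following is a minimal sketch in modern Python 3, not part of Sequitur G2P. The function names are hypothetical, and the sketch assumes that each character of the word counts as one grapheme, that blank lines can be skipped, and that a word without phonemes is an error.

```python
import random


def parse_lexicon(lines):
    """Parse 'word phoneme phoneme ...' lines into (word, phonemes) pairs."""
    entries = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # any run of white space separates symbols
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) < 2:
            raise ValueError(f"line {number}: no phonemes after {fields[0]!r}")
        entries.append((fields[0], fields[1:]))
    return entries


def split_by_word(entries, test_fraction=0.1, seed=0):
    """Split entries so that no word occurs in both train and test."""
    words = sorted({word for word, _ in entries})
    random.Random(seed).shuffle(words)
    n_test = max(1, int(len(words) * test_fraction))
    test_words = set(words[:n_test])
    train = [e for e in entries if e[0] not in test_words]
    test = [e for e in entries if e[0] in test_words]
    return train, test


def alphabet_sizes(entries):
    """Count distinct graphemes (characters) and distinct phoneme symbols."""
    graphemes = {ch for word, _ in entries for ch in word}
    phonemes = {p for _, phones in entries for p in phones}
    return len(graphemes), len(phonemes)
```

Splitting by word rather than by line keeps alternative pronunciations of the same word on one side of the split. The counts from alphabet_sizes can be compared against the limit of fewer than 255 symbols stated above.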
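
The repeated ramp-up calls in step 2 follow a fixed pattern, so they can be generated instead of typed. This is a sketch only: ramp_up_commands is a hypothetical helper, it uses just the g2p.py options shown in step 2, and it prints the commands rather than running them. How many models are worth training is not stated here, so n_models is an arbitrary default.

```python
import shlex


def ramp_up_commands(train="train.lex", devel="5%", n_models=4, prefix="model-"):
    """Return the g2p.py argument lists for models 1..n_models."""
    commands = []
    for order in range(1, n_models + 1):
        argv = ["g2p.py"]
        if order > 1:
            # Every model after the first starts from the previous one.
            argv += ["--model", f"{prefix}{order - 1}", "--ramp-up"]
        argv += ["--train", train, "--devel", devel,
                 "--write-model", f"{prefix}{order}"]
        commands.append(argv)
    return commands


# Print the commands so they can be reviewed before running them.
for argv in ramp_up_commands():
    print(shlex.join(argv))
```

To execute the commands instead of printing them, each argument list can be passed to subprocess.run(argv, check=True), assuming g2p.py is on your PATH.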