Phil Paradis edited this page Sep 15, 2016 · 5 revisions

Here are some of the suggestions I've received:

  • Instead of opening and closing files for each search, open them all at startup.

  • Instead of working with strings directly, use a fixed English dictionary in which each word corresponds to a number or index. Each sentence then becomes a list of numbers. Multiple inflected forms of the same word can share a number, for example (is, are) or (dog, dogs). This makes text processing easier and much faster. See the bag-of-words model: https://en.wikipedia.org/wiki/Bag-of-words_model
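
The word-to-index suggestion could look roughly like the Python sketch below. It is only an illustration, not code from this project: `BASE_FORMS`, `build_vocabulary`, and `encode` are hypothetical names, and the tiny hand-written table stands in for a real dictionary or lemmatizer.

```python
# Hypothetical table mapping inflected forms to a base form. A real system
# would use a full dictionary or a lemmatizer instead of two entries.
BASE_FORMS = {"are": "is", "dogs": "dog"}


def build_vocabulary(words):
    """Assign each distinct base form a small integer index."""
    vocabulary = {}
    for word in words:
        base = BASE_FORMS.get(word, word)
        if base not in vocabulary:
            vocabulary[base] = len(vocabulary)
    return vocabulary


def encode(sentence, vocabulary, unknown=-1):
    """Turn a sentence into a list of word indices (unknown words get -1)."""
    indices = []
    for word in sentence.lower().split():
        base = BASE_FORMS.get(word, word)
        indices.append(vocabulary.get(base, unknown))
    return indices


vocabulary = build_vocabulary(["the", "dog", "dogs", "is", "are", "happy"])
singular = encode("The dog is happy", vocabulary)
plural = encode("The dogs are happy", vocabulary)
```

Here `singular` and `plural` come out as the same list of integers, because "dog"/"dogs" and "is"/"are" share an index. Later comparisons then work on small integers rather than on strings.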