LazyLanger

LazyLanger is a tool that extracts the most common words from a show or movie subtitle file and sorts them by number of appearances. It is useful when learning a new language, because it surfaces the words that are used most often.

ToDo

  • Read a text file (subtitle) or multiple text files and merge into one file
  • Remove lines that start with numbers
  • Convert commas, dots, ? and ! into spaces
  • Split all words by space
  • Sort words by number of occurrences
  • Return JSON containing the list of words and their number of occurrences
  • Reject single character words <-- I'm too lazy at the moment for this one
  • Create UI, a form with upload field, nothing special
  • Display result in a table after sorting
  • Create CSV file of that table on demand (button to create a file)
  • Make each word start with a capital letter
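
The text-processing items above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the README does not say which language LazyLanger is written in, and the function names (`count_words`, `to_json`, `to_csv`), the `min_length` parameter, and the JSON and CSV layouts are all assumptions made for the example.

```python
import csv
import io
import json
import re
from collections import Counter


def count_words(text, min_length=1):
    """Return (word, count) pairs, most frequent first.

    Hypothetical helper: the name and the min_length parameter are
    assumptions for this sketch, not part of LazyLanger itself.
    """
    words = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that start with a number
        # (subtitle cue indexes and timestamps).
        if not stripped or stripped[0].isdigit():
            continue
        # Convert comma, dot, ? and ! into spaces, then split on whitespace.
        cleaned = re.sub(r"[,.?!]", " ", stripped)
        for word in cleaned.split():
            if len(word) >= min_length:
                # Capitalizing also merges "hello" and "Hello" into one entry.
                words.append(word.capitalize())
    return Counter(words).most_common()


def to_json(pairs):
    """Serialize the pairs as a JSON list (the layout is an assumption)."""
    return json.dumps(
        [{"word": word, "count": count} for word, count in pairs],
        ensure_ascii=False,
    )


def to_csv(pairs):
    """Serialize the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["word", "count"])
    writer.writerows(pairs)
    return buffer.getvalue()


sample = (
    "1\n"
    "00:00:01,000 --> 00:00:02,000\n"
    "Hello, world!\n"
    "\n"
    "2\n"
    "00:00:03,000 --> 00:00:04,000\n"
    "hello again. A world?\n"
)
pairs = count_words(sample, min_length=2)
print(to_json(pairs))
```

To handle several subtitle files, one option is to join their contents with newlines and pass the result to `count_words`. Setting `min_length=2` covers the "reject single character words" item; leaving it at `1` keeps every word.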