TairYerniyazov/SimpleLanguageModel

Simple RNN Language Model

A fairly simple RNN that uses LSTM units as its main building blocks and words as tokens.
The model was trained on two of Jack London's novels (freely available from Project Gutenberg):

  • 360 epochs (with increasing batch size);
  • 650 KB of text (~93k sentences);
  • a 10k-word vocabulary;
  • a 5-word prompt;
  • 200 words of generated text.
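
The last two settings describe word-by-word generation from a fixed-size context. The sketch below illustrates one way such a loop can work; it is not taken from the repository. The names `generate_words`, `predict_next` and `echo_oldest` are hypothetical, and a toy function stands in for the trained network.

```python
from collections import deque
from typing import Callable, List


def generate_words(prompt: List[str],
                   predict_next: Callable[[List[str]], str],
                   window: int = 5,
                   n_words: int = 200) -> List[str]:
    """Extend `prompt` by `n_words` words, one word at a time."""
    if len(prompt) < window:
        raise ValueError(f"prompt must contain at least {window} words")
    # Only the last `window` words are visible to the predictor at each step.
    context = deque(prompt[-window:], maxlen=window)
    output = list(prompt)
    for _ in range(n_words):
        word = predict_next(list(context))
        output.append(word)
        context.append(word)  # the oldest word drops out automatically
    return output


# Toy stand-in for the trained network (an assumption for illustration):
# it simply returns the oldest word in the current window.
def echo_oldest(context: List[str]) -> str:
    return context[0]


prompt = "i scarcely know where to".split()
text = generate_words(prompt, echo_oldest)
print(len(text))            # prompt words plus generated words
print(" ".join(text[:12]))
```

In a real setup, `predict_next` would wrap the model's prediction for the next word given the encoded window.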

Getting Started

Requirements: TensorFlow 2.12.1+ and Python 3.9.17+.

  1. Create an instance of the language model based on your choice of RNN architecture.
# Example for creating an instance of WordRNN
language_model = WordRNN(dictionary_size=10_000, sentence_length=5)
language_model.compile_model()
  2. Load the trained model to generate text.
# Generating text for unknown prompts
language_model.generate('I scarcely know where to begin... ', temperature=1.5)
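
The README does not show how `generate` applies `temperature`. A common technique, sketched here under that assumption, is to rescale the predicted word probabilities before sampling: values above 1 flatten the distribution (more varied text), values below 1 sharpen it. The helper names `apply_temperature` and `sample_index` are hypothetical and not part of WordRNN.

```python
import math
import random
from typing import List, Optional


def apply_temperature(probs: List[float], temperature: float) -> List[float]:
    """Rescale a probability distribution; higher temperature flattens it."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space: equivalent to p_i ** (1 / T), then renormalise.
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs: List[float],
                 temperature: float = 1.0,
                 rng: Optional[random.Random] = None) -> int:
    """Draw one index (e.g. a word id) from the rescaled distribution."""
    rng = rng or random.Random()
    scaled = apply_temperature(probs, temperature)
    return rng.choices(range(len(scaled)), weights=scaled, k=1)[0]


probs = [0.7, 0.2, 0.1]  # made-up next-word probabilities
print(apply_temperature(probs, 1.0))   # unchanged up to rounding
print(apply_temperature(probs, 1.5))   # flatter: the top word loses mass
print(sample_index(probs, temperature=1.5, rng=random.Random(0)))
```

With `temperature=1.5`, as in the call above, less likely words are picked more often than the raw probabilities would suggest.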

Examples

Explore the capabilities of WordRNN with the provided examples:

  • Test the model's accuracy on a known prompt (it reproduces the corresponding passage from one of the novels).
  • Sample text for known prompts (it combines sentences and phrases from the novels).
  • Generate text for unknown prompts to see some creative language generation (it produces entirely new phrases or combines learnt ones into new sentences).

[Image: Structure of the model (layers)]

While the generated text may not convey a meaningful message, the model has clearly learned the basics of forming coherent phrases: in parts of its sentences it keeps the grammatical structure and word relationships correct, for instance by pairing articles with nouns and matching pronouns with verbs. Note, however, that the training dataset is relatively small. In addition, LSTM units are weaker than transformers at capturing relationships between words that are far apart in a sentence.

Architecture

Use the show_structure method to visualise the model's architecture and inspect its layers.

[Image: Structure of the model (layers)]
