Simple RNN Language Model

A simple RNN language model that uses LSTM units and word-level tokens.
The model was trained on two of Jack London's novels (freely available from Project Gutenberg) with the following setup:

360 epochs (with increasing batch size);
650 KB of text (~93k sentences);
10k words in vocabulary;
a 5-word prompt;
generating a 200-word text.
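
The word-level setup listed above (a capped vocabulary and a 5-word prompt) can be illustrated with a minimal sketch of how training examples for such a model are commonly built. This is not taken from the repository: the helper names `build_vocab` and `make_training_pairs`, the use of id 0 for out-of-vocabulary words, and the sliding-window scheme are all assumptions made for illustration.

```python
from collections import Counter


def build_vocab(words, max_size):
    """Map the most frequent words to ids 1..max_size-1 (id 0 = unknown word).

    Hypothetical helper; the repository may build its dictionary differently.
    """
    counts = Counter(words)
    most_common = [word for word, _ in counts.most_common(max_size - 1)]
    return {word: index + 1 for index, word in enumerate(most_common)}


def make_training_pairs(words, vocab, context_length=5):
    """Slide a window over the text: each run of context_length words predicts the next word."""
    ids = [vocab.get(word, 0) for word in words]  # unknown words fall back to id 0
    pairs = []
    for start in range(len(ids) - context_length):
        context = ids[start:start + context_length]
        target = ids[start + context_length]
        pairs.append((context, target))
    return pairs


# Toy text standing in for the novels.
words = "the dog ran and the dog sat and the dog slept".split()
vocab = build_vocab(words, max_size=10_000)
pairs = make_training_pairs(words, vocab, context_length=5)
```

Each pair holds five word ids as input and the id of the sixth word as the prediction target, which mirrors the 5-word prompt used at generation time.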

Getting Started

Requirements: TensorFlow 2.12.1+ and Python 3.9.17+.

Create an instance of the language model based on your choice of RNN architecture.

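# Example for creating an instance of WordRNN
# (assumes the WordRNN class is defined in word_rnn.py in this repository)
from word_rnn import WordRNN

language_model = WordRNN(dictionary_size=10_000, sentence_length=5)
language_model.compile_model()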

Load the trained model to generate text.

# Generating text for unknown prompts
language_model.generate('I scarcely know where to begin... ', temperature=1.5)
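
The `temperature` argument in the call above usually controls how adventurous the sampling is. The sketch below shows the common temperature-scaling technique in plain Python; whether `WordRNN.generate` works exactly this way is an assumption, and `sample_with_temperature` is a hypothetical helper, not part of the repository.

```python
import math
import random


def sample_with_temperature(probabilities, temperature=1.0, rng=random):
    """Rescale a next-word distribution by temperature and sample one index.

    Values above 1.0 flatten the distribution (more surprising words);
    values below 1.0 sharpen it (more predictable words).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide log-probabilities by the temperature, then renormalise (softmax).
    logits = [math.log(max(p, 1e-12)) / temperature for p in probabilities]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    scaled = [weight / total for weight in weights]
    index = rng.choices(range(len(scaled)), weights=scaled, k=1)[0]
    return index, scaled


# Toy distribution over a three-word vocabulary.
next_word_probs = [0.7, 0.2, 0.1]
_, hot = sample_with_temperature(next_word_probs, temperature=1.5)
_, cold = sample_with_temperature(next_word_probs, temperature=0.5)
```

With a temperature of 1.5 the most likely word loses probability mass to the less likely ones, which is why high temperatures tend to produce more varied but less grammatical text.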

Examples

Explore the capabilities of WordRNN with the provided examples:

Test the model's accuracy when given a known prompt (it will reproduce the chosen part of one of the novels).
Sample text for known prompts (it will combine sentences and phrases from the novels).
Generate text for unknown prompts to witness some creative language generation (it will generate completely new phrases or combine the learnt ones to create a new sentence).
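
For the first example, one simple way to quantify how well the model reproduces a known passage is a position-by-position word match. This is a minimal sketch under the assumption that the generated text and the reference passage are both available as plain strings; `word_accuracy` is a hypothetical helper and the repository may measure accuracy differently.

```python
def word_accuracy(generated, reference):
    """Fraction of reference words reproduced at the same position in the generated text."""
    generated_words = generated.split()
    reference_words = reference.split()
    if not reference_words:
        return 0.0
    matches = sum(g == r for g, r in zip(generated_words, reference_words))
    return matches / len(reference_words)


# Placeholder strings standing in for model output and the original passage.
score = word_accuracy("a b c d", "a b x d")
```

Words missing from a shorter generated text count as errors, because the score is divided by the length of the reference.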

The generated text may not convey a meaningful message, but the model has clearly learned the basics of forming coherent phrases: in parts of the sentences it keeps the grammatical structure and word relationships correct, for instance by combining articles with nouns and matching pronouns with verbs. Two limitations are worth noting. The training dataset is relatively small, and LSTM units are weaker than transformers at capturing dependencies between words that are far apart in a sentence.

Architecture

Use the show_structure method to visualise the architecture of the model, giving you insights into its layers and structure.

Repository: TairYerniyazov/SimpleLanguageModel
