Recurrent Neural Networks
RNNs are a good architecture for working with sequential data of the form x(1), ..., x(τ). They are better suited to long and variable-length sequences than perceptrons or feed-forward layers, because they can exploit the relationship between adjacent members of a sequence when making predictions. Their ability to retain information about previous states and to process long sequences makes RNNs a good option for time-series analysis and natural language processing.
RNNs process sequential data by defining a recurrence relation over time steps, typically of the form:

S_k = f(S_{k-1}, X_k; θ)

where S_k is the state at time step k, X_k is the input at that step, and θ is the set of parameters that the function f shares across all time steps. The final output of the network at a given time step is typically computed from one or more states. This structure allows the network to predict the next state S_{k+1} from the current state and input.
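To make the recurrence concrete, below is a minimal NumPy sketch that unrolls S_k = f(S_{k-1}, X_k; θ) over a short sequence. The tanh activation, random weights, and tiny dimensions are illustrative assumptions, not values from the cited tutorial.

```python
import numpy as np

# Minimal sketch of the recurrence S_k = f(S_{k-1}, X_k; theta).
# The tanh activation, random weights, and tiny dimensions are
# illustrative assumptions, not values from the cited tutorial.

rng = np.random.default_rng(0)
input_size, state_size, seq_len = 3, 4, 5

W_x = rng.normal(size=(input_size, state_size))    # input-to-state weights
W_rec = rng.normal(size=(state_size, state_size))  # state-to-state weights

x = rng.normal(size=(seq_len, input_size))  # input sequence x(1), ..., x(tau)
s = np.zeros(state_size)                    # initial state S_0

states = []
for k in range(seq_len):
    # Each new state depends on the previous state and the current input.
    s = np.tanh(x[k] @ W_x + s @ W_rec)
    states.append(s)

# A many-to-one output would be computed from the final state alone;
# a many-to-many output would read out from every state in `states`.
print(states[-1])
```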
In the graphical representation below, we "unfold" the formula given above. On the left is a compact view of the mathematical representation of an RNN; on the right, we walk through every state generated by a sequence:
Source for information and image: How to implement an RNN (1/2) - Minimal example
RNNs are great for tasks that require a many-to-one or many-to-many mapping. For example, a "character-level RNN" uses the individual characters in a given string as its input, rather than words or sentences. The network learns the underlying patterns that govern which character may follow which in a given language. In a many-to-one architecture, the RNN produces a single output, such as the class the string belongs to.
In many-to-many architectures, the network receives an input sequence of characters and generates an output sequence of characters: for example, mapping a string in a source language to a corresponding one in a target language.
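As a sketch of the many-to-one case, here is a minimal PyTorch character-level classifier in the spirit of the notebook referenced below. The class name, layer sizes, and dummy input are illustrative assumptions, not code taken from the notebook.

```python
import torch
import torch.nn as nn

# Sketch of a many-to-one character-level RNN classifier. The name
# CharClassifier and all sizes below are illustrative assumptions.

class CharClassifier(nn.Module):
    def __init__(self, n_chars, hidden_size, n_classes):
        super().__init__()
        self.rnn = nn.RNN(n_chars, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, n_chars) one-hot encoded characters
        _, h_n = self.rnn(x)       # h_n: final hidden state (1, batch, hidden)
        return self.out(h_n[-1])   # one prediction per sequence (many-to-one)

n_chars, hidden_size, n_classes = 57, 128, 18
model = CharClassifier(n_chars, hidden_size, n_classes)

# One-hot encode a single 6-character string (dummy indices for illustration).
seq = torch.zeros(1, 6, n_chars)
for t, idx in enumerate([3, 7, 11, 2, 0, 5]):
    seq[0, t, idx] = 1.0

logits = model(seq)          # shape: (1, n_classes)
print(logits.argmax(dim=1))  # predicted class for the whole string
```

Reading the prediction off the final hidden state is what makes this many-to-one; a many-to-many variant would instead apply the output layer to the hidden state at every time step.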
Source: Going Under the Hood of Character-Level RNNs: A NumPy-based Implementation Guide
We will move to this Python notebook for our demonstration: pytorch_char_rnn_classification_tutorial.ipynb
UArizona DataLab, Data Science Institute, 2024