ML2-Poincare-Embedding

This repository is a simple implementation of the paper "Poincaré Embeddings for Learning Hierarchical Representations" by Maximilian Nickel and Douwe Kiela of Facebook AI Research.

Summary

The paper introduces a model that learns vector representations of the nodes in a graph. It takes as input a list of relations between nodes, such as:

	dataset = [['banana', 'fruit'], ['eatable_fruit', 'fruit']]

The model then learns a vector for each node such that the distance between two nodes' vectors reflects how close the nodes are in the graph.

The novelty of the paper is that it learns hierarchical representations by embedding the nodes into hyperbolic space, or more precisely into an n-dimensional Poincaré ball. The authors argue that hyperbolic space captures hierarchy and similarity between nodes better than the commonly used Euclidean space. For more background on what follows, please refer to the paper.

Distance Function

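The model calculates the distance between two nodes' vectors with the following equation:

$$d(u, v) = \operatorname{arcosh}\left(1 + 2\,\frac{\lVert u - v \rVert^2}{(1 - \lVert u \rVert^2)(1 - \lVert v \rVert^2)}\right)$$

Where:

u, v are the multi-dimensional vectors of any two words in the dataset, and ‖·‖ is the Euclidean norm.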

Distances within the Poincaré ball change smoothly with respect to the locations of u and v. This locality property of the Poincaré distance is key to finding continuous embeddings of hierarchies. Near the boundary of the ball, two nodes that are close in Euclidean terms can still be far apart in Poincaré distance, because the distance grows rapidly as the norm of u or v approaches 1.
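
To make the behaviour near the boundary concrete, here is a minimal sketch of the Poincaré distance in plain Python. It is not taken from pytorch_scripts.py; the function name `poincare_distance` and the sample points are assumptions chosen for illustration, and the inputs are assumed to lie strictly inside the unit ball.

```python
import math


def poincare_distance(u, v):
    """Poincaré distance between two points strictly inside the unit ball.

    Hypothetical helper for illustration; u and v are sequences of floats.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Two pairs with the same Euclidean gap of 0.1 along the first axis.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.89, 0.0], [0.99, 0.0])
print(near_origin, near_boundary)
```

Both pairs are 0.1 apart in Euclidean terms, yet the pair near the boundary is roughly 2.45 apart in Poincaré distance while the pair at the origin is roughly 0.20 apart.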

Loss Function

The paper defines the following loss:

$$\mathcal{L}(\Theta) = \sum_{(u, v) \in D} \log \frac{e^{-d(u, v)}}{\sum_{v' \in N(u)} e^{-d(u, v')}}$$

Where:

D is the set of observed relations between nodes, and N(u) is the set of negative examples for u (nodes not related to the node u). The paper samples 10 negative examples per positive example during training. Training on this objective reduces the distance between connected nodes and increases the distance between unconnected nodes.
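
The sketch below shows one way to sample negatives and score a single positive pair. It is an illustration under stated assumptions, not the code in this repository: `sample_negatives` and `pair_loss` are hypothetical names, relations are treated as undirected when deciding which nodes are unrelated, the positive pair is included in the softmax denominator, and the value returned is the negative log of the softmax so that a lower value is better.

```python
import math
import random


def poincare_distance(u, v):
    """Poincaré distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


def sample_negatives(u, nodes, relations, k=10, rng=random):
    """Pick up to k nodes that share no relation with u (hypothetical helper)."""
    # Assumption: a relation in either direction counts as "related".
    related = {b for a, b in relations if a == u} | {a for a, b in relations if b == u}
    candidates = [n for n in nodes if n != u and n not in related]
    return rng.sample(candidates, min(k, len(candidates)))


def pair_loss(emb, u, v, negatives):
    """Negative log-softmax of -distance for one positive pair (u, v)."""
    pos = math.exp(-poincare_distance(emb[u], emb[v]))
    neg = sum(math.exp(-poincare_distance(emb[u], emb[n])) for n in negatives)
    # Assumption: the positive pair is part of the denominator.
    return -math.log(pos / (pos + neg))


# Toy 2-D embeddings, made up for this example.
emb = {"banana": [0.5, 0.0], "fruit": [0.4, 0.0], "car": [-0.6, 0.0], "wheel": [-0.7, 0.1]}
relations = [("banana", "fruit"), ("wheel", "car")]
negatives = sample_negatives("banana", list(emb), relations, k=10)
print(pair_loss(emb, "banana", "fruit", negatives))
```

With these toy vectors the loss for the true pair (banana, fruit) is small because "fruit" sits much closer to "banana" than either negative does.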

Optimization

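The paper optimizes the embeddings with the following Riemannian SGD update:

$$\theta_{t+1} = \operatorname{proj}\left(\theta_t - \eta_t \frac{(1 - \lVert \theta_t \rVert^2)^2}{4} \nabla_E\right)$$

Where:

η_t is the learning rate, ∇_E is the Euclidean gradient of the loss with respect to θ_t, and proj(θ) constrains the embeddings to remain within the Poincaré ball via the following equation:

$$\operatorname{proj}(\theta) = \begin{cases} \theta / \lVert \theta \rVert - \varepsilon & \text{if } \lVert \theta \rVert \ge 1 \\ \theta & \text{otherwise} \end{cases}$$

Here ε is a small constant.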
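
A single update step can be sketched in plain Python as follows. This is a hedged illustration rather than the repository's implementation: `project` and `rsgd_step` are hypothetical names, the default values of `lr` and `eps` are assumptions, and the projection is interpreted as rescaling a point that has left the ball to norm 1 - eps.

```python
import math


def project(theta, eps=1e-5):
    """Pull a point back inside the unit ball if it has reached or left it."""
    norm = math.sqrt(sum(x * x for x in theta))
    if norm >= 1.0:
        # Assumed interpretation of the projection: rescale to norm 1 - eps.
        return [x / norm * (1.0 - eps) for x in theta]
    return list(theta)


def rsgd_step(theta, euclid_grad, lr=0.01, eps=1e-5):
    """One Riemannian SGD step for a single embedding vector."""
    sq_norm = sum(x * x for x in theta)
    # Rescale the Euclidean gradient by (1 - ||theta||^2)^2 / 4.
    scale = (1.0 - sq_norm) ** 2 / 4.0
    updated = [t - lr * scale * g for t, g in zip(theta, euclid_grad)]
    return project(updated, eps)


# The same gradient moves a point at the origin much further than one near the boundary.
step_center = rsgd_step([0.0, 0.0], [1.0, 0.0], lr=0.4)
step_edge = rsgd_step([0.9, 0.0], [1.0, 0.0], lr=0.4)
print(step_center, step_edge)
```

The scaling factor shrinks toward zero near the boundary, so points there take very small Euclidean steps, which is consistent with distances growing rapidly in that region.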

Repository Contents

This repository contains the following:

The paper, Poincaré Embeddings for Learning Hierarchical Representations (PDF)
Sample training data in data/*.tsv
The implementation in pytorch_scripts.py and prog.py

Training and Testing the Model

To train and run the model, make sure the following libraries are installed for your Python 3 interpreter:

PyTorch
NLTK
Matplotlib

The repo generates a .tsv file of related word pairs drawn from WordNet. Alternatively, you can create your own .tsv file of word pairs and save it in the data/ folder to feed it to the model.
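
If you write your own pairs file, the sketch below shows one way to read it back and assign an index to each word. The layout is an assumption (one pair per line, two tab-separated columns, no header), and `load_pairs` and `build_vocab` are hypothetical helpers that are not part of prog.py.

```python
import csv
import io


def load_pairs(handle):
    """Read (word, related_word) tuples from a tab-separated file object."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:  # skip blank or malformed lines
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def build_vocab(pairs):
    """Map each distinct word to an integer index in order of first appearance."""
    vocab = {}
    for a, b in pairs:
        for word in (a, b):
            vocab.setdefault(word, len(vocab))
    return vocab


# An in-memory stand-in for a file in the data/ folder.
sample = io.StringIO("banana\tfruit\neatable_fruit\tfruit\n")
pairs = load_pairs(sample)
vocab = build_vocab(pairs)
print(pairs, vocab)
```

For a real file, pass an open handle instead of the `io.StringIO` object, for example one obtained with `open(path, newline="")`.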

Once everything is set up, open the prog.py script and change the variable that holds the data file name so that it matches the file you created in the data/ folder. There are several other parameters in prog.py that you may adjust to your own preference.

Examples

Setting the variable as shown below produces a result similar to the corresponding graph:

Example (A)

	word = 'brown'

Example (B)

	word = 'fruit'

Contributors

Ali Moallim

License

GNU GENERAL PUBLIC LICENSE
