lit2vec

This repository analyzes the relationships between literary works mentioned in scholarly works. Unlike a co-occurrence approach, we use word2vec to measure the contextual similarity between mentions of literary works in scholarly texts.

Using gensim, the word2vec model was trained on our scholarly work corpus, which currently comprises 26 texts (289,746 tokens). More scholarly works will be added in the future (see Zotero Library).
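The training step described above can be sketched roughly as follows. This is a minimal sketch, not the repository's word2vec.py: the input layout (one text per line, whitespace-separated tokens), the function names, and all hyperparameters are assumptions for illustration, and the parameter names follow gensim 4.x. Work identifiers are assumed to occur in the corpus as single tokens, so that each one receives its own vector.

```python
def read_sentences(path):
    """Yield one token list per non-empty line (assumed layout: one text per line)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            tokens = line.split()
            if tokens:
                yield tokens


def train(corpus_path, model_path="word2vec.model"):
    """Train and save a word2vec model; all hyperparameters are illustrative only."""
    # Imported here so that read_sentences() can be used without gensim installed.
    from gensim.models import Word2Vec

    sentences = list(read_sentences(corpus_path))
    model = Word2Vec(
        sentences=sentences,
        vector_size=100,  # assumed, not taken from the repository
        window=5,         # assumed
        min_count=1,      # assumed; keeps rarely mentioned works in the vocabulary
        workers=1,
    )
    model.save(model_path)
    return model
```

A low min_count is chosen here because, in a corpus of this size, a work mentioned only once or twice would otherwise be dropped from the vocabulary and could not be compared at all.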

work_identifier.tsv: contains all identifiers of literary works together with their MiMoText_ID and, where available, their Wikidata_ID.
word2vec.py: script to train a word2vec model on all.tsv, which cannot be published here for copyright reasons.
word2vec.model: word2vec model trained on the MiMoText scholarly work corpus using gensim.
similarity.py: computes the similarity between all pairs of works, using model.wv.similarity().
similarity.tsv: output of similarity.py
plot_2d.py: plots all work identifiers in a 2D space
plot_2d.html: output of plot_2d.py (download it and open it in a browser to explore it)
plot_3d.py: plots all work identifiers in a 3D space
plot_3d.html: output of plot_3d.py (download it and open it in a browser to explore it)
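
The pairwise comparison performed by similarity.py can be sketched as follows. gensim's model.wv.similarity(a, b) returns the cosine similarity of the two word vectors; the sketch below computes the same quantity in plain Python so that it runs without a trained model. The vectors dictionary, the identifiers in it, and the output column names are hypothetical stand-ins, not values taken from work_identifier.tsv or similarity.tsv.

```python
import csv
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarity(vectors):
    """Return (id_a, id_b, similarity) for every unordered pair of identifiers."""
    ids = sorted(vectors)
    return [(a, b, cosine(vectors[a], vectors[b])) for a, b in combinations(ids, 2)]


def write_tsv(rows, path):
    """Write the pairs as tab-separated values; the column names are assumed."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["work_a", "work_b", "similarity"])
        writer.writerows(rows)


# Hypothetical toy vectors standing in for model.wv[identifier].
vectors = {"W1": [1.0, 0.0], "W2": [0.0, 1.0], "W3": [1.0, 1.0]}
rows = pairwise_similarity(vectors)
```

With a trained model, cosine(vectors[a], vectors[b]) would be replaced by model.wv.similarity(a, b), restricted to identifiers that are present in the model's vocabulary.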
