Word Vector
Wannaphong Phatthiyaphaibun edited this page Jul 14, 2022 · 2 revisions
LaoNLP supports Word2Vec for word vectors.
We trained a Lao Word2Vec model on the OSCAR corpus using gensim. The training notebook is available at https://github.com/wannaphong/LaoNLP-Notebook/blob/main/Lao_Word2Vec.ipynb
Example
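from laonlp.word_vector import Word2Vec

# Load the Lao Word2Vec model; `model` can be "cbow" or "skip-gram"
wv = Word2Vec(model="skip-gram")
# Similarity score between two words
print(wv.similarity("ວຽງຈັນ", "ເມືອງ"))
# output: 0.46474797
# Words most similar to the given positive words
print(wv.most_similar_cosmul(positive=["ວຽງຈັນ", "ເມືອງ"], negative=[]))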
# output: [('ສຸຂຸມາ', 0.6676176190376282), ('ແຂວ', 0.6541932821273804), ('ທຸລະຄົມ', 0.6540694832801819), ('ຫ້ອງການຍຸຕິທຳ', 0.6540253758430481), ('ສີສັດຕະນາກ', 0.6531381607055664), ('ພະລານໄຊ', 0.6501346230506897), ('ພັດທະນາກວມລວມ', 0.6448683738708496), ('ກະລຶມ', 0.6448098421096802), ('ຍົມມະລາດ', 0.6435081958770752), ('ປົກຄອງເມືອງ', 0.6423164010047913)]
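To give a feel for what the two calls in the example compute, here is a minimal pure-Python sketch. It assumes that `similarity` is a cosine similarity and that `most_similar_cosmul` follows the multiplicative combination objective (3CosMul) as gensim documents it; we have not checked LaoNLP's internals here. The helper names, the `eps` value, and the 2-dimensional toy vectors with English labels are all hypothetical and chosen only for illustration; they are not taken from the trained Lao model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_cosmul(vectors, positive, negative=(), topn=10, eps=1e-6):
    """Rank words by a 3CosMul-style score (illustrative, not the library code).

    Each cosine is shifted from [-1, 1] to [0, 1]; the score is the product
    over positive words divided by the product over negative words plus eps.
    """
    scores = {}
    for word, vec in vectors.items():
        # Skip the query words themselves
        if word in positive or word in negative:
            continue
        pos = math.prod((1 + cosine(vec, vectors[p])) / 2 for p in positive)
        # math.prod of an empty sequence is 1, so an empty negative list is fine
        neg = math.prod((1 + cosine(vec, vectors[n])) / 2 for n in negative)
        scores[word] = pos / (neg + eps)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:topn]


# Hypothetical toy embedding, NOT values from the Lao model
toy_vectors = {
    "capital": [1.0, 0.0],
    "city": [1.0, 1.0],
    "district": [0.9, 0.8],
    "river": [0.0, 1.0],
    "fish": [-1.0, 0.2],
}

ranking = most_similar_cosmul(toy_vectors, positive=["capital", "city"])
print([word for word, _ in ranking])
```

With these toy vectors, the word whose direction is close to both positive words ranks first, which mirrors how the real call returns place-related words for the query above.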