Tests for the evaluate_word_pairs function #1061

Merged
merged 62 commits into from
Dec 28, 2016
1c63c9a
Merge branch 'release-0.12.3rc1'
tmylk Nov 5, 2015
280a488
Merge branch 'release-0.12.3'
tmylk Nov 6, 2015
ddeb002
Merge branch 'release-0.12.3'
tmylk Nov 6, 2015
f2ac3a9
Update CHANGELOG.txt
tmylk Nov 6, 2015
cf09e8c
Update CHANGELOG.txt
tmylk Nov 6, 2015
b8b8f57
cbow_mean default changed from 0 to 1.
akutuzov Nov 23, 2015
6456cbc
Hyperparameters' default values are aligned with Mikolov's word2vec.
akutuzov Jan 13, 2016
966a4b0
Merge remote-tracking branch 'upstream/master' into develop
akutuzov Jan 13, 2016
d9ec7e4
Fix for #538: cbow_mean default changed from 0 to 1.
akutuzov Jan 13, 2016
76d2df7
Update changelog
akutuzov Jan 13, 2016
0b6f45b
(main) defaults aligned to Mikolov's word2vec.
akutuzov Jan 14, 2016
7fb5f18
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Jan 14, 2016
bc7a447
word2vec (main) now mimics command-line arguments for Mikolov's word2…
akutuzov Jan 14, 2016
e689b4f
Fix for #538
akutuzov Jan 14, 2016
a5274ab
Fix for #538 (tabs and spaces).
akutuzov Jan 14, 2016
5c32ca8
Fix for #538 (tests).
akutuzov Jan 15, 2016
ac889b3
For #538: slightly relaxed sanity check demands (because now default …
akutuzov Jan 15, 2016
92087c0
Fixes as per @gojomo comments.
akutuzov Jan 15, 2016
06785b5
Test fixes due to negative sampling becoming default behavior.
akutuzov Jan 15, 2016
3ac5fd4
Commented out tests which work for HS only.
akutuzov Jan 15, 2016
e0ac3d2
Fix for #538.
akutuzov Jan 16, 2016
0aad977
Yet another fix.
akutuzov Jan 16, 2016
1db616b
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Jan 16, 2016
e4eb8ba
Merging.
akutuzov Jan 16, 2016
ab25344
Fix for CBOW test.
akutuzov Jan 16, 2016
6b3f01d
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Jan 16, 2016
2bf45d3
Changelog mention of #538
akutuzov Jan 16, 2016
1a579ec
Fix for CBOW negative sampling tests.
akutuzov Jan 17, 2016
78372bf
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Jan 26, 2016
0c10fa6
Factoring out word2vec _main__ into gensim/scripts
akutuzov Jan 26, 2016
8a3d58b
Use logger instead of logging.
akutuzov Jan 27, 2016
c5249b9
Made Changelog less verbose about word2vec defaults changed.
akutuzov Jan 27, 2016
a40e624
Fixes to word2vec_standalone.py as per Radim's comments.
akutuzov Jan 27, 2016
dbd0eab
Alpha argument, with different defaults for CBOW and skipgram.
akutuzov Jan 27, 2016
b61287a
resolve merge conflict in Changelog
tmylk Jan 29, 2016
3ade404
Merge branch 'release-0.12.4' with #596
tmylk Jan 31, 2016
9e6522e
Merge branch 'release-0.13.0'
tmylk Jun 10, 2016
87c4e9c
Merge branch 'release-0.13.0'
tmylk Jun 10, 2016
9c74b40
Release version typo fix
tmylk Jun 10, 2016
7b30025
Merge branch 'release-0.13.0rc1'
tmylk Jun 10, 2016
de79c8e
Merge branch 'release-0.13.0'
tmylk Jun 22, 2016
d4f9cc5
Merge branch 'release-0.13.1'
tmylk Jun 23, 2016
e0627c6
Merge remote-tracking branch 'upstream/master' into develop
akutuzov Jul 2, 2016
b8b30c2
Finalizing.
akutuzov Jul 2, 2016
f3f2a52
'fisrt_push'
Nowow Jul 2, 2016
873f184
Initial shippable release
Nowow Dec 8, 2016
68a3e86
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Dec 15, 2016
498474d
Evaluation function to measure model correlation with human similarit…
akutuzov Dec 15, 2016
ce64d5a
Updating semantic similarity evaluation.
akutuzov Dec 15, 2016
0936971
Scipy stats import
akutuzov Dec 15, 2016
e11909f
Evaluation function to measure model correlation with human similarit…
akutuzov Dec 15, 2016
5f38818
Merge branch 'develop' of https://github.com/akutuzov/gensim into dev…
akutuzov Dec 15, 2016
b4b8d14
Remove unnecessary.
akutuzov Dec 15, 2016
2429dc4
Changing the name of the word pairs evaluation function.
akutuzov Dec 16, 2016
ad6b268
Merge branch 'develop' into develop
tmylk Dec 22, 2016
fddbc0a
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Dec 26, 2016
910a511
Wordsim353 dataset added.
akutuzov Dec 26, 2016
54e0ba2
Fixed bug in evaluate_word_pairs.
akutuzov Dec 27, 2016
41f8f8e
Tests for evaluate_word_pairs function.
akutuzov Dec 27, 2016
9dfbac5
Attributing Wordsim353 dataset.
akutuzov Dec 27, 2016
5899610
Merge remote-tracking branch 'upstream/develop' into develop
akutuzov Dec 28, 2016
11c9afb
Test for out-of-vocabulary pairs in evaluate_word_pairs.
akutuzov Dec 28, 2016
3 changes: 1 addition & 2 deletions gensim/models/word2vec.py
@@ -1402,7 +1402,7 @@ def log_evaluate_word_pairs(pearson, spearman, oov, pairs):
return KeyedVectors.log_evaluate_word_pairs(pearson, spearman, oov, pairs)

def evaluate_word_pairs(self, pairs, delimiter='\t', restrict_vocab=300000, case_insensitive=True, dummy4unknown=False):
- return self.wv.evaluate_word_pairs(self, pairs, delimiter, restrict_vocab, case_insensitive, dummy4unknown)
+ return self.wv.evaluate_word_pairs(pairs, delimiter, restrict_vocab, case_insensitive, dummy4unknown)

def __str__(self):
return "%s(vocab=%s, size=%s, alpha=%s)" % (self.__class__.__name__, len(self.wv.index2word), self.vector_size, self.alpha)
@@ -1629,4 +1629,3 @@ def __iter__(self):
model.accuracy(args.accuracy)

logger.info("finished running %s", program)
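
The one-line change to `evaluate_word_pairs` above drops an explicit `self` from a call on `self.wv`. A minimal sketch of why that matters follows. `KeyedVectorsSketch`, `ModelSketch` and `call_outcome` are hypothetical stand-ins written for this illustration, and the sketch assumes the delegate method takes the same parameters as the wrapper shown in the diff; it is not gensim's code.

```python
class KeyedVectorsSketch:
    """Hypothetical stand-in for the object stored in `self.wv`."""

    def evaluate_word_pairs(self, pairs, delimiter='\t', restrict_vocab=300000,
                            case_insensitive=True, dummy4unknown=False):
        # Echo the arguments back so the caller can inspect what arrived.
        return pairs, delimiter, restrict_vocab, case_insensitive, dummy4unknown


class ModelSketch:
    """Hypothetical stand-in for the wrapper class that owns `wv`."""

    def __init__(self):
        self.wv = KeyedVectorsSketch()

    def evaluate_buggy(self, pairs, delimiter='\t', restrict_vocab=300000,
                       case_insensitive=True, dummy4unknown=False):
        # `self.wv.evaluate_word_pairs` is already bound to `self.wv`, so the
        # extra `self` becomes one positional argument too many.
        return self.wv.evaluate_word_pairs(self, pairs, delimiter, restrict_vocab,
                                           case_insensitive, dummy4unknown)

    def evaluate_fixed(self, pairs, delimiter='\t', restrict_vocab=300000,
                       case_insensitive=True, dummy4unknown=False):
        # Pass only the real arguments; Python supplies `self.wv` itself.
        return self.wv.evaluate_word_pairs(pairs, delimiter, restrict_vocab,
                                           case_insensitive, dummy4unknown)


def call_outcome(method, *args):
    """Return the method's result, or the TypeError it raised."""
    try:
        return method(*args)
    except TypeError as exc:
        return exc


model = ModelSketch()
buggy_result = call_outcome(model.evaluate_buggy, 'wordsim353.tsv')
fixed_result = call_outcome(model.evaluate_fixed, 'wordsim353.tsv')
print(type(buggy_result).__name__)
print(fixed_result[0])
```

In this sketch the buggy wrapper fails with a `TypeError` because six positional arguments reach a bound method that accepts five, while the fixed wrapper forwards the file name unchanged.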

355 changes: 355 additions & 0 deletions gensim/test/test_data/wordsim353.tsv
@@ -0,0 +1,355 @@
# The WordSimilarity-353 Test Collection (http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/)
# Word 1 Word 2 Human (mean)
love sex 6.77
tiger cat 7.35
tiger tiger 10.00
book paper 7.46
computer keyboard 7.62
computer internet 7.58
plane car 5.77
train car 6.31
telephone communication 7.50
television radio 6.77
media radio 7.42
drug abuse 6.85
bread butter 6.19
cucumber potato 5.92
doctor nurse 7.00
professor doctor 6.62
student professor 6.81
smart student 4.62
smart stupid 5.81
company stock 7.08
stock market 8.08
stock phone 1.62
stock CD 1.31
stock jaguar 0.92
stock egg 1.81
fertility egg 6.69
stock live 3.73
stock life 0.92
book library 7.46
bank money 8.12
wood forest 7.73
money cash 9.15
professor cucumber 0.31
king cabbage 0.23
king queen 8.58
king rook 5.92
bishop rabbi 6.69
Jerusalem Israel 8.46
Jerusalem Palestinian 7.65
holy sex 1.62
fuck sex 9.44
Maradona football 8.62
football soccer 9.03
football basketball 6.81
football tennis 6.63
tennis racket 7.56
Arafat peace 6.73
Arafat terror 7.65
Arafat Jackson 2.50
law lawyer 8.38
movie star 7.38
movie popcorn 6.19
movie critic 6.73
movie theater 7.92
physics proton 8.12
physics chemistry 7.35
space chemistry 4.88
alcohol chemistry 5.54
vodka gin 8.46
vodka brandy 8.13
drink car 3.04
drink ear 1.31
drink mouth 5.96
drink eat 6.87
baby mother 7.85
drink mother 2.65
car automobile 8.94
gem jewel 8.96
journey voyage 9.29
boy lad 8.83
coast shore 9.10
asylum madhouse 8.87
magician wizard 9.02
midday noon 9.29
furnace stove 8.79
food fruit 7.52
bird cock 7.10
bird crane 7.38
tool implement 6.46
brother monk 6.27
crane implement 2.69
lad brother 4.46
journey car 5.85
monk oracle 5.00
cemetery woodland 2.08
food rooster 4.42
coast hill 4.38
forest graveyard 1.85
shore woodland 3.08
monk slave 0.92
coast forest 3.15
lad wizard 0.92
chord smile 0.54
glass magician 2.08
noon string 0.54
rooster voyage 0.62
money dollar 8.42
money cash 9.08
money currency 9.04
money wealth 8.27
money property 7.57
money possession 7.29
money bank 8.50
money deposit 7.73
money withdrawal 6.88
money laundering 5.65
money operation 3.31
tiger jaguar 8.00
tiger feline 8.00
tiger carnivore 7.08
tiger mammal 6.85
tiger animal 7.00
tiger organism 4.77
tiger fauna 5.62
tiger zoo 5.87
psychology psychiatry 8.08
psychology anxiety 7.00
psychology fear 6.85
psychology depression 7.42
psychology clinic 6.58
psychology doctor 6.42
psychology Freud 8.21
psychology mind 7.69
psychology health 7.23
psychology science 6.71
psychology discipline 5.58
psychology cognition 7.48
planet star 8.45
planet constellation 8.06
planet moon 8.08
planet sun 8.02
planet galaxy 8.11
planet space 7.92
planet astronomer 7.94
precedent example 5.85
precedent information 3.85
precedent cognition 2.81
precedent law 6.65
precedent collection 2.50
precedent group 1.77
precedent antecedent 6.04
cup coffee 6.58
cup tableware 6.85
cup article 2.40
cup artifact 2.92
cup object 3.69
cup entity 2.15
cup drink 7.25
cup food 5.00
cup substance 1.92
cup liquid 5.90
jaguar cat 7.42
jaguar car 7.27
energy secretary 1.81
secretary senate 5.06
energy laboratory 5.09
computer laboratory 6.78
weapon secret 6.06
FBI fingerprint 6.94
FBI investigation 8.31
investigation effort 4.59
Mars water 2.94
Mars scientist 5.63
news report 8.16
canyon landscape 7.53
image surface 4.56
discovery space 6.34
water seepage 6.56
sign recess 2.38
Wednesday news 2.22
mile kilometer 8.66
computer news 4.47
territory surface 5.34
atmosphere landscape 3.69
president medal 3.00
war troops 8.13
record number 6.31
skin eye 6.22
Japanese American 6.50
theater history 3.91
volunteer motto 2.56
prejudice recognition 3.00
decoration valor 5.63
century year 7.59
century nation 3.16
delay racism 1.19
delay news 3.31
minister party 6.63
peace plan 4.75
minority peace 3.69
attempt peace 4.25
government crisis 6.56
deployment departure 4.25
deployment withdrawal 5.88
energy crisis 5.94
announcement news 7.56
announcement effort 2.75
stroke hospital 7.03
disability death 5.47
victim emergency 6.47
treatment recovery 7.91
journal association 4.97
doctor personnel 5.00
doctor liability 5.19
liability insurance 7.03
school center 3.44
reason hypertension 2.31
reason criterion 5.91
hundred percent 7.38
Harvard Yale 8.13
hospital infrastructure 4.63
death row 5.25
death inmate 5.03
lawyer evidence 6.69
life death 7.88
life term 4.50
word similarity 4.75
board recommendation 4.47
governor interview 3.25
OPEC country 5.63
peace atmosphere 3.69
peace insurance 2.94
territory kilometer 5.28
travel activity 5.00
competition price 6.44
consumer confidence 4.13
consumer energy 4.75
problem airport 2.38
car flight 4.94
credit card 8.06
credit information 5.31
hotel reservation 8.03
grocery money 5.94
registration arrangement 6.00
arrangement accommodation 5.41
month hotel 1.81
type kind 8.97
arrival hotel 6.00
bed closet 6.72
closet clothes 8.00
situation conclusion 4.81
situation isolation 3.88
impartiality interest 5.16
direction combination 2.25
street place 6.44
street avenue 8.88
street block 6.88
street children 4.94
listing proximity 2.56
listing category 6.38
cell phone 7.81
production hike 1.75
benchmark index 4.25
media trading 3.88
media gain 2.88
dividend payment 7.63
dividend calculation 6.48
calculation computation 8.44
currency market 7.50
OPEC oil 8.59
oil stock 6.34
announcement production 3.38
announcement warning 6.00
profit warning 3.88
profit loss 7.63
dollar yen 7.78
dollar buck 9.22
dollar profit 7.38
dollar loss 6.09
computer software 8.50
network hardware 8.31
phone equipment 7.13
equipment maker 5.91
luxury car 6.47
five month 3.38
report gain 3.63
investor earning 7.13
liquid water 7.89
baseball season 5.97
game victory 7.03
game team 7.69
marathon sprint 7.47
game series 6.19
game defeat 6.97
seven series 3.56
seafood sea 7.47
seafood food 8.34
seafood lobster 8.70
lobster food 7.81
lobster wine 5.70
food preparation 6.22
video archive 6.34
start year 4.06
start match 4.47
game round 5.97
boxing round 7.61
championship tournament 8.36
fighting defeating 7.41
line insurance 2.69
day summer 3.94
summer drought 7.16
summer nature 5.63
day dawn 7.53
nature environment 8.31
environment ecology 8.81
nature man 6.25
man woman 8.30
man governor 5.25
murder manslaughter 8.53
soap opera 7.94
opera performance 6.88
life lesson 5.94
focus life 4.06
production crew 6.25
television film 7.72
lover quarrel 6.19
viewer serial 2.97
possibility girl 1.94
population development 3.75
morality importance 3.31
morality marriage 3.69
Mexico Brazil 7.44
gender equality 6.41
change attitude 5.44
family planning 6.25
opera industry 2.63
sugar approach 0.88
practice institution 3.19
ministry culture 4.69
problem challenge 6.75
size prominence 5.31
country citizen 7.31
planet people 5.75
development issue 3.97
experience music 3.47
music project 3.63
glass metal 5.56
aluminum metal 7.83
chance credibility 3.88
exhibit memorabilia 5.31
concert virtuoso 6.81
rock jazz 7.59
museum theater 7.19
observation architecture 4.38
space world 6.53
preservation world 6.19
admission ticket 7.69
shower thunderstorm 6.31
shower flood 6.03
weather forecast 8.34
disaster area 6.25
governor office 6.34
architecture century 3.78
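
For orientation, here is a rough sketch of how a file in this layout could be scored against a model. It uses pure Python instead of the SciPy routines the commit list mentions, and it assumes tab-separated columns (matching the `delimiter='\t'` default) with `#` marking comment lines. The human scores are copied from the rows above; the values in `HYPOTHETICAL_MODEL_SIMS` are invented purely for illustration and do not come from any trained model.

```python
import math

# A few rows in the same layout as wordsim353.tsv (assumed tab-separated).
SAMPLE = (
    "# Word 1\tWord 2\tHuman (mean)\n"
    "tiger\tcat\t7.35\n"
    "tiger\ttiger\t10.00\n"
    "stock\tjaguar\t0.92\n"
    "king\tcabbage\t0.23\n"
    "money\tcash\t9.15\n"
)

# Invented similarity values, for illustration only.
HYPOTHETICAL_MODEL_SIMS = {
    ("tiger", "cat"): 0.60,
    ("tiger", "tiger"): 1.00,
    ("stock", "jaguar"): 0.10,
    ("king", "cabbage"): 0.05,
    ("money", "cash"): 0.80,
}


def read_pairs(text, delimiter='\t'):
    """Parse (word1, word2, score) rows, skipping blanks and # comments."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        word1, word2, score = line.split(delimiter)
        rows.append((word1, word2, float(score)))
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


rows = read_pairs(SAMPLE)
human = [score for _, _, score in rows]
model = [HYPOTHETICAL_MODEL_SIMS[(w1, w2)] for w1, w2, _ in rows]
pearson_r = pearson(human, model)
spearman_rho = spearman(human, model)
print(round(pearson_r, 3), round(spearman_rho, 3))
```

Spearman only looks at the ordering of the pairs, so it is insensitive to the difference in scale between the 0 to 10 human ratings and cosine-style similarities.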
12 changes: 12 additions & 0 deletions gensim/test/test_word2vec.py
@@ -378,6 +378,18 @@ def testAccuracy(self):
kv_accuracy = model.wv.accuracy(datapath('questions-words.txt'))
self.assertEqual(w2v_accuracy, kv_accuracy)

def testEvaluateWordPairs(self):
"""Test Spearman and Pearson correlation coefficients give sane results on similarity datasets"""
corpus = word2vec.LineSentence(datapath('head500.noblanks.cor.bz2'))
model = word2vec.Word2Vec(corpus, min_count=3, iter=10)
correlation = model.evaluate_word_pairs(datapath('wordsim353.tsv'))
pearson = correlation[0][0]
spearman = correlation[1][0]
oov = correlation[2]
self.assertTrue(0.1 < pearson < 1.0)
self.assertTrue(0.1 < spearman < 1.0)
Review comment (Contributor): Could we please test for oov_ratio in correlation[2] too?

self.assertTrue(0.0 <= oov < 90.0)

def model_sanity(self, model, train=True):
"""Even tiny models trained on LeeCorpus should pass these sanity checks"""
# run extra before/after training tests if train=True
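
The `0.0 <= oov < 90.0` assertion in `testEvaluateWordPairs` suggests that the third element of the result is a percentage. A minimal sketch of one way such a figure could be computed is below. `oov_percentage` is a hypothetical helper, and both the "a pair is OOV if either word is missing" rule and the upper-casing used for case-insensitive matching are assumptions of this sketch, not gensim's confirmed behaviour.

```python
def oov_percentage(pairs, vocab, case_insensitive=True):
    """Percentage of word pairs with at least one out-of-vocabulary word.

    Hypothetical helper: assumes a pair counts as OOV when either of its
    words is missing from `vocab`.
    """
    def normalize(word):
        return word.upper() if case_insensitive else word

    known = {normalize(word) for word in vocab}
    if not pairs:
        return 0.0
    missing = sum(
        1 for word1, word2 in pairs
        if normalize(word1) not in known or normalize(word2) not in known
    )
    return 100.0 * missing / len(pairs)


# Pairs taken from wordsim353.tsv; the vocabulary is made up for illustration.
sample_pairs = [("tiger", "cat"), ("king", "cabbage"), ("Arafat", "peace"), ("OPEC", "oil")]
sample_vocab = {"tiger", "cat", "king", "cabbage", "peace", "oil"}

oov = oov_percentage(sample_pairs, sample_vocab)
print(oov)
```

With this made-up vocabulary, two of the four pairs contain an unknown word, so the value falls inside the range the test accepts.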