Description
Upgrading to gensim 3.2.0 also upgrades smart-open to 1.5.6, which seems to have changed the S3 code.
After the upgrade, there is a performance regression in Word2Vec: training is more than 2x slower when streaming a gzipped corpus from S3 (> 250K words/sec => < 100K words/sec).
Downgrading smart-open to 1.5.3 fixes the issue.
The release notes of smart-open 1.5.6 from Dec 28 state:
Perhaps some adjustments need to be made.
Steps/Code/Corpus to Reproduce
We use a private corpus of about 4M documents with about 150M words, split into gzipped files of 2-3 MB each, which we stream from S3 using smart-open.
Expected Results
Performance should return to the level of smart-open 1.5.3.
Actual Results
See above: throughput drops from > 250K words/sec to < 100K words/sec.
Versions
gensim 3.2.0 with smart-open 1.5.6
Hello @yjk21, this is a smart_open problem (not a gensim one), so I am closing this issue.
Please create an issue in the smart_open repository https://github.com/RaRe-Technologies/smart_open/issues with a simple code example that shows this regression (it seems that this performance problem should already be fixed in 1.5.6; if not, we will fix it again).
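A simple example of the kind requested above could look something like the sketch below. It is not taken from the original report: the helper names `measure_words_per_sec` and `open_local_gzip` are hypothetical, and it times only the streaming and whitespace tokenization, not Word2Vec training, so that the reader's I/O path can be compared across smart-open versions in isolation. The opener is passed in as a parameter; the local `gzip.open` stand-in is an assumption made so the sketch runs without S3 access.

```python
import gzip
import time


def measure_words_per_sec(paths, open_fn):
    """Stream every file once and return (word_count, words_per_second).

    `open_fn` is any callable that takes a path or URI and returns a
    context manager yielding lines. Hypothetical helper, for illustration.
    """
    n_words = 0
    start = time.perf_counter()
    for path in paths:
        with open_fn(path) as fin:
            for line in fin:
                # Whitespace tokenization, applied to each decompressed line.
                n_words += len(line.split())
    elapsed = time.perf_counter() - start
    rate = (n_words / elapsed) if elapsed > 0 else float("inf")
    return n_words, rate


def open_local_gzip(path):
    # Local stand-in so the sketch runs offline. For the S3 case, the
    # assumption is that an opener such as smart_open.smart_open would be
    # passed in instead, with s3:// URIs in `paths`.
    return gzip.open(path, "rb")
```

Running the same call once with smart-open 1.5.3 and once with 1.5.6 installed, over the same list of files, would show whether the slowdown comes from the streaming layer alone.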