Commit

Merge pull request #3456 from pabs3/update-links
Update broken/redirecting/unencrypted links
piskvorky authored Apr 29, 2023
2 parents 714a333 + 73825d6 commit 525f67a
Showing 161 changed files with 253 additions and 253 deletions.
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -290,7 +290,7 @@ Gensim 4.0 is a major release with lots of performance & robustness improvements
* Dropped Python 2. Gensim 4.0 is Py3.6+. Read our [Python version support policy](https://github.com/RaRe-Technologies/gensim/wiki/Gensim-And-Compatibility).
- If you still need Python 2 for some reason, stay at [Gensim 3.8.3](https://github.com/RaRe-Technologies/gensim/releases/tag/3.8.3).

* A new [Gensim website](https://radimrehurek.com/gensim) – finally! 🙃
* A new [Gensim website](https://radimrehurek.com/gensim/) – finally! 🙃

So, a major clean-up release overall. We're happy with this **tighter, leaner and faster Gensim**.

@@ -486,7 +486,7 @@ This is the direction we'll keep going forward: less kitchen-sink of "latest aca

### Why pre-release?

This 4.0.0beta pre-release is for users who want the **cutting edge performance and bug fixes**. Plus users who want to help out, by **testing and providing feedback**: code, documentation, workflows… Please let us know on the [mailing list](https://groups.google.com/forum/#!forum/gensim)!
This 4.0.0beta pre-release is for users who want the **cutting edge performance and bug fixes**. Plus users who want to help out, by **testing and providing feedback**: code, documentation, workflows… Please let us know on the [mailing list](https://groups.google.com/g/gensim)!

Install the pre-release with:

@@ -2557,7 +2557,7 @@ Tutorial and doc improvements:
* transactional similarity server: see docs/simserver.html
* website moved from university hosting to radimrehurek.com
* much improved speed of lsi[corpus] transformation:
* accuracy tests of incremental svd: test/svd_error.py and http://groups.google.com/group/gensim/browse_thread/thread/4b605b72f8062770
* accuracy tests of incremental svd: test/svd_error.py and https://groups.google.com/g/gensim/c/S2BbcvgGJ3A
* further improvements to memory-efficiency of LDA and LSA
* improved wiki preprocessing (thx to Luca de Alfaro)
* model.print_topics() debug fncs now support std output, in addition to logging (thx to Homer Strong)
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -4,7 +4,7 @@ First, please see [contribution-guide.org](http://www.contribution-guide.org/) f

Also, please check the [Gensim FAQ](https://github.com/RaRe-Technologies/gensim/wiki/Recipes-&-FAQ) page before posting.

**The proper place for open-ended questions is the [Gensim mailing list](https://groups.google.com/forum/#!forum/gensim).** Github is not the right place for research discussions or feature requests.
**The proper place for open-ended questions is the [Gensim mailing list](https://groups.google.com/g/gensim).** Github is not the right place for research discussions or feature requests.

# How to add a new feature or create a pull request?

2 changes: 1 addition & 1 deletion HACKTOBERFEST.md
@@ -28,7 +28,7 @@ Check out the following:

## Questions

If you have a general question about Gensim, please ask on the [mailing list](https://groups.google.com/forum/#!forum/gensim).
If you have a general question about Gensim, please ask on the [mailing list](https://groups.google.com/g/gensim).
If you have a question a about a specific issue or PR, just ask there directly, and we'll get back to you as soon as we can.
Otherwise, ping @mpenkov on [Twitter](https://twitter.com/mpenkov) or [Telegram](https://t.me/mpenkov).

2 changes: 1 addition & 1 deletion ISSUE_TEMPLATE.md
@@ -1,7 +1,7 @@
<!--
**IMPORTANT**:
- Use the [Gensim mailing list](https://groups.google.com/forum/#!forum/gensim) to ask general or usage questions. Github issues are only for bug reports.
- Use the [Gensim mailing list](https://groups.google.com/g/gensim) to ask general or usage questions. Github issues are only for bug reports.
- Check [Recipes&FAQ](https://github.com/RaRe-Technologies/gensim/wiki/Recipes-&-FAQ) first for common answers.
Github bug reports that do not include relevant information and context will be closed without an answer. Thanks!
30 changes: 15 additions & 15 deletions README.md
@@ -11,7 +11,7 @@ https://github.com/RaRe-Technologies/gensim/issues/2805
[![GitHub release](https://img.shields.io/github/release/rare-technologies/gensim.svg?maxAge=3600)](https://github.com/RaRe-Technologies/gensim/releases)
[![Downloads](https://img.shields.io/pypi/dm/gensim?color=blue)](https://pepy.tech/project/gensim/)
[![DOI](https://zenodo.org/badge/DOI/10.13140/2.1.2393.1847.svg)](https://doi.org/10.13140/2.1.2393.1847)
[![Mailing List](https://img.shields.io/badge/-Mailing%20List-blue.svg)](https://groups.google.com/forum/#!forum/gensim)
[![Mailing List](https://img.shields.io/badge/-Mailing%20List-blue.svg)](https://groups.google.com/g/gensim)
[![Follow](https://img.shields.io/twitter/follow/gensim_py.svg?style=social&style=flat&logo=twitter&label=Follow&color=blue)](https://twitter.com/gensim_py)

Gensim is a Python library for *topic modelling*, *document indexing*
@@ -73,7 +73,7 @@ package:

For alternative modes of installation, see the [documentation].

Gensim is being [continuously tested](http://radimrehurek.com/gensim/#testing) under all
Gensim is being [continuously tested](https://radimrehurek.com/gensim/#testing) under all
[supported Python versions](https://github.com/RaRe-Technologies/gensim/wiki/Gensim-And-Compatibility).
Support for Python 2.7 was dropped in gensim 4.0.0 – install gensim 3.8.3 if you must use Python 2.7.

@@ -101,15 +101,15 @@ Documentation

[QuickStart]: https://radimrehurek.com/gensim/auto_examples/core/run_core_concepts.html
[Tutorials]: https://radimrehurek.com/gensim/auto_examples/
[Official Documentation and Walkthrough]: http://radimrehurek.com/gensim/
[Official API Documentation]: http://radimrehurek.com/gensim/apiref.html
[Official Documentation and Walkthrough]: https://radimrehurek.com/gensim/
[Official API Documentation]: https://radimrehurek.com/gensim/apiref.html

Support
-------

For commercial support, please see [Gensim sponsorship](https://github.com/sponsors/piskvorky).

Ask open-ended questions on the public [Gensim Mailing List](https://groups.google.com/forum/#!forum/gensim).
Ask open-ended questions on the public [Gensim Mailing List](https://groups.google.com/g/gensim).

Raise bugs on [Github](https://github.com/RaRe-Technologies/gensim/blob/develop/CONTRIBUTING.md) but please **make sure you follow the [issue template](https://github.com/RaRe-Technologies/gensim/blob/develop/ISSUE_TEMPLATE.md)**. Issues that are not bugs or fail to provide the requested details will be closed without inspection.

@@ -121,7 +121,7 @@ Adopters

| Company | Logo | Industry | Use of Gensim |
|---------|------|----------|---------------|
| [RARE Technologies](http://rare-technologies.com) | ![rare](docs/src/readme_images/rare.png) | ML & NLP consulting | Creators of Gensim – this is us! |
| [RARE Technologies](https://rare-technologies.com/) | ![rare](docs/src/readme_images/rare.png) | ML & NLP consulting | Creators of Gensim – this is us! |
| [Amazon](http://www.amazon.com/) | ![amazon](docs/src/readme_images/amazon.png) | Retail | Document similarity. |
| [National Institutes of Health](https://github.com/NIHOPA/pipeline_word2vec) | ![nih](docs/src/readme_images/nih.png) | Health | Processing grants and publications with word2vec. |
| [Cisco Security](http://www.cisco.com/c/en/us/products/security/index.html) | ![cisco](docs/src/readme_images/cisco.png) | Security | Large-scale fraud detection. |
@@ -162,17 +162,17 @@ BibTeX entry:

[citing gensim in academic papers and theses]: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=9vG_kV0AAAAJ&citation_for_view=9vG_kV0AAAAJ:NaGl4SEjCO4C

[design goals]: http://radimrehurek.com/gensim/about.html
[RaRe Technologies]: http://rare-technologies.com/wp-content/uploads/2016/02/rare_image_only.png%20=10x20
[design goals]: https://radimrehurek.com/gensim/intro.html#design-principles
[RaRe Technologies]: https://rare-technologies.com/wp-content/uploads/2016/02/rare_image_only.png%20=10x20
[rare\_tech]: //rare-technologies.com
[Talentpair]: https://avatars3.githubusercontent.com/u/8418395?v=3&s=100
[citing gensim in academic papers and theses]: https://scholar.google.cz/citations?view_op=view_citation&hl=en&user=9vG_kV0AAAAJ&citation_for_view=9vG_kV0AAAAJ:u-x6o8ySG0sC

[documentation and Jupyter Notebook tutorials]: https://github.com/RaRe-Technologies/gensim/#documentation
[Vector Space Model]: http://en.wikipedia.org/wiki/Vector_space_model
[unsupervised document analysis]: http://en.wikipedia.org/wiki/Latent_semantic_indexing
[NumPy and Scipy]: http://www.scipy.org/Download
[ATLAS]: http://math-atlas.sourceforge.net/
[OpenBLAS]: http://xianyi.github.io/OpenBLAS/
[source tar.gz]: http://pypi.python.org/pypi/gensim
[documentation]: http://radimrehurek.com/gensim/install.html
[Vector Space Model]: https://en.wikipedia.org/wiki/Vector_space_model
[unsupervised document analysis]: https://en.wikipedia.org/wiki/Latent_semantic_indexing
[NumPy and Scipy]: https://scipy.org/install/
[ATLAS]: https://math-atlas.sourceforge.net/
[OpenBLAS]: https://xianyi.github.io/OpenBLAS/
[source tar.gz]: https://pypi.org/project/gensim/
[documentation]: https://radimrehurek.com/gensim/#install
2 changes: 1 addition & 1 deletion continuous_integration/check_wheels.py
@@ -2,7 +2,7 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2019 RaRe Technologies s.r.o.
# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html
# Licensed under the GNU LGPL v2.1 - https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html
"""Print available wheels for a particular Python package."""
import re
import sys
2 changes: 1 addition & 1 deletion docs/notebooks/Any2Vec_Filebased.ipynb
@@ -522,7 +522,7 @@
"\n",
"This new code branch was created by [@persiyanov](https://github.com/persiyanov) as a Google Summer of Code 2018 project in the [RARE Student Incubator](https://rare-technologies.com/incubator/).\n",
"\n",
"Questions, comments? Use our Gensim [mailing list](https://groups.google.com/forum/#!forum/gensim) and [twitter](https://twitter.com/gensim_py). Happy training!"
"Questions, comments? Use our Gensim [mailing list](https://groups.google.com/g/gensim) and [twitter](https://twitter.com/gensim_py). Happy training!"
]
}
],
4 changes: 2 additions & 2 deletions docs/notebooks/Varembed.ipynb
@@ -8,7 +8,7 @@
"\n",
"Varembed is a word embedding model incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, varembed combines morphological and distributional information in a unified probabilistic framework. Varembed thus yields improvements on intrinsic word similarity evaluations. Check out the original paper, [arXiv:1608.01056](https://arxiv.org/abs/1608.01056) accepted in [EMNLP 2016](http://www.emnlp2016.net/accepted-papers.html).\n",
"\n",
"Varembed is now integrated into [Gensim](http://radimrehurek.com/gensim/) providing ability to load already trained varembed models into gensim with additional functionalities over word vectors already present in gensim.\n",
"Varembed is now integrated into [Gensim](https://radimrehurek.com/gensim/) providing ability to load already trained varembed models into gensim with additional functionalities over word vectors already present in gensim.\n",
"\n",
"# This Tutorial\n",
"\n",
@@ -118,7 +118,7 @@
"# Resources\n",
"\n",
"* [Varembed Source Code](https://github.com/rguthrie3/MorphologicalPriorsForWordEmbeddings)\n",
"* [Gensim](http://radimrehurek.com/gensim/)\n",
"* [Gensim](https://radimrehurek.com/gensim/)\n",
"* [Lee Corpus](https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/test/test_data/lee.cor)\n"
]
}
4 changes: 2 additions & 2 deletions docs/notebooks/WMD_tutorial.ipynb
@@ -12,7 +12,7 @@
"\n",
"## Word Mover's Distance basics\n",
"\n",
"WMD is a method that allows us to assess the \"distance\" between two documents in a meaningful way, even when they have no words in common. It uses [word2vec](http://rare-technologies.com/word2vec-tutorial/) [4] vector embeddings of words. It been shown to outperform many of the state-of-the-art methods in *k*-nearest neighbors classification [3].\n",
"WMD is a method that allows us to assess the \"distance\" between two documents in a meaningful way, even when they have no words in common. It uses [word2vec](https://rare-technologies.com/word2vec-tutorial/) [4] vector embeddings of words. It been shown to outperform many of the state-of-the-art methods in *k*-nearest neighbors classification [3].\n",
"\n",
"WMD is illustrated below for two very similar sentences (illustration taken from [Vlad Niculae's blog](http://vene.ro/blog/word-movers-distance-in-python.html)). The sentences have no words in common, but by matching the relevant words, WMD is able to accurately measure the (dis)similarity between the two sentences. The method also uses the bag-of-words representation of the documents (simply put, the word's frequencies in the documents), noted as $d$ in the figure below. The intuition behind the method is that we find the minimum \"traveling distance\" between documents, in other words the most efficient way to \"move\" the distribution of document 1 to the distribution of document 2.\n",
"\n",
@@ -36,7 +36,7 @@
"\n",
"## Part 1: Computing the Word Mover's Distance\n",
"\n",
"To use WMD, we need some word embeddings first of all. You could train a word2vec (see tutorial [here](http://rare-technologies.com/word2vec-tutorial/)) model on some corpus, but we will start by downloading some pre-trained word2vec embeddings. Download the GoogleNews-vectors-negative300.bin.gz embeddings [here](https://code.google.com/archive/p/word2vec/) (warning: 1.5 GB, file is not needed for part 2). Training your own embeddings can be beneficial, but to simplify this tutorial, we will be using pre-trained embeddings at first.\n",
"To use WMD, we need some word embeddings first of all. You could train a word2vec (see tutorial [here](https://rare-technologies.com/word2vec-tutorial/)) model on some corpus, but we will start by downloading some pre-trained word2vec embeddings. Download the GoogleNews-vectors-negative300.bin.gz embeddings [here](https://code.google.com/archive/p/word2vec/) (warning: 1.5 GB, file is not needed for part 2). Training your own embeddings can be beneficial, but to simplify this tutorial, we will be using pre-trained embeddings at first.\n",
"\n",
"Let's take some sentences to compute the distance between."
]
2 changes: 1 addition & 1 deletion docs/notebooks/Word2Vec_FastText_Comparison.ipynb
@@ -537,7 +537,7 @@
"3. In general, the performance of the models seems to get closer with the increasing corpus size. However, this might possibly be due to the size of the model staying constant at 100, and a larger model size for large corpora might result in higher performance gains.\n",
"4. The semantic accuracy for all models increases significantly with the increase in corpus size.\n",
"5. However, the increase in syntactic accuracy from the increase in corpus size for the n-gram FastText model is lower (in both relative and absolute terms). This could possibly indicate that advantages gained by incorporating morphological information could be less significant in case of larger corpus sizes (the corpuses used in the original paper seem to indicate this too)\n",
"6. Training times for gensim are slightly lower than the fastText no-ngram model, and significantly lower than the n-gram variant. This is quite impressive considering fastText is implemented in C++ and Gensim in Python (with calls to low-level BLAS routines for much of the heavy lifting). You could read [this post](http://rare-technologies.com/word2vec-in-python-part-two-optimizing/) for more details regarding word2vec optimisation in Gensim. Note that these times include importing any dependencies and serializing the models to disk, and not just the training times."
"6. Training times for gensim are slightly lower than the fastText no-ngram model, and significantly lower than the n-gram variant. This is quite impressive considering fastText is implemented in C++ and Gensim in Python (with calls to low-level BLAS routines for much of the heavy lifting). You could read [this post](https://rare-technologies.com/word2vec-in-python-part-two-optimizing/) for more details regarding word2vec optimisation in Gensim. Note that these times include importing any dependencies and serializing the models to disk, and not just the training times."
]
},
{
2 changes: 1 addition & 1 deletion docs/notebooks/WordRank_wrapper_quickstart.ipynb
@@ -176,7 +176,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"These methods take an [optional parameter](http://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec.accuracy) restrict_vocab which limits which test examples are to be considered.\n",
"These methods take an [optional parameter](https://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec.accuracy) restrict_vocab which limits which test examples are to be considered.\n",
"\n",
"The results here don't look good because the training corpus is very small. To get meaningful results one needs to train on 500k+ words.\n",
"\n",
2 changes: 1 addition & 1 deletion docs/notebooks/atmodel_tutorial.ipynb
@@ -15,7 +15,7 @@
"Naturally, familiarity with topic modelling, LDA and Gensim is assumed in this tutorial. If you are not familiar with either LDA, or its Gensim implementation, I would recommend starting there. Consider some of these resources:\n",
"* Gentle introduction to the LDA model: http://blog.echen.me/2011/08/22/introduction-to-latent-dirichlet-allocation/\n",
"* Gensim's LDA API documentation: https://radimrehurek.com/gensim/models/ldamodel.html\n",
"* Topic modelling in Gensim: http://radimrehurek.com/topic_modeling_tutorial/2%20-%20Topic%20Modeling.html\n",
"* Topic modelling in Gensim: https://radimrehurek.com/topic_modeling_tutorial/2%20-%20Topic%20Modeling.html\n",
"* [Pre-processing and training LDA](lda_training_tips.ipynb)\n",
"\n",
"\n",
2 changes: 1 addition & 1 deletion docs/notebooks/deepir.ipynb
@@ -6,7 +6,7 @@
"source": [
"## Deep Inverse Regression with Yelp reviews\n",
"\n",
"In this note we'll use [gensim](http://radimrehurek.com/gensim/) to turn the Word2Vec machinery into a document classifier, as in [Document Classification by Inversion of Distributed Language Representations](http://arxiv.org/pdf/1504.07295v3) from ACL 2015."
"In this note we'll use [gensim](https://radimrehurek.com/gensim/) to turn the Word2Vec machinery into a document classifier, as in [Document Classification by Inversion of Distributed Language Representations](http://arxiv.org/pdf/1504.07295v3) from ACL 2015."
]
},
{
16 changes: 8 additions & 8 deletions docs/notebooks/distributed.md
@@ -50,11 +50,11 @@ Available distributed algorithms
* [Distributed Latent Dirichlet Allocation][8]


[1]: http://en.wikipedia.org/wiki/Distributed_computing
[2]: http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
[3]: http://pypi.python.org/pypi/Pyro4
[4]: http://radimrehurek.com/gensim/intro.html#design
[5]: http://radimrehurek.com/gensim/distributed.html#term-worker
[6]: http://en.wikipedia.org/wiki/Broadcast_domain
[7]: http://radimrehurek.com/gensim/dist_lsi.html
[8]: http://radimrehurek.com/gensim/dist_lda.html
[1]: https://en.wikipedia.org/wiki/Distributed_computing
[2]: https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
[3]: https://pypi.org/project/Pyro4/
[4]: https://radimrehurek.com/gensim/intro.html#design
[5]: https://radimrehurek.com/gensim/distributed.html#term-worker
[6]: https://en.wikipedia.org/wiki/Broadcast_domain
[7]: https://radimrehurek.com/gensim/dist_lsi.html
[8]: https://radimrehurek.com/gensim/dist_lda.html