Annoy tutorial notebook typo fixes (#784)
droudy authored and tmylk committed Jul 11, 2016
1 parent 4a6b52c commit 4dad188
Showing 1 changed file with 5 additions and 7 deletions.
12 changes: 5 additions & 7 deletions docs/notebooks/annoytutorial.ipynb
@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"This tutorial is about using the Annoy(Approximate Nearest Neighbors Oh Yeah) library for similarity queries in gensim"
+"This tutorial is about using the [Annoy(Approximate Nearest Neighbors Oh Yeah)]((https://github.com/spotify/annoy \"Link to annoy repo\") library for similarity queries in gensim"
]
},
{
@@ -110,8 +110,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"A similarity query using Annoy is significantly faster then using the traditional brute force method\n",
-">**Note**: Initialization time for the annoy indexer was not included in the times. The optimal knn algorithm for you to use will depend on how many queries you need to make and the size of the corpus. If you are making very few similarity queries, the time taken to initialize the annoy indexer will be longer then the time it would take the brute force method to retrieve results. If you are making many queries however, the time it takes to initialize the annoy indexer will be made up for by the incredibly fast retrieval times for queries once the indexer has been initialized"
+"A similarity query using Annoy is significantly faster than using the traditional brute force method\n",
+">**Note**: Initialization time for the annoy indexer was not included in the times. The optimal knn algorithm for you to use will depend on how many queries you need to make and the size of the corpus. If you are making very few similarity queries, the time taken to initialize the annoy indexer will be longer than the time it would take the brute force method to retrieve results. If you are making many queries however, the time it takes to initialize the annoy indexer will be made up for by the incredibly fast retrieval times for queries once the indexer has been initialized"
]
},
{
@@ -125,9 +125,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"[Annoy](https://github.com/spotify/annoy \"Link to annoy repo\") is an open source library to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. For our purpose, it is used to find similarity between words or documents in a vector space. [See the tutorial on similarity queries for more information on them](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Similarity_Queries.ipynb).\n",
-"\n",
-"[There are benefits](https://github.com/spotify/annoy#background) of using annoy over the pre-existing method of making similarity queries through brute force in gensim."
+"Annoy is an open source library to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. For our purpose, it is used to find similarity between words or documents in a vector space. [See the tutorial on similarity queries for more information on them](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Similarity_Queries.ipynb)."
]
},
{
@@ -204,7 +202,7 @@
"metadata": {},
"source": [
"### Using the SimilarityIndex class\n",
-"An instance of `SimilarityIndex` needs to be created in order to use Annoy in gensim. The `SimilarityIndex` class is located in `gensim/similarities/index`\n",
+"An instance of `SimilarityIndex` needs to be created in order to use Annoy in gensim. The `SimilarityIndex` class is located in `gensim.similarities.index`\n",
"\n",
"Currently, there is only support for word2vec models and doc2vec models in gensim when it comes to using annoy for similarity queries. A word2vec model is being used in this tutorial, so `SimilarityIndex.build_from_word2vec()` is being called, but if you are using a doc2vec model `SimilarityIndex.build_from_doc2vec()` should be called.\n",
"\n",
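
The note touched in the second hunk describes a trade-off between a one-time index build and cheaper queries afterwards. That trade-off can be read as a break-even query count. The following is a minimal sketch under stated assumptions: `break_even_queries` is a hypothetical helper (not part of gensim or Annoy), and the timings are made up for illustration rather than taken from the notebook.

```python
import math


def break_even_queries(init_seconds, brute_seconds_per_query, indexed_seconds_per_query):
    """Smallest number of queries at which building an index pays off.

    Hypothetical helper for illustration only. Total cost with an index is
    init + n * indexed; without it, n * brute. Returns None when the
    indexed query is not faster per query, since the index never pays off.
    """
    saving = brute_seconds_per_query - indexed_seconds_per_query
    if saving <= 0:
        return None
    # init + n * indexed < n * brute  <=>  n > init / saving
    return math.floor(init_seconds / saving) + 1


# Made-up timings: 2 s to build the index, 10 ms per brute-force query,
# 1 ms per indexed query.
n = break_even_queries(
    init_seconds=2.0,
    brute_seconds_per_query=0.01,
    indexed_seconds_per_query=0.001,
)
print(n)
```

With real measurements substituted for the placeholder timings, a workload below the returned count would favour brute force, and one above it would favour building the index first.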