docs: add memory profile (#841)

* docs: add memory profile
* docs: add memory profile
* docs: remove reference and polish words
jina-ai · Oct 11, 2022 · 87fdc54
1 parent 7ee58c8
commit 87fdc54
Showing 1 changed file with 16 additions and 0 deletions.
diff --git a/docs/user-guides/retriever.md b/docs/user-guides/retriever.md
@@ -154,6 +154,22 @@ The results will look like this, the most relevant doc is "she smiled, with pain
 You can set the `limit` parameter (default is `10`) to control the number of the most similar documents to be retrieved.
 
 
+### Memory Estimation
+
+This section shows how to estimate the memory usage of the `AnnLite` indexer.
+The estimate helps you determine how much memory indexing and querying will require.
+
+In `AnnLite`, the memory usage is determined by the following two components:
+
+- `HNSW` indexer: N * 1.1 * (4 bytes * `dimension` + 8 bytes * `max_connection`), where N is the number of embedding vectors, `dimension` is the dimension of the embedding vectors, and `max_connection` is the maximum number of connections in the graph. 
+- `cell_table`: its memory usage grows almost linearly with the number of columns and the number of data points. With the default setting (no columns used for filtering), `cell_table` uses 0.12 GB per million data points.
+Columns used for filtering are stored as strings, so their memory usage depends on the length of the strings.
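+
+As a rough worked example of the `HNSW` formula above, assume hypothetical values of $N = 10^6$ vectors, `dimension` $= 128$, and `max_connection` $= 16$ (illustrative numbers only, not `AnnLite` defaults):
+
+$$
+10^6 \times 1.1 \times (4 \times 128 + 8 \times 16)\ \text{bytes} = 10^6 \times 1.1 \times 640\ \text{bytes} = 7.04 \times 10^8\ \text{bytes} \approx 0.70\ \text{GB}
+$$
+
+Adding the 0.12 GB per million data points that `cell_table` uses with the default setting, the estimated total for this hypothetical index would be about 0.82 GB (taking 1 GB as $10^9$ bytes).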
+
+```{Notice}
+If you use `AnnLiteIndexer` in your Jina Flow, the memory usage will be slightly higher because a `SQLite` table is kept in memory to support indexing in `DocumentArray`.
+```
+
+
 ## Support large-scale dataset
 
 When we want to index a large number of documents, for example, 100 million data or even 1 billion data,