

Preloads .vec and .vex files #2186

Closed · wants to merge 1 commit

Conversation


@shatejas (Collaborator) commented Oct 4, 2024

LuceneFlatVectorReader uses IOContext.Random to open the reader. IOContext.Random tells the kernel not to read pages ahead into physical memory, which increases merge time due to the additional read ops at runtime.

The preload setting signals the kernel to preload the files when the reader is opened.
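The two kernel hints in play can be sketched with Python's `mmap` module (a minimal Linux-only illustration of the underlying `madvise` calls, not the OpenSearch/Lucene code):

```python
import mmap
import os
import tempfile

# Minimal sketch (Linux, Python 3.8+) of the two hints described above.
# IOContext.Random corresponds to madvise(MADV_RANDOM): no readahead, so
# pages are faulted in on demand, inflating read ops during merge.
# The preload setting corresponds to madvise(MADV_WILLNEED): ask the
# kernel to fault the file's pages in up front.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, os.urandom(1 << 20))  # 1 MiB stand-in for a .vec file
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
        mm.madvise(mmap.MADV_RANDOM)    # hint behind IOContext.Random
        mm.madvise(mmap.MADV_WILLNEED)  # hint behind store.preload
        first_page = mm[:4096]
        mm.close()
finally:
    os.close(fd)
    os.remove(path)
```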

Description

Experiment setup

  • 3 nodes: 6 shards 1 replica
  • Dataset: Cohere-10m
  • indexing threads: 2

In the table below, the baseline is without preloading.

| Description | Variant | vCPU | Mem (GB) | Storage type | Total force merge time | Read ops | Time between merges | Index CPU % (max) | Merge CPU % (max) |
|---|---|---|---|---|---|---|---|---|---|
| Without quantization | Baseline | 16 | 128 | EBS | 5 hr 15 min | 115K | 10 min | 90 | 12 |
| | Preload | 16 | 128 | EBS | 4 hr 55 min | 60K | 4 min | 90 | 12 |
| 1-bit quantization | Baseline | 8 | 64 | EBS | 1 hr 35 min | 117K | 3 min | 45 | 23 |
| | Preload | 8 | 64 | EBS | 1 hr 24 min | 60K | 0 min | 40 | 23 |
| 1-bit quantization | Baseline | 8 | 64 | Instance | 1 hr 2 min | 253K | 0 min | 82 | 27 |
| | Preload | 8 | 64 | Instance | 58 min | 55K–70K | 0 min | 75 | 27 |
| 1-bit quantization | Baseline | 4 | 32 | Instance | 1 hr 7 min | 1M | 0 min | 99 | 50 |
| | Preload | 4 | 32 | Instance | 1 hr 17 min | 105K–145K | 0 min | 99 | 50 |

Observation

A decrease in read ops, along with a decrease in total force merge time, is seen in the experiments where data is preloaded and there is enough memory to hold it.

When memory is constrained, read ops increase instead. This is expected, since memory cannot hold all of the preloaded pages. In these cases the baseline outperforms preload in total force merge time.
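A back-of-envelope check makes the memory-constrained regression plausible. Assuming Cohere's 768-dimensional float32 embeddings (the exact dataset layout is not stated in this PR):

```python
# Rough footprint of the raw Cohere-10m vectors.
# Assumption: 768-dimensional float32 embeddings (Cohere embedding size).
num_vectors = 10_000_000
dim = 768
bytes_per_float = 4
raw_gib = num_vectors * dim * bytes_per_float / 2**30
print(f"~{raw_gib:.1f} GiB")  # ~28.6 GiB of raw vectors, before graph overhead
```

With 6 shards across 3 nodes, each 32 GB node holds roughly a third of that plus HNSW graph overhead, while a sizable share of RAM goes to the JVM heap; under those assumptions there is too little page cache left to keep preloaded pages resident, matching the higher read ops and slower force merge in the last row of the table.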

Testing

Scenario 1: No store.preload in the index settings (with this change, preload defaults to the .vec and .vex files, as the GET response below shows)

```json
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100,
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      "location": {
        "type": "knn_vector",
        "dimension": 2,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      }
    }
  }
}
```

Get index response

```json
{
  "hotels-index-faiss": {
    "aliases": {},
    "mappings": {
      "properties": {
        "location": {
          "type": "knn_vector",
          "dimension": 2,
          "method": {
            "engine": "faiss",
            "space_type": "l2",
            "name": "hnsw",
            "parameters": {
              "ef_construction": 100,
              "m": 16
            }
          }
        }
      }
    },
    "settings": {
      "index": {
        "replication": {
          "type": "DOCUMENT"
        },
        "number_of_shards": "1",
        "knn.algo_param": {
          "ef_search": "100"
        },
        "provided_name": "hotels-index-faiss",
        "knn": "true",
        "creation_date": "1728085803212",
        "store": {
          "preload": [
            "vec",
            "vex"
          ]
        },
        "number_of_replicas": "0",
        "uuid": "WawO8OR2S2WmvTr6K0gpRw",
        "version": {
          "created": "137217827"
        }
      }
    }
  }
}
```

Scenario 2: store.preload set explicitly (overrides the default)

```json
{
  "settings": {
    "index": {
      "store.preload": [ "dvd" ],
      "knn": true,
      "knn.algo_param.ef_search": 100,
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      "location": {
        "type": "knn_vector",
        "dimension": 2,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      }
    }
  }
}
```

Get index response

```json
{
  "hotels-index-faiss": {
    "aliases": {},
    "mappings": {
      "properties": {
        "location": {
          "type": "knn_vector",
          "dimension": 2,
          "method": {
            "engine": "faiss",
            "space_type": "l2",
            "name": "hnsw",
            "parameters": {
              "ef_construction": 100,
              "m": 16
            }
          }
        }
      }
    },
    "settings": {
      "index": {
        "replication": {
          "type": "DOCUMENT"
        },
        "number_of_shards": "1",
        "knn.algo_param": {
          "ef_search": "100"
        },
        "provided_name": "hotels-index-faiss",
        "knn": "true",
        "creation_date": "1728087079416",
        "store": {
          "preload": [
            "dvd"
          ]
        },
        "number_of_replicas": "0",
        "uuid": "g9TsdOXkTluokbuupwyQSA",
        "version": {
          "created": "137217827"
        }
      }
    }
  }
}
```

Related Issues

Resolves #2134

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Signed-off-by: Tejas Shah <[email protected]>
@shatejas (Collaborator, Author) commented:

Lucene search latencies were negatively impacted by this change. Closing it.

@shatejas shatejas closed this Oct 15, 2024
@shatejas shatejas deleted the preload-vec branch November 27, 2024 16:12
Successfully merging this pull request may close these issues.

[BUG] Regression in cohere-10m force merge latency after switching to NativeEngines990KnnVectorsWriter