[FEATURE]: Check if an index exists #1361

Closed
1 task done
filip-halt opened this issue Apr 14, 2023 · 10 comments

Comments

@filip-halt
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

At the moment there is no fast way to check whether an index exists on the vector field of a collection.

To check whether such an index exists, you first need to extract the embedding field name from the schema by iterating over all its fields, looking for a data type of FLOAT_VECTOR or BINARY_VECTOR. Then you must go through the indexes of the collection and check whether index.field_name matches the embedding field name.
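The two-step workaround described above could be sketched roughly as follows. This is only an illustration, not pymilvus code: it assumes the collection object exposes schema.fields (each with name and dtype) and indexes (each with field_name), and that dtype has a name attribute such as "FLOAT_VECTOR". The helper names vector_field_names, vector_indexed and fake_collection are hypothetical, and fake_collection is a stand-in so the sketch runs without a Milvus server.

```python
from types import SimpleNamespace

# Data type names the issue treats as "vector" fields.
VECTOR_DTYPE_NAMES = {"FLOAT_VECTOR", "BINARY_VECTOR"}


def vector_field_names(collection):
    """Collect the names of all schema fields whose dtype is a vector type."""
    return [
        field.name
        for field in collection.schema.fields
        if getattr(field.dtype, "name", None) in VECTOR_DTYPE_NAMES
    ]


def vector_indexed(collection):
    """Return True if at least one index is built on a vector field."""
    vector_fields = set(vector_field_names(collection))
    return any(index.field_name in vector_fields for index in collection.indexes)


def fake_collection(indexed_fields):
    """Hypothetical stand-in for a collection object, for illustration only."""
    fields = [
        SimpleNamespace(name="id", dtype=SimpleNamespace(name="INT64")),
        SimpleNamespace(name="embedding", dtype=SimpleNamespace(name="FLOAT_VECTOR")),
    ]
    return SimpleNamespace(
        schema=SimpleNamespace(fields=fields),
        indexes=[SimpleNamespace(field_name=name) for name in indexed_fields],
    )


# Only a scalar index: the check should report no vector index.
print(vector_indexed(fake_collection(["id"])))
# An index on the embedding field: the check should succeed.
print(vector_indexed(fake_collection(["id", "embedding"])))
```

With a real collection the same two functions would be called on the Collection instance directly, provided the attribute names assumed here match the installed pymilvus version.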

The other route is to call load() and catch the exception, but that seems a bit hacky.

Describe the solution you'd like

A collection.vector_indexed() method that returns True or False depending on whether an index has been built on the vector field.

Describe alternatives you've considered

No response

Anything else?

No response

@xiaofan-luan
Contributor

collection.has_index should be the API you are looking for

@filip-halt
Contributor Author

> collection.has_index should be the API you are looking for

Unfortunately, just having an attribute index causes it to respond with True. I think it grabs the first index it can find without checking whether it's on the embedding field.

@xiaofan-luan
Contributor

This one should check for the vector field, and I guess it only cares about the vector index unless you pass in an index name.
@czs007 correct me if I'm wrong

@filip-halt
Contributor Author

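from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    connections,
    utility,
)
from pymilvus.exceptions import MilvusException

DIMENSION = 128  # assumed value; the original snippet does not define it

connections.connect(host='localhost', port=19530)
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="string1", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=DIMENSION),
]
schema = CollectionSchema(fields=fields)
# Start from a clean collection.
if utility.has_collection("lol"):
    utility.drop_collection("lol")
collection = Collection("lol", schema, consistency_level="Strong")
# Vector index parameters; defined here but not used below.
index_params = {
    'metric_type': 'L2',
    'index_type': "HNSW",
    'params': {'M': 8, 'efConstruction': 64}
}
# Index only the scalar primary key; no index is built on "embedding".
collection.create_index('id')
print("has_index() call", collection.has_index())
try:
    collection.load()
except MilvusException as e:
    print(e.message)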

will print:

has_index() call True
RPC error: [load_collection], <MilvusException: (code=1, message=there is no vector index on collection: lol, please create index firstly)>, <Time:{'RPC start': '2023-04-19 18:02:20.819621', 'RPC error': '2023-04-19 18:02:20.831111'}>
there is no vector index on collection: lol, please create index firstly

@xiaofan-luan
Contributor

If you define the name of the index, shouldn't it be has_index('id')?
But we should support a list_index interface.

@filip-halt
Contributor Author

> If you define the name of the index, shouldn't it be has_index('id')?
> But we should support a list_index interface.

Unfortunately, that doesn't work either. The only argument has_index accepts is timeout, so the string is treated as the timeout value, which throws an exception.
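To illustrate why a positional string ends up as the timeout, here is a small stand-in. It is not the pymilvus implementation: the has_index function below is hypothetical and only mimics a callable whose sole parameter is timeout, as described in this comment. The real exception type and message may well differ.

```python
def has_index(timeout=None):
    """Hypothetical stand-in: mimics a method whose only parameter is timeout."""
    if timeout is not None and not isinstance(timeout, (int, float)):
        raise TypeError(f"timeout must be a number, got {type(timeout).__name__}")
    return True


try:
    # The string is bound to the first parameter, timeout, not to an index name.
    has_index("id")
except TypeError as exc:
    print(exc)
```

Python binds the first positional argument to the first parameter, so 'id' can never be read as an index name unless the method offers a separate parameter for it.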

@xiaofan-luan
Contributor

/assign @czs007

@xiaofan-luan
Contributor

how do we check the index of a collection?

@longjiquan
Contributor

Hello @filip-halt, this feature may be implemented in #1386. Feel free to give feedback if it doesn't satisfy the requirements.

@longjiquan
Contributor

/assign @filip-halt
