[BUG] Unable to index document containing knn_vector using NLP ingest pipeline #613
Comments
Hi @whittssg, could you please provide a more complete code sample, including how you've created the ingest pipeline and how you're attempting to index the document? |
@Xtansia I followed the tutorial at https://opensearch.org/docs/latest/search-plugins/neural-search-tutorial/ exactly, names included. Manually indexing works via the dashboard dev tools. |
@whittssg If manually indexing via dev tools works then please share the code you're using to attempt to index the document in C#. |
I am just filling the field that is specified as the vector field in the index creation, nothing special:
And as I mentioned above, this is how I created the index in C#: ` indexes => indexes
|
In the tutorial the ingest pipeline is configured to take an input field named
So when you're indexing documents you need to send your string in the
|
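To illustrate the point above, here is a minimal sketch of the JSON shape a document might have before it reaches the ingest pipeline. It assumes the pipeline reads a `text` field and writes the embedding into `passage_embedding` (both names appear elsewhere in this thread). The `SourceDoc` type is hypothetical, and `System.Text.Json` is used only to show the shape; it is not how OpenSearch.Client serializes documents.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical document type, used only for this illustration.
public class SourceDoc
{
    [JsonPropertyName("id")]
    public string Id { get; set; }

    // Raw string that the ingest pipeline is assumed to read.
    [JsonPropertyName("text")]
    public string Text { get; set; }

    // Left null by the caller and omitted from the JSON;
    // assumed to be filled in by the ingest pipeline.
    [JsonPropertyName("passage_embedding")]
    [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public float[] PassageEmbedding { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var doc = new SourceDoc
        {
            Id = "1775029934.jpg",
            Text = "A wild animal races across an uncut field with a minimal amount of trees ."
        };

        // Prints a JSON object with only the "id" and "text" properties.
        Console.WriteLine(JsonSerializer.Serialize(doc));
    }
}
```

Putting a string into the vector field instead would be consistent with the "Vector dimension mismatch" error reported in this thread, since the field would not contain a 768-element vector.
|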
Oh OK, so do I need to change this (I changed the field to text): I changed it to text as you can see above, but now I get the same error: "illegal_argument_exception Reason: "Vector dimension mismatch. Expected: 768, Given: 0""} |
So it will index if I change that, but querying will give this error:
I think I am missing something obvious. |
Creating the index should read like this to match the tutorial, right? .KnnVector(kv => kv .Name(n => n.text).Dimension(768).Method(m => m.Engine("lucene").SpaceType("l2").Name("hnsw"))) |
The only way my code indexes is if I comment out this: I can see documents indexing once this is gone. Maybe I am just missing something stupid. Is there an example somewhere for this type of search? |
I think I should have been clearer: I followed everything up to the creation of the index in that tutorial (I went through it all and everything worked perfectly, but now I want to do it via C#). So instead of setting up the models etc. in C#, I started at the create-index step in C# (since the models etc. were created via the PUTs in the tutorial). Since I am creating the index in C#, I need to specify the KnnVector in the index creator and the field that is associated with it? Which I thought should be this:
Then I should just fill the text field and it should work, but nope. Thanks for your help, by the way. |
I haven't actually run this code yet, but the below should be roughly what's needed. I'm going to work on creating a full working sample.

Your document class would look something like:

    using OpenSearch.Client; // provides the PropertyName attribute

    public class NlpDoc
    {
        public NlpDoc()
        {
        }

        public NlpDoc(string id, string text)
        {
            Id = id;
            Text = text;
        }

        public string Id { get; set; }

        // Raw text; the ingest pipeline reads this field.
        public string Text { get; set; }

        // Left unset when indexing; the ingest pipeline fills it in.
        [PropertyName("passage_embedding")]
        public float[] PassageEmbedding { get; set; }
    }

Creating the index would look something like:

    var resp = await client.Indices.CreateAsync(
        indexName,
        i => i
            .Settings(s => s
                .Setting("index.knn", true)
                .DefaultPipeline(pipelineName))
            .Map<NlpDoc>(m => m
                .Properties(p => p
                    .Text(t => t.Name(d => d.Id))
                    .KnnVector(k => k
                        .Name(d => d.PassageEmbedding)
                        .Dimension(768)
                        .Method(km => km
                            .Engine("lucene")
                            .SpaceType("l2")
                            .Name("hnsw")))
                    .Text(t => t.Name(d => d.Text)))));

Indexing the documents would look like:

    var docs = new[]
    {
        new NlpDoc("4319130149.jpg", "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena ."),
        new NlpDoc("1775029934.jpg", "A wild animal races across an uncut field with a minimal amount of trees ."),
        new NlpDoc("2664027527.jpg", "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."),
        new NlpDoc("4427058951.jpg", "A man who is riding a wild horse in the rodeo is very near to falling off ."),
        new NlpDoc("2691147709.jpg", "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .")
    };
    var resp = await client.IndexManyAsync(docs, indexName);
|
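A bulk call like the one above can come back with per-item failures, so it may help to inspect the response. The following is a rough sketch, not a tested sample: it assumes the OpenSearch.Client NuGet package and that its bulk response exposes the NEST-derived `Errors`, `ItemsWithErrors`, `IsValid` and `DebugInformation` members. The `BulkIndexCheck` helper and its method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using OpenSearch.Client;

public static class BulkIndexCheck
{
    // Hypothetical helper: bulk-indexes the documents and prints any failures.
    public static async Task IndexAndReportAsync<T>(
        IOpenSearchClient client,
        IEnumerable<T> docs,
        string indexName) where T : class
    {
        var resp = await client.IndexManyAsync(docs, indexName);

        if (resp.Errors)
        {
            // One entry per document that the cluster rejected.
            foreach (var item in resp.ItemsWithErrors)
            {
                Console.WriteLine($"Failed to index {item.Id}: {item.Error?.Reason}");
            }
        }
        else if (!resp.IsValid)
        {
            // Request-level failure (for example, a connection problem).
            Console.WriteLine(resp.DebugInformation);
        }
    }
}
```

With the documents from the sample above, it could be called as `await BulkIndexCheck.IndexAndReportAsync(client, docs, indexName);`. If the assumption about the response members holds, an error such as the dimension mismatch reported in this thread would be printed for each affected document.
|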
I will give it a go tomorrow; a sample would be awesome. Thanks again. |
What is the bug?
Indexing a document with knnvector errors
How can one reproduce the bug?
?
What is the expected behavior?
It works
What is your host/environment?
Windows (just the latest stable release, downloaded and set up today)
I followed this tutorial and everything works perfectly, but when I try to index something via this library it errors:
https://opensearch.org/docs/latest/search-plugins/neural-search-tutorial/
This is the code I am using in C# to create the index:
The index is created without error, but indexing gives this error:
I am filling the passage_embedding with text.
Is there something else I need to specify in the .NET library for vector indexing?
Thanks