Reference key in search results isn't of the same type as the actual key. #117
Updated to add the actual reference instead of relying on the map's key.
I guess most people are using strings as keys for their documents. I'm actually a little surprised that this hasn't come up as an issue before. I'll merge this in for the 0.6.0 release.
Are there any plans for when 0.6.0 will be released? 😉
For anybody having the same problem, here is my workaround. Given

    var searchIndex = lunr(function() {
      this.field('text');
      this.ref('_id');
    });

replace

    searchIndex.add(doc);

with

    // Index the string form of the ObjectId instead of the ObjectId itself.
    doc._id = doc._id._str;
    searchIndex.add(doc);

and replace

    var searchResults = searchIndex.search(input);
    frobnicate(searchResults[0].ref);

with

    var searchResults = searchIndex.search(input);
    // Rebuild the ObjectId from the string ref that lunr returns.
    frobnicate(new ObjectId(searchResults[0].ref));

This lets lunr use the string representation of the ObjectIds and creates new ObjectIds when retrieving search results from lunr.
I've thought about this a bit more and I've changed my mind on how lunr should behave here. I think the right thing to do is to insist that document references are strings, and trying to add a document with a non-string ref should be an error. The reason for this is that in several places throughout the code base the reference is used as a property in an object being used as a key-value store. This means that the key must be a string (or at least convert nicely to a string), because Object keys are always strings. I think it is then more consistent with the way lunr wants to see the documents to be indexed. It assumes that by the time you do If you have any other ideas or suggestions then I'd like to hear them, and sorry it took so long to respond!
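To illustrate the point about Object keys, here is a minimal sketch in plain JavaScript. It does not use lunr at all; `store` and `objRef` are hypothetical names standing in for an internal key-value store and a non-string document reference.

```javascript
// Plain objects coerce every property key to a string.
const store = {};
store[42] = 'doc with numeric ref';

const keys = Object.keys(store);
console.log(keys[0]);        // "42"
console.log(typeof keys[0]); // "string"
console.log(keys[0] === 42); // false: the number did not survive the round trip

// Hypothetical object ref: it is stored under whatever toString() returns.
const objRef = { toString() { return 'abc123'; } };
store[objRef] = 'doc with object ref';
console.log(Object.keys(store)); // [ '42', 'abc123' ]
```

This is why a ref that goes in as a number or an object comes back out as a string whenever it has been used as a property name.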
I think that's a fair idea, but I do think that when the ref is a number, lunr should treat it as a string. And for getting back the ref in the proper type, is there any reason my commit would not be acceptable?
I tested your patch but the extra Lunr could continue as it is now by coercing the Enforcing a string is also more in line with the way lunr treats the document being indexed, the idea being that before indexing the document it must be in the right format for lunr, i.e. lunr doesn't do any kind of conversion or munging of the document.
Oh, okay. I actually thought my implementation would be faster since it's using a

    Object.keys(self.tokenStore.get(key)).forEach(function (ref) { set.add(ref) })

To this:

    Object.keys(self.tokenStore.get(key)).forEach(function (ref) { set.add(documents[ref].ref) })

Shouldn't be too much overhead on the CPU. I actually ended up doing a benchmark on it, and as it turns out, my method as referenced above (using the lookup) is somehow faster on my machine: http://jsperf.com/lookupvsnolookup

I found it a bit hard to believe, so I also set up a test environment on my machine with benchmark.js, using this source:

    import Benchmark from '/Users/kkirbatski/node_modules/benchmark';

    let obj, set = null;

    // Builds a 100-entry object keyed by index and a minimal stand-in for a set.
    function setup(){
      var obj = {};
      for(var i = 0; i < 100; i++){
        obj[i] = {
          ref: i
        };
      }
      var set = {
        items: [],
        add: function(item){
          this.items.push(item);
        }
      }
    }

    let suite = new Benchmark.Suite('abc');

    suite.add(
      // Adds the (string) key itself.
      'noLookup',
      function(){
        Object.keys(obj).forEach(function(ref){
          console.log(ref)
          set.add(ref);
        });
      },
      {setup}
    ).add(
      // Looks the entry up by key and adds its original ref.
      'lookup',
      function(){
        Object.keys(obj).forEach(function(ref){
          console.log(ref)
          set.add(obj[ref].ref);
        });
      },
      {setup}
    ).on('complete',function(){
      console.log('complete',this.filter('fastest').pluck('name'));
    }).run();

So if I'm understanding the results correctly, using my method above would give us the best of both worlds: retaining the reference key type and increasing the speed of lunr. That said, it's very possible that I did something wrong in my tests, so let me know your thoughts.
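Setting the timing question aside, the difference in what the two variants return can be sketched as follows. This is only an illustration with plain objects: `documents`, `tokenEntry`, and `collectRefs` are hypothetical stand-ins, not lunr's actual internals.

```javascript
// Hypothetical document store: keyed by ref, each entry keeps the original ref.
const documents = {
  1: { ref: 1 },
  2: { ref: 2 },
};

// Hypothetical token store entry: maps refs to per-document data.
const tokenEntry = {
  1: { tf: 1 },
  2: { tf: 3 },
};

// Collects refs either straight from the keys or via a lookup in `docs`.
function collectRefs(entry, docs, useLookup) {
  const refs = [];
  Object.keys(entry).forEach(function (ref) {
    refs.push(useLookup ? docs[ref].ref : ref);
  });
  return refs;
}

console.log(collectRefs(tokenEntry, documents, false)); // [ '1', '2' ]
console.log(collectRefs(tokenEntry, documents, true));  // [ 1, 2 ]
```

Without the lookup the refs come back as strings, because they are read from property names. With the lookup they come back in the type that was stored on the document.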
After taking another look at this, I don't know why I thought there was a performance issue. Anyway, I've pushed a fix for this in version 0.6.0 that preserves the type of the ref property in the results. The only caveat is that the ref type must be comparable; this is because of the way the ref is used in Anyway, let me know if you have any issues with this, and thanks again for the patch and the issue. Sorry it took so long to get a fix out for this!
No worries, thanks!
In the result array after searching, the ref key isn't of the same type as the actual data. Take this example:
The result is:
The ref key is of a string type, even though the actual key is a number.