improve sorting when filtering datasets #2834
Nice, works well. I'm alright with deploying it for performance feedback.
? _.chain(filteredDataSource)
    .map(row => ({
      row,
      diceCoefficient: dice(row.name, this.props.searchQuery),
I'm not sure whether this will result in strange results if the user searches for a word in the description. The search results are filtered using the name and description properties, but the diceCoefficient only takes into account the dataset name for sorting.
Including the description in the diceCoefficient will probably slow it down further, right?
Actually I noticed the description is not even displayed in the advanced dataset view, so it's probably alright to remove "description" in line 66.
Searching for some word of the description doesn't work for me anyways, neither in the advanced nor on the spotlight dataset view, not sure why.
> Searching for some word of the description doesn't work for me anyways, neither in the advanced nor on the spotlight dataset view, not sure why.
Really? It works for me 🤔
> I'm not sure whether this will result in strange results if the user searches for a word in the description. The search results are filtered using the name and description properties, but the diceCoefficient only takes into account the dataset name for sorting.
My assumption is that the search is only used for dataset name matching anyway. In the rare case that a user searches the description, the ordering might be suboptimal (or "random"?), but I don't think that's a showstopper, as the user had to scroll through a long list before anyway. We can remove the description from the search parameters, but I'd leave it as is for now until someone complains. That way, we don't remove existing functionality from the view.
I assumed that the string "Original data and segmentation" is the description and that I could search for parts of it, but that string is only a default shown when there is no description. Setting a custom description and searching for it works as intended, all fine :)
Let's leave the description in there, your argumentation makes sense to me!
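To make the trade-off discussed in this thread concrete, here is a minimal sketch of filtering on name and description while scoring on the name only. Everything in it is an assumption for illustration: the substring filter, the `description` field shape, the sample datasets, and the compact set-based `dice` stand-in are not taken from the PR.

```javascript
// Stand-in Sørensen–Dice coefficient on sets of character bigrams.
// This is NOT the `dice` helper from the diff, only an illustrative substitute.
function bigramSet(text) {
  const grams = new Set();
  const s = text.toLowerCase();
  for (let i = 0; i < s.length - 1; i++) grams.add(s.slice(i, i + 2));
  return grams;
}

function dice(a, b) {
  const left = bigramSet(a);
  const right = bigramSet(b);
  if (left.size === 0 || right.size === 0) return 0;
  let shared = 0;
  for (const gram of left) if (right.has(gram)) shared++;
  return (2 * shared) / (left.size + right.size);
}

// Assumed filter: keep rows whose name OR description contains the query.
function filterDatasets(datasets, query) {
  const q = query.toLowerCase();
  return datasets.filter(
    d =>
      d.name.toLowerCase().includes(q) ||
      (d.description || "").toLowerCase().includes(q),
  );
}

// Scoring as in the diff (name only) versus a hypothetical variant that
// also looks at the description.
const scoreByName = (d, q) => dice(d.name, q);
const scoreByNameOrDescription = (d, q) =>
  Math.max(dice(d.name, q), dice(d.description || "", q));

// Hypothetical datasets: the first one matches the query only via its description.
const datasets = [
  { name: "retina_01", description: "cortex sample from lab A" },
  { name: "cortex_02", description: "" },
];

const hits = filterDatasets(datasets, "cortex");
for (const d of hits) {
  console.log(d.name, scoreByName(d, "cortex"), scoreByNameOrDescription(d, "cortex"));
}
```

A row that passes the filter only through its description gets a name-based score of 0 here, so it sorts after every name match. That matches the "suboptimal ordering" described above; the second scoring function roughly doubles the scoring work per row.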
I used the dice coefficient to sort the datasets by best match. This works quite well for the suffix example, which probably covers 95% of the cases where this is needed. I'm a bit concerned about performance, since the sorting resulted in a noticeable lag for 1000 datasets. However, my datasets all had almost the same name. So, if the search query reduces the number of datasets considerably, the lag should be reduced as well. In the end, the lab should just give final feedback (after deployment, since the dev server doesn't have many datasets). For the copy-and-paste use case it should definitely be enough.
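For readers unfamiliar with the approach, the following is a rough sketch of sorting by dice coefficient, not the code from this PR. The `dice` function is a hand-written stand-in (Sørensen–Dice on character bigrams) for whatever helper the diff imports, and `sortByBestMatch` and the sample dataset names are hypothetical.

```javascript
// Split a string into lowercase character bigrams, e.g. "abc" -> ["ab", "bc"].
function bigrams(text) {
  const normalized = text.toLowerCase();
  const result = [];
  for (let i = 0; i < normalized.length - 1; i++) {
    result.push(normalized.slice(i, i + 2));
  }
  return result;
}

// Stand-in Sørensen–Dice coefficient: 2 * |shared bigrams| / (|A| + |B|).
// Returns a value between 0 (no shared bigrams) and 1 (same bigram multiset).
function dice(a, b) {
  const left = bigrams(a);
  const right = bigrams(b);
  if (left.length === 0 || right.length === 0) return 0;

  // Count the bigrams of `right` so repeated bigrams are matched at most once each.
  const counts = new Map();
  for (const gram of right) counts.set(gram, (counts.get(gram) || 0) + 1);

  let overlap = 0;
  for (const gram of left) {
    const remaining = counts.get(gram) || 0;
    if (remaining > 0) {
      overlap++;
      counts.set(gram, remaining - 1);
    }
  }
  return (2 * overlap) / (left.length + right.length);
}

// Hypothetical helper: score every row by its name, then sort best match first.
function sortByBestMatch(datasets, searchQuery) {
  return datasets
    .map(row => ({ row, diceCoefficient: dice(row.name, searchQuery) }))
    .sort((a, b) => b.diceCoefficient - a.diceCoefficient)
    .map(({ row }) => row);
}

// Hypothetical dataset names for illustration only.
const exampleDatasets = [
  { name: "old_sample_backup" },
  { name: "sample" },
  { name: "sample_v2" },
];
console.log(sortByBestMatch(exampleDatasets, "sample_v2").map(d => d.name));
```

Each row is scored once per search, so the cost grows with the number of rows that survive the filter and with name length. That is consistent with the lag observed above when 1000 near-identical names all pass the filter.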