-
-
Notifications
You must be signed in to change notification settings - Fork 834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search discussion titles too #257
Comments
IMO Search should me more sensitive. Why It doesn't handle 3 signs long phrases? Like "php" or "403" for example? Even worse "php 5", and still not searching... |
Limitations of MySQL's FULLTEXT search engine. We're planning to make it possible to use more powerful drivers like ElasticSearch/Sphinx instead. See #509. |
ElasticSearch sounds good. Current solution - without comment. :) |
@franzliedke Any thoughts on this? How do you include titles in the search in FluxBB? |
If we want to use elasticsearch, the search implementation will need to be abstracted, which we have done at work. I can paste in some examples. |
We built an index upon post creation / editing that stored all words along with their appearances in all posts. Easy to adapt for topic titles... Search then simply looked for all search terms in the index, found all posts in which they appeared and assembled the resulting list. |
@kirkbushell We already have an abstraction, I'm not sure if it'll be sufficient... @franzliedke That's very interesting, I've never thought about doing that with MySQL. How does it scale though? What about posts with tens of thousands of words? And forums with millions of posts. Isn't it essentially implementing a less efficient version of MySQL's FULLTEXT engine? |
It does work on the fairly large forums we have (1M+ posts), I think. I believe the approach was originally copied from PhpBB. The bigger problem is the very limited functionality. Word-based stuff works well for Western languages, but even then matches need to be complete (i.e. "shoot" will not match "shot"). I'll experiment with that stuff at some point, I took a whole class on the topic, that should be worth something ;) |
Relevant: http://www.slideshare.net/billkarwin/practical-full-text-search-with-my-sql Thinking it might be best to go with MySQL ISAM FULLTEXT, mirroring the discussion title on the posts table. Then recommend Sphinx/ElasticSearch for improved results. |
i think index table is a better idea, I'm not sure if you want to search tags or not, but this way you can do that too, big community softwares use this method, with a good indexing it should be ok I guess |
please take a look at this: 'titlewords' => array(
'postid' => 'INT UNSIGNED NOT NULL',
'wordid' => 'INT UNSIGNED NOT NULL',
'KEY postid (postid)',
'KEY wordid (wordid)',
'CONSTRAINT ^titlewords_ibfk_1 FOREIGN KEY (postid) REFERENCES ^posts(postid) ON DELETE CASCADE',
'CONSTRAINT ^titlewords_ibfk_2 FOREIGN KEY (wordid) REFERENCES ^words(wordid)',
),
'contentwords' => array(
'postid' => 'INT UNSIGNED NOT NULL',
'wordid' => 'INT UNSIGNED NOT NULL',
'count' => 'TINYINT UNSIGNED NOT NULL', // how many times word appears in the post - anything over 255 can be ignored
'type' => "ENUM('Q', 'A', 'C', 'NOTE') NOT NULL", // the post's type (copied here for quick searching)
'questionid' => 'INT UNSIGNED NOT NULL', // the id of the post's antecedent parent (here for quick searching)
'KEY postid (postid)',
'KEY wordid (wordid)',
'CONSTRAINT ^contentwords_ibfk_1 FOREIGN KEY (postid) REFERENCES ^posts(postid) ON DELETE CASCADE',
'CONSTRAINT ^contentwords_ibfk_2 FOREIGN KEY (wordid) REFERENCES ^words(wordid)',
),
'tagwords' => array(
'postid' => 'INT UNSIGNED NOT NULL',
'wordid' => 'INT UNSIGNED NOT NULL',
'KEY postid (postid)',
'KEY wordid (wordid)',
'CONSTRAINT ^tagwords_ibfk_1 FOREIGN KEY (postid) REFERENCES ^posts(postid) ON DELETE CASCADE',
'CONSTRAINT ^tagwords_ibfk_2 FOREIGN KEY (wordid) REFERENCES ^words(wordid)',
), 3 different table for searching in titles, contents and tags, you can see the usage here |
Using a DB table with word_id/post_id won't scale to millions of posts (we tried) and it's very limited in functionality/flexibility. Besides; there are much better solutions out there nowadays which include weighing search terms, filtering stopwords, stemming, fuzzy search, distributed search, etc. The SQL route is fine if you don't want to use external dependencies and if you can't use MySQL FULLTEXT, but for other use cases I would strongly recommend against it. I would propose using MySQL FULLTEXT for small to medium sites and using something like Sphinx or Elasticsearch on big/demanding sites. But to do this properly the abstraction in: Is too limited. https://github.com/flarum/core/blob/master/src/Core/Search/AbstractSearch.php would be more suitable. But #1042 needs to be fixed first. It would be nice if the search interface would at least offer:
Nice to have:
|
What I want to say is this: https://github.com/teamtnt/tntsearch |
This was implemented in #1339 |
Currently fulltext search is only performed on post text, not discussion titles. This needs to be fixed.
What needs to be done:
The text was updated successfully, but these errors were encountered: