User and owner filters #237

Merged
merged 2 commits into from
Sep 20, 2023

silvioq (Contributor) commented Jan 10, 2023

Using the "keyword" sub-field makes the search key an exact match: Elasticsearch will not perform any transformations on it. This covers external user sources (LDAP), where the username may contain dashes or symbols that the engine could misinterpret.

Signed-off-by: Silvio [email protected]
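
The difference between an analyzed field and an exact "keyword" match can be illustrated with a small simulation. This is only a rough sketch: `approx_standard_analyzer` is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters, which approximates but does not reproduce Elasticsearch's standard analyzer. It also assumes the field is mapped as text with a keyword sub-field.

```python
import re
from typing import List


def approx_standard_analyzer(value: str) -> List[str]:
    """Hypothetical approximation: lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", value.lower())


def term_matches_text_field(stored: str, term: str) -> bool:
    # A term query is not analyzed: the term is compared as-is
    # against the tokens that were produced at index time.
    return term in approx_standard_analyzer(stored)


def term_matches_keyword_field(stored: str, term: str) -> bool:
    # A keyword field keeps the whole value as one unmodified token.
    return stored == term


for username in ("john-doe", "JDoe"):
    print(username,
          "text:", term_matches_text_field(username, username),
          "keyword:", term_matches_keyword_field(username, username))
```

Under these assumptions, a username with a dash or an uppercase letter no longer matches itself on the analyzed field, while the exact comparison still matches.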

vbier commented Sep 13, 2023

If only I had seen your pull request earlier. It took me a complete day to find out why the search did not work for me. This PR fixes: #300

Edit: I am wondering how this pull request can be open for more than half a year. Is Nextcloud not used in corporate environments with AD integration? This is a killer bug for us, as it effectively breaks fulltext search for all our users.

vbier left a comment

LGTM

silvioq (Contributor, Author) commented Sep 13, 2023

> If only I had seen your pull request earlier. It took me a complete day to find out why the search did not work for me. This PR fixes: #300
>
> Edit: I am wondering how this pull request can be open for more than half a year. Is Nextcloud not used in corporate environments with AD integration? This is a killer bug for us, as it effectively breaks fulltext search for all our users.

Hi. We keep a patch in our environment and reapply it after every upgrade.

vbier commented Sep 14, 2023

@silvioq, can you try to request a review? See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review for how to do that. A suitable reviewer might be R0Wi or ArtificialOwl. There has to be a way to get the changes merged.

silvioq requested a review from vbier September 14, 2023 11:29

vbier commented Sep 15, 2023

I have approved the changes, but that does not seem to help as I do not have the needed write permissions on the repository. That is why I mentioned R0Wi and ArtificialOwl as reviewers. Can you request a review from one of them?

silvioq (Contributor, Author) commented Sep 15, 2023

Sorry, only @vbier appears as an available reviewer for me. I can close this pull request and create it again.

vbier commented Sep 15, 2023

@R0Wi, @ArtificialOwl, sorry to bother you. Can you help to get the changes merged?

R0Wi (Member) commented Sep 16, 2023

@ArtificialOwl this seems to be closely related to #265 where you fixed the uppercase group filter by just lowercasing the input. I was working together with @Kdubs937 recently because he was facing a similar issue when trying to search for files from an external provider (source = files_external) where the users array contained usernames with uppercase letters.

My fix here was to also call strtolower() on owner and users (see this commit) before sending the query to ES and this also fixed the issue.

But to me, using users.keyword and owner.keyword seems to be the even cleaner approach. I'd only like to use the same strategy for users, owner, groups and circles and not mix it up. Btw: I tested this approach locally and it also works, so the expected files_external documents are showing up again properly.

@silvioq did you find some official ES docs where they describe the behaviour of putting the .keyword suffix into your query?

R0Wi (Member) commented Sep 16, 2023

I took the liberty to also add the .keyword suffix to both groups and circles. I also rebased this branch onto master and provided the appropriate test adjustments in nextcloud/fulltextsearch#771. You'll see that these tests will fail if you don't apply the patch of this PR.

Thanks @silvioq and @vbier for your work!

R0Wi requested a review from vbier September 16, 2023 15:36
R0Wi previously approved these changes Sep 16, 2023

silvioq (Contributor, Author) commented Sep 18, 2023

Thanks @R0Wi, I'll test the changes in our Nextcloud instance soon.

About the usage of the .keyword suffix, you can read more here:
https://www.elastic.co/guide/en/elasticsearch/reference/8.9/text.html#before-enabling-fielddata

XueSheng-GIT commented

I've applied this PR to NC 27.1.0 and Fulltextsearch_elasticsearch 27.0.2, but afterwards fulltext search does not work anymore and the test fails (see below). Applying nextcloud/fulltextsearch#771 does not solve this issue.

Any idea? Any additional patch required to get this working?

# sudo -u www-data php /var/www/nextcloud/occ fulltextsearch:test
 
.Testing your current setup:  
Creating mocked content provider. ok  
Testing mocked provider: get indexable documents. (2 items) ok  
Loading search platform. (Elasticsearch) ok  
Testing search platform. ok  
Locking process ok  
Removing test. ok  
Pausing 3 seconds 1 2 3 ok  
Initializing index mapping. ok  
Indexing generated documents. ok  
Pausing 3 seconds 1 2 3 ok  
Retreiving content from a big index (license). (size: 32386) ok  
Comparing document with source. ok  
Searching basic keywords:  
 - 'test' (result: 0, expected: ["simple"]) fail  
Error detected, unlocking process ok 
In Test.php line 675:

  Unexpected SearchResult: {"provider":{"id":"test_provider","name":"Test Provider"},"platform":{"id":"elastic_search","name":"Elasticsearch"},"documents":[],"info":[],"meta":{"timedOut":false,"time":3,"count":0,"total":0,"maxScore":0}}

R0Wi dismissed their stale review September 19, 2023 05:27

Set onhold

R0Wi (Member) commented Sep 19, 2023

Thanks @XueSheng-GIT for your feedback! I tested with a clean install of Nextcloud 28 (current master) together with Elasticsearch 8.6.1 and this worked for me. To get additional information out of the logs, you might want to apply this patch and lower the server loglevel to 0. Then, please rerun your test via occ-command. Hopefully the server log file should contain additional information afterwards which you could share with us 👍

XueSheng-GIT commented

@R0Wi Thanks for providing some guidance!
I'm using Elasticsearch 8.10 on Ubuntu 22.04.

Log for the error shown above, #237 (comment):

{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"Host\":[\"localhost:9200\"],\"Accept\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"Content-Type\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"User-Agent\":[\"elasticsearch-php\\/8.6.1 (Linux 6.2.16-6-pve; PHP 8.1.2-1ubuntu2.14)\"],\"x-elastic-client-meta\":[\"es=8.6.1,php=8.1.2,t=8.7.0,a=0,gu=7.7.0\"]}\nBody: {\"query\":{\"bool\":{\"must\":{\"bool\":{\"should\":[{\"match_phrase_prefix\":{\"content\":\"test\"}},{\"match_phrase_prefix\":{\"title\":\"test\"}}]}},\"filter\":[{\"bool\":{\"must\":{\"term\":{\"provider\":\"test_provider\"}}}},{\"bool\":{\"should\":[{\"term\":{\"owner.keyword\":\"user1\"}},{\"term\":{\"users.keyword\":\"user1\"}},{\"term\":{\"users.keyword\":\"__all\"}}]}},{\"bool\":{\"should\":[]}},{\"bool\":{\"must\":[]}},{\"bool\":{\"must\":[]}}]}},\"highlight\":{\"fields\":{\"content\":{}},\"pre_tags\":[\"\"],\"post_tags\":[\"\"]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":1,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response (retry 0): 200","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","response":"{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":{\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]},\"GuzzleHttp\\Psr7\\ResponseheaderNames\":{\"x-elastic-product\":\"X-elastic-product\",\"content-type\":\"content-type\",\"transfer-encoding\":\"Transfer-Encoding\"},\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":{\"[object] (GuzzleHttp\\Psr7\\Stream)\":{\"GuzzleHttp\\Psr7\\Streamstream\":\"[resource] Resource id #1895\",\"GuzzleHttp\\Psr7\\Streamsize\":null,\"GuzzleHttp\\Psr7\\Streamseekable\":true,\"GuzzleHttp\\Psr7\\Streamreadable\":true,\"GuzzleHttp\\Psr7\\Streamwritable\":true,\"GuzzleHttp\\Psr7\\Streamuri\":\"php://temp\",\"GuzzleHttp\\Psr7\\StreamcustomMetadata\":[]}}}}","retry":"0"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application\\/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]}\nBody: {\"took\":1,\"timed_out\":false,\"_shards\":{\"total\":1,\"successful\":1,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":{\"value\":0,\"relation\":\"eq\"},\"max_score\":null,\"hits\":[]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":1,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response time in 0.006 sec","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"result from ES","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","result":"{\"[object] (Elastic\\Elasticsearch\\Response\\Elasticsearch)\":{\"*response\":{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":[],\"GuzzleHttp\\Psr7\\ResponseheaderNames\":[],\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":\"[object] (GuzzleHttp\\Psr7\\Stream)\"}}}}"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Search Result","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","searchResult":"{\"[object] (OCA\\FullTextSearch\\Model\\SearchResult)\":{\"OCA\\FullTextSearch\\Model\\SearchResultdocuments\":[],\"OCA\\FullTextSearch\\Model\\SearchResultrawResult\":\"{\\\"took\\\":1,\\\"timed_out\\\":false,\\\"_shards\\\":{\\\"total\\\":1,\\\"successful\\\":1,\\\"skipped\\\":0,\\\"failed\\\":0},\\\"hits\\\":{\\\"total\\\":{\\\"value\\\":0,\\\"relation\\\":\\\"eq\\\"},\\\"max_score\\\":null,\\\"hits\\\":[]}}\",\"OCA\\FullTextSearch\\Model\\SearchResultprovider\":{\"[object] (OCA\\FullTextSearch\\Provider\\TestProvider)\":{\"OCA\\FullTextSearch\\Provider\\TestProviderconfigService\":\"[object] (OCA\\FullTextSearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidertestService\":\"[object] (OCA\\FullTextSearch\\Service\\TestService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidermiscService\":\"[object] (OCA\\FullTextSearch\\Service\\MiscService)\",\"OCA\\FullTextSearch\\Provider\\TestProviderrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch\\Provider\\TestProviderindexOptions\":\"[object] (OCA\\FullTextSearch\\Model\\IndexOptions)\"}},\"OCA\\FullTextSearch\\Model\\SearchResultplatform\":{\"[object] (OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform)\":{\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformclient\":\"[object] (Elastic\\Elasticsearch\\Client)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformconfigService\":\"[object] 
(OCA\\FullTextSearch_Elasticsearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformindexService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\IndexService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformsearchService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\SearchService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformlogger\":\"[object] (OC\\AppFramework\\ScopedPsrLogger)\"}},\"OCA\\FullTextSearch\\Model\\SearchResulttotal\":0,\"OCA\\FullTextSearch\\Model\\SearchResultmaxScore\":0,\"OCA\\FullTextSearch\\Model\\SearchResulttime\":1,\"OCA\\FullTextSearch\\Model\\SearchResulttimedOut\":false,\"OCA\\FullTextSearch\\Model\\SearchResultrequest\":{\"[object] (OCA\\FullTextSearch\\Model\\SearchRequest)\":{\"OCA\\FullTextSearch\\Model\\SearchRequestproviders\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsearch\":\"test\",\"OCA\\FullTextSearch\\Model\\SearchRequestemptySearch\":false,\"OCA\\FullTextSearch\\Model\\SearchRequestpage\":1,\"OCA\\FullTextSearch\\Model\\SearchRequestsize\":10,\"OCA\\FullTextSearch\\Model\\SearchRequestauthor\":\"\",\"OCA\\FullTextSearch\\Model\\SearchRequesttags\":[],\"metaTags\":[],\"subTags\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestoptions\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestparts\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestfields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestlimitFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestregexFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsimpleQueries\":[]}}}}"}}

Same without this PR (no error, all tests pass):

{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"Host\":[\"localhost:9200\"],\"Accept\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"Content-Type\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"User-Agent\":[\"elasticsearch-php\\/8.6.1 (Linux 6.2.16-6-pve; PHP 8.1.2-1ubuntu2.14)\"],\"x-elastic-client-meta\":[\"es=8.6.1,php=8.1.2,t=8.7.0,a=0,gu=7.7.0\"]}\nBody: {\"query\":{\"bool\":{\"must\":{\"bool\":{\"should\":[{\"match_phrase_prefix\":{\"content\":\"test\"}},{\"match_phrase_prefix\":{\"title\":\"test\"}}]}},\"filter\":[{\"bool\":{\"must\":{\"term\":{\"provider\":\"test_provider\"}}}},{\"bool\":{\"should\":[{\"term\":{\"owner\":\"user1\"}},{\"term\":{\"users\":\"user1\"}},{\"term\":{\"users\":\"__all\"}}]}},{\"bool\":{\"should\":[]}},{\"bool\":{\"must\":[]}},{\"bool\":{\"must\":[]}}]}},\"highlight\":{\"fields\":{\"content\":{}},\"pre_tags\":[\"\"],\"post_tags\":[\"\"]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":1,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response (retry 0): 200","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","response":"{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":{\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]},\"GuzzleHttp\\Psr7\\ResponseheaderNames\":{\"x-elastic-product\":\"X-elastic-product\",\"content-type\":\"content-type\",\"transfer-encoding\":\"Transfer-Encoding\"},\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":{\"[object] (GuzzleHttp\\Psr7\\Stream)\":{\"GuzzleHttp\\Psr7\\Streamstream\":\"[resource] Resource id #1895\",\"GuzzleHttp\\Psr7\\Streamsize\":null,\"GuzzleHttp\\Psr7\\Streamseekable\":true,\"GuzzleHttp\\Psr7\\Streamreadable\":true,\"GuzzleHttp\\Psr7\\Streamwritable\":true,\"GuzzleHttp\\Psr7\\Streamuri\":\"php://temp\",\"GuzzleHttp\\Psr7\\StreamcustomMetadata\":[]}}}}","retry":"0"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application\\/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]}\nBody: {\"took\":7,\"timed_out\":false,\"_shards\":{\"total\":1,\"successful\":1,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":{\"value\":1,\"relation\":\"eq\"},\"max_score\":7.7150726,\"hits\":[{\"_index\":\"nextcloud\",\"_id\":\"test_provider:simple\",\"_score\":7.7150726,\"_source\":{\"owner\":\"user1\",\"users\":[],\"groups\":[],\"circles\":[],\"links\":[],\"metatags\":[],\"subtags\":[],\"tags\":[],\"hash\":\"dc5617141771b9472dcc0739960bf07a\",\"provider\":\"test_provider\",\"source\":\"\",\"title\":\"\",\"parts\":[]},\"highlight\":{\"content\":[\"testing document is a simple test\"]}}]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":1,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response time in 0.020 sec","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"result from ES","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","result":"{\"[object] (Elastic\\Elasticsearch\\Response\\Elasticsearch)\":{\"*response\":{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":[],\"GuzzleHttp\\Psr7\\ResponseheaderNames\":[],\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":\"[object] (GuzzleHttp\\Psr7\\Stream)\"}}}}"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Search Result","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","searchResult":"{\"[object] (OCA\\FullTextSearch\\Model\\SearchResult)\":{\"OCA\\FullTextSearch\\Model\\SearchResultdocuments\":[{\"[object] (OC\\FullTextSearch\\Model\\IndexDocument)\":[]}],\"OCA\\FullTextSearch\\Model\\SearchResultrawResult\":\"{\\\"took\\\":7,\\\"timed_out\\\":false,\\\"_shards\\\":{\\\"total\\\":1,\\\"successful\\\":1,\\\"skipped\\\":0,\\\"failed\\\":0},\\\"hits\\\":{\\\"total\\\":{\\\"value\\\":1,\\\"relation\\\":\\\"eq\\\"},\\\"max_score\\\":7.7150726,\\\"hits\\\":[{\\\"_index\\\":\\\"nextcloud\\\",\\\"_id\\\":\\\"test_provider:simple\\\",\\\"_score\\\":7.7150726,\\\"_source\\\":{\\\"owner\\\":\\\"user1\\\",\\\"users\\\":[],\\\"groups\\\":[],\\\"circles\\\":[],\\\"links\\\":[],\\\"metatags\\\":[],\\\"subtags\\\":[],\\\"tags\\\":[],\\\"hash\\\":\\\"dc5617141771b9472dcc0739960bf07a\\\",\\\"provider\\\":\\\"test_provider\\\",\\\"source\\\":\\\"\\\",\\\"title\\\":\\\"\\\",\\\"parts\\\":[]},\\\"highlight\\\":{\\\"content\\\":[\\\"testing document is a simple test\\\"]}}]}}\",\"OCA\\FullTextSearch\\Model\\SearchResultprovider\":{\"[object] (OCA\\FullTextSearch\\Provider\\TestProvider)\":{\"OCA\\FullTextSearch\\Provider\\TestProviderconfigService\":\"[object] (OCA\\FullTextSearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidertestService\":\"[object] (OCA\\FullTextSearch\\Service\\TestService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidermiscService\":\"[object] (OCA\\FullTextSearch\\Service\\MiscService)\",\"OCA\\FullTextSearch\\Provider\\TestProviderrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch\\Provider\\TestProviderindexOptions\":\"[object] 
(OCA\\FullTextSearch\\Model\\IndexOptions)\"}},\"OCA\\FullTextSearch\\Model\\SearchResultplatform\":{\"[object] (OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform)\":{\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformclient\":\"[object] (Elastic\\Elasticsearch\\Client)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformconfigService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformindexService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\IndexService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformsearchService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\SearchService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformlogger\":\"[object] (OC\\AppFramework\\ScopedPsrLogger)\"}},\"OCA\\FullTextSearch\\Model\\SearchResulttotal\":1,\"OCA\\FullTextSearch\\Model\\SearchResultmaxScore\":7,\"OCA\\FullTextSearch\\Model\\SearchResulttime\":7,\"OCA\\FullTextSearch\\Model\\SearchResulttimedOut\":false,\"OCA\\FullTextSearch\\Model\\SearchResultrequest\":{\"[object] 
(OCA\\FullTextSearch\\Model\\SearchRequest)\":{\"OCA\\FullTextSearch\\Model\\SearchRequestproviders\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsearch\":\"test\",\"OCA\\FullTextSearch\\Model\\SearchRequestemptySearch\":false,\"OCA\\FullTextSearch\\Model\\SearchRequestpage\":1,\"OCA\\FullTextSearch\\Model\\SearchRequestsize\":10,\"OCA\\FullTextSearch\\Model\\SearchRequestauthor\":\"\",\"OCA\\FullTextSearch\\Model\\SearchRequesttags\":[],\"metaTags\":[],\"subTags\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestoptions\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestparts\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestfields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestlimitFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestregexFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsimpleQueries\":[]}}}}"}}

All I can see is that there are no hits when this PR is applied. Does the query look as expected?
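
To compare the two requests without reading the raw log lines, the Body section of each logged message can be pulled out and inspected. The following is a minimal sketch, assuming each log line is one JSON object whose message field uses the "Headers: ...\nBody: ..." layout seen in the logs above. `extract_body` and `term_fields` are hypothetical helpers, not part of Nextcloud, and the sample line is a shortened stand-in for a real log entry.

```python
import json


def extract_body(log_line):
    """Return the JSON document that follows 'Body: ' in a log entry's message."""
    entry = json.loads(log_line)
    _, marker, body = entry["message"].partition("\nBody: ")
    if not marker:
        raise ValueError("log message has no Body section")
    return json.loads(body)


def term_fields(node):
    """Collect the field names used by every term clause in a query, in order."""
    fields = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "term" and isinstance(value, dict):
                fields.extend(value)
            else:
                fields.extend(term_fields(value))
    elif isinstance(node, list):
        for item in node:
            fields.extend(term_fields(item))
    return fields


# Shortened stand-in for one logged request (assumption: same layout as the real log).
query = {"query": {"bool": {"filter": [{"bool": {"should": [
    {"term": {"owner.keyword": "user1"}},
    {"term": {"users.keyword": "user1"}},
]}}]}}}
sample_line = json.dumps({"level": 0, "message": "Headers: {}\nBody: " + json.dumps(query)})

print(term_fields(extract_body(sample_line)))
```

Running the same helpers over the request line from each run would list the fields each filter targets, which makes the difference between the two queries easy to spot.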

R0Wi (Member) commented Sep 19, 2023

Yes, both queries look exactly the same except for the filter part, which has changed from

{
    "bool": {
        "should": [
            {
                "term": {
                    "owner": "user1"
                }
            },
            {
                "term": {
                    "users": "user1"
                }
            },
            {
                "term": {
                    "users": "__all"
                }
            }
        ]
    }
}

to

{
    "bool": {
        "should": [
            {
                "term": {
                    "owner.keyword": "user1"
                }
            },
            {
                "term": {
                    "users.keyword": "user1"
                }
            },
            {
                "term": {
                    "users.keyword": "__all"
                }
            }
        ]
    }
}

which is expected. @XueSheng-GIT would you mind sharing the ES index _mapping info? You can get it by running curl http://localhost:9200/<index_name>/_mapping?pretty against your index. I have seen earlier problems that came from old document metadata being stored in the ES index.
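
When reading a mapping, it can help to check mechanically whether a dotted name such as owner.keyword resolves to a mapped field or sub-field at all, since a term query on a field that is not mapped matches no documents. The following is a minimal sketch, assuming the mapping's properties are already loaded as a dict. `field_exists` is a hypothetical helper, the sample mapping is illustrative, and field names that themselves contain dots are not handled.

```python
def field_exists(properties, path):
    """Return True if the dotted path resolves to a property or a multi-field."""
    node = {"properties": properties}
    for part in path.split("."):
        # A path segment may name an object property or a multi-field ("fields").
        children = {}
        children.update(node.get("properties", {}))
        children.update(node.get("fields", {}))
        if part not in children:
            return False
        node = children[part]
    return True


# Illustrative mapping (assumption): one field mapped directly as keyword,
# one text field that carries a keyword sub-field.
sample_properties = {
    "owner": {"type": "keyword"},
    "attachment": {
        "properties": {
            "author": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        }
    },
}

for name in ("owner", "owner.keyword", "attachment.author.keyword"):
    print(name, field_exists(sample_properties, name))
```

If a field is mapped directly as keyword, with no fields entry of its own, the .keyword name would not resolve, which would be one possible explanation for a filter that returns no hits.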

XueSheng-GIT commented Sep 19, 2023

This is the _mapping info:

{
  "nextcloud" : {
    "mappings" : {
      "dynamic" : "true",
      "properties" : {
        "attachment" : {
          "properties" : {
            "author" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "content_length" : {
              "type" : "long"
            },
            "content_type" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "creator_tool" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "date" : {
              "type" : "date"
            },
            "format" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "keywords" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "language" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "modified" : {
              "type" : "date"
            },
            "title" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "circles" : {
          "type" : "keyword"
        },
        "combined" : {
          "type" : "text",
          "term_vector" : "with_positions_offsets",
          "analyzer" : "analyzer"
        },
        "content" : {
          "type" : "text",
          "copy_to" : [
            "combined"
          ],
          "term_vector" : "with_positions_offsets",
          "analyzer" : "analyzer"
        },
        "groups" : {
          "type" : "keyword"
        },
        "hash" : {
          "type" : "keyword"
        },
        "links" : {
          "type" : "keyword"
        },
        "metatags" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        },
        "parts" : {
          "properties" : {
            "comments" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "ocr" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "provider" : {
          "type" : "keyword"
        },
        "share_names" : {
          "properties" : {
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "source" : {
          "type" : "keyword"
        },
        "subtags" : {
          "type" : "keyword"
        },
        "tags" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "text",
          "copy_to" : [
            "combined"
          ],
          "term_vector" : "with_positions_offsets",
          "analyzer" : "keyword"
        },
        "users" : {
          "type" : "keyword"
        }
      }
    }
  }

@R0Wi
Member

R0Wi commented Sep 19, 2023

Thanks. This looks entirely correct to me. I will try to do some tests with the latest ES 8.10, since I'm using 8.6.1. I don't think this should make any difference but let's see ...

@vbier

vbier commented Sep 19, 2023

This does not look like my mapping. The respective fields are of type keyword, whereas my fields are of type text and have the keyword subfield.

        "users" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
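
To make the difference concrete, here is a minimal sketch of how a client could pick the right field for an exact match under either mapping. It is only an illustration, not the code from this PR: the helper names `exact_match_field` and `term_filter` are made up, and the `properties` dicts are assumed to have the shape of the mappings quoted in this thread.

```python
from typing import Any, Dict


def exact_match_field(properties: Dict[str, Any], field: str) -> str:
    """Return the field path that holds the un-analyzed (keyword) value.

    Hypothetical helper: `properties` is the "properties" object of an
    index mapping, as printed by the mapping dumps in this thread.
    """
    spec = properties[field]
    # Case 1: the field itself is mapped as keyword.
    if spec.get("type") == "keyword":
        return field
    # Case 2: a text field with a keyword multi-field (e.g. "users.keyword").
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return f"{field}.{sub_name}"
    raise ValueError(f"no keyword variant mapped for field {field!r}")


def term_filter(properties: Dict[str, Any], field: str, value: str) -> Dict[str, Any]:
    """Build a term query clause that matches `value` exactly."""
    return {"term": {exact_match_field(properties, field): value}}


# The two mapping shapes discussed above.
keyword_only = {"users": {"type": "keyword"}}
text_with_subfield = {
    "users": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}

# "john-doe" is a made-up username containing a dash.
print(term_filter(keyword_only, "users", "john-doe"))
print(term_filter(text_with_subfield, "users", "john-doe"))
```

With the first mapping the clause targets `users`, with the second it targets `users.keyword`, so a filter written against one mapping does not automatically fit the other.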

@R0Wi
Member

R0Wi commented Sep 19, 2023

Good spot @vbier ! Indeed mine looks the same and for example users is a combination of keyword and text. Same goes for owner and all other fields discussed here. @XueSheng-GIT how big is your index? Would it be possible to rebuild it from scratch?

@ArtificialOwl
Member

/backport to stable27

@ArtificialOwl
Member

/backport to stable26

@ArtificialOwl merged commit 4de9413 into nextcloud:master Sep 20, 2023
1 check passed
@ArtificialOwl
Member

Nice work and, well, thanks for your patience :-]

@XueSheng-GIT

XueSheng-GIT commented Sep 20, 2023

> Good spot @vbier ! Indeed mine looks the same and for example users is a combination of keyword and text. Same goes for owner and all other fields discussed here. @XueSheng-GIT how big is your index? Would it be possible to rebuild it from scratch?

I started a rebuild of the index (stop, delete, reset, index). Once it had finished, it started rebuilding all over again. This could be related to nextcloud/fulltextsearch#767 and nextcloud/fulltextsearch#723. I'll add my comments over there and open a new issue if required.

Thanks for your help.

@XueSheng-GIT

I was now able to run a "quick" index with Tesseract disabled. The mapping now looks like the one shown above (#237 (comment)) and the test runs without issues.
Thanks @vbier and @R0Wi for your help.
