Make limit on number of expanded fields configurable #34778
Pinging @elastic/es-search-aggs
@melissachang Are you actually running into this problem in a released version of Elasticsearch? I'm asking because I think the limit is only enforced starting with the as-yet-unreleased version 7.0 (see #26541); the warning might have been backported to previous versions of the docs by accident. It is a good warning, though, since the limit is going to be enforced from 7.0 onward to prevent accidental expansion of queries to all fields, which causes performance problems. So increasing it could cause performance issues, and at the moment it doesn't seem to be possible anyway. 6600 is a huge number of fields; it would be interesting to learn more about your use case and see why you need to query so many of them. Maybe this indicates a problem in your document design. Could you elaborate?
So we finally got the index working. Elasticsearch 6.2.2, 122k documents, 6624 fields. Some of them are text fields with extra keyword fields, so there are actually 13251 mappings.
This multi_match query worked beautifully:
I will explain my use case. But since things work with 13251 mappings, it seems like Elasticsearch shouldn't impose a hard limit? I can understand making the default limit 1024, but users should be able to increase that. I am using Elasticsearch to do faceted search on health-related datasets. You can see an example here. Nurses' Health Study is one of our datasets. The Nurses' Health data is a table with 6k columns and 120k rows. Each row is a participant. Each column represents a field, e.g. weight in a certain year, menopausal status in a certain year, etc. (Of course not every participant will have every field filled out.) In the index, each document corresponds to a participant. Each of the 6k table columns is a field on the document. We want to use
CC @bfcrampton
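The original query body wasn't captured in this thread, but a multi_match request of the kind described above might look like the following sketch. The index layout and the `"*"` wildcard field expansion are assumptions for illustration, not taken from the thread:

```python
# Illustrative only: a multi_match request body that expands across all fields.
# The search text and the "*" wildcard are assumptions; "*" expands to every
# queryable field in the mapping, which is what runs into the expanded-field limit
# once the index has thousands of fields.
query = {
    "query": {
        "multi_match": {
            "query": "menopausal status",
            "fields": ["*"],  # expand to all fields in the index
        }
    }
}
```

With 6624 fields (13251 mappings counting the extra `keyword` sub-fields), this expansion is well past the 1024 default limit discussed here.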
Great that it works in this particular case, but that still doesn't mean it works for other cases. We often see problems with that number of fields. But I agree that it probably makes sense to make this configurable.
This again makes sense if you think of your data as a database table, but it is problematic for an inverted index like Lucene, which started out being used for full-text search. In your particular case I think the limit comes into effect when searching across all fields. I would suggest copying all text fields relevant to your search into a dedicated "catch_all" field using the
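The suggestion above is truncated, but it most likely refers to Elasticsearch's `copy_to` mapping parameter, which copies values into a combined field at index time. A minimal sketch of that approach, with made-up field names:

```python
# Sketch of the "catch_all" approach, assuming the truncated suggestion refers
# to Elasticsearch's copy_to mapping parameter. Field names are hypothetical.
mappings = {
    "properties": {
        # The combined field that all searchable text is copied into.
        "catch_all": {"type": "text"},
        # Each source field copies its value into catch_all at index time.
        "weight_1990": {"type": "text", "copy_to": "catch_all"},
        "menopausal_status_1990": {"type": "text", "copy_to": "catch_all"},
    }
}

# Queries can then target the single combined field instead of expanding
# across thousands of individual fields:
search_body = {"query": {"match": {"catch_all": "menopausal status"}}}
```

This sidesteps the expanded-field limit entirely, since the query touches one field rather than thousands.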
@melissachang FYI, I changed the issue title to better reflect your request to make the hard limit introduced with #26541 configurable. I marked this for internal group discussion, but maybe @dakrone, who authored that change, can share his thoughts on this issue as well.
Conceptually, our data is a database. It's not like other use cases like logs ingestion, where the fields are arbitrary strings. Our columns are well-defined and meaningful. We are using Elasticsearch to perform full-text search over the database. It's performant, easy, and it works -- I don't see what the problem is. I experimented with
We discussed this issue internally and agreed this limit should be configurable. We also said it would make sense not to introduce a new setting to override this limit but instead use the
Great. Is there going to be an upper limit on
Its default is 1024 clauses (if we use it for field expansion as well, this will mean 1024 fields). It can be increased to whatever value you want, but then you basically use it at your own risk. Until recently we didn't properly document this, but I added docs about the setting in #34779.
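For reference, the setting being discussed is `indices.query.bool.max_clause_count`, configured in `elasticsearch.yml`. The value below is an arbitrary example, not a recommendation:

```yaml
# elasticsearch.yml — raising the maximum clause count.
# Use at your own risk: large values can cause performance problems,
# as noted earlier in this thread.
indices.query.bool.max_clause_count: 8192
```

A node restart is required for the change to take effect, since this is a static cluster setting.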
We introduced a hard limit of 1024 on the number of fields a query can be expanded to in elastic#26541. Instead of using a hard limit, we should make this configurable. This change removes the hard-limit check and uses the existing `max_clause_count` setting instead. Closes elastic#34778
I'd like to use multi_match to search over documents with 6600 fields. Is there any way the 1024 limit can be increased?