Performance of yz_solr:partition_list/1 [JIRA: RIAK-3221] #720

Open
Vorticity-Flux opened this issue Jan 28, 2017 · 1 comment

Vorticity-Flux commented Jan 28, 2017

The yz_solr:partition_list/1 function performs a Solr lookup using a facet search.

partition_list(Core) ->

For some reason, in our setup this Solr query was observed to take 10 seconds to complete. It is likely that the query is waiting for a commit to complete and/or for a new searcher to finish opening.
(Details about the observed poor Solr performance and its causes are given in issue #719.)

After consulting the Solr IRC channel, we established that facet.method=enum resolves this aspect of our performance problems. With this parameter, yz_solr:partition_list/1 always completes in under 10 ms (a 1000-fold speedup!).

For now we have modified solr_config.xml and set the default facet.method to enum. However, as far as I understand, this is not a reliable solution, as solr_config.xml may be overwritten in some circumstances.

I think it is worthwhile to do one of the following:
a) Add a way to pass facet.method=enum to the Solr partition list facet query. It seems to perform much faster than the default facet method.
b) Turn Solr docValues on for the _yz_pn field. This should make faceting on this field really fast in all cases. This could be added to the default schema.
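
As an illustration of option (a), here is a minimal Python sketch of how the facet query parameters might look with and without facet.method=enum. The helper name partition_list_params is hypothetical, and the parameter list (q=*:*, facet=true, facet.field=_yz_pn, wt=json) is an assumption based on standard Solr faceting parameters, not a copy of what yz_solr:partition_list/1 actually sends.

```python
from urllib.parse import urlencode


def partition_list_params(facet_method=None):
    """Build query parameters for a facet lookup on the _yz_pn field.

    Hypothetical helper: the real yz_solr:partition_list/1 is Erlang, and
    its exact parameter list is not shown in this issue.
    """
    params = [
        ("q", "*:*"),               # match all documents
        ("facet", "true"),          # enable faceting
        ("facet.field", "_yz_pn"),  # facet on the partition number field
        ("wt", "json"),             # assumed response format
    ]
    if facet_method is not None:
        # Override Solr's default facet method, e.g. with "enum".
        params.append(("facet.method", facet_method))
    return params


# Query string to append to the core's select handler URL.
query_string = urlencode(partition_list_params("enum"))
print(query_string)
```

Passing the method per request keeps the setting with the query itself, so it would not depend on the contents of the Solr config file.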

Basho-JIRA changed the title from "Performance of yz_solr:partition_list/1" to "Performance of yz_solr:partition_list/1 [JIRA: RIAK-3221]" on Jan 28, 2017
kesslerm pushed a commit to kesslerm/yokozuna that referenced this issue Feb 2, 2017
Use 'facet.method=enum' for much reduced query times, and avoid overhead
of returning unused actual query results and headers.

Fixes basho#720

kesslerm commented Feb 2, 2017

I was able to reproduce the speedup of the faceted query for yz_solr:partition_list/1. Switching to facet.method=enum yields consistently faster results.

Additional optimisations include not asking for actual query results (which may generate substantial amounts of unused data internally for big documents) and not returning query headers with the result. Every little helps, as they say.
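
A rough Python sketch of what these additional optimisations could look like on the query side. It assumes they map to the standard Solr parameters rows=0 (return no documents) and omitHeader=true (drop the response header); the referenced commit may implement them differently. The helper names and the sample response are made up for illustration.

```python
from urllib.parse import urlencode


def lean_facet_params(field="_yz_pn"):
    """Facet-only query: no documents and no response header requested.

    Hypothetical helper; rows=0 and omitHeader=true are assumed mappings
    of the optimisations described above.
    """
    return [
        ("q", "*:*"),
        ("rows", "0"),            # do not return any matching documents
        ("omitHeader", "true"),   # do not return the response header
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),
        ("wt", "json"),
    ]


def facet_values(response, field="_yz_pn"):
    """Extract facet values from Solr's default flat JSON facet list,
    which alternates value, count, value, count, ...
    """
    flat = response["facet_counts"]["facet_fields"][field]
    return flat[0::2]


# Made-up response fragment, for illustration only.
sample = {"facet_counts": {"facet_fields": {"_yz_pn": ["1", 10, "2", 7]}}}
print(urlencode(lean_facet_params()))
print(facet_values(sample))
```

Only the facet counts are needed to list partitions, so the document list and the header are overhead in this query.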
