Add Elasticsearch settings #36

Merged 2 commits on Oct 17, 2016

orangejulius
Member

These settings were buried in pelias/schema and will be removed from there in another PR.

Note that both commits have nice big friendly commit messages with information about these settings. Because JSON doesn't have any support for comments, hopefully people will look at git blame when trying to find information on these settings, so I tried to include as much as possible.

Connects pelias/schema#178

These were buried in pelias/schema (https://github.com/pelias/schema/blob/f28002db187f1685abc3688b141e0bfdd5cdd01a/settings.js#L289-L296),
so by moving them here it's more obvious they can be overridden.

We use 1 shard as a default in development where scalability isn't
required.

Also, because we use the [dfs_query_then_fetch](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html#dfs-query-then-fetch)
search mode, having one shard means that queries run _without_ that
setting cannot return confusing results caused by per-shard TF/IDF
statistics.
The `index_concurrency` setting is set to 10 as an attempt to increase
indexing performance as well.

Reference:
https://github.com/pelias/api/blob/9ff383cc2b4a690fa05a88e70c598bfdc28751f4/controller/search.js#L44
https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-is-broken.html
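
As a rough sketch of what "can be overridden" might look like in practice, the snippet below merges a user-supplied override onto a defaults object. The shard count (1) and `index_concurrency` (10) come from the text above. The nesting under `index`, the `mergeSettings` helper, and the production shard count are assumptions for illustration, not the actual Pelias config loader.

```javascript
// Hypothetical defaults shaped like an Elasticsearch index settings object.
// Values are from the commit message; the nesting under `index` is assumed.
const defaults = {
  index: {
    number_of_shards: 1,
    index_concurrency: 10,
  },
};

// True for plain nested objects that should be merged key by key.
function isMergeable(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Recursively merge `overrides` onto `base` without mutating either one.
function mergeSettings(base, overrides) {
  const result = { ...base };
  for (const [key, value] of Object.entries(overrides)) {
    result[key] =
      isMergeable(value) && isMergeable(result[key])
        ? mergeSettings(result[key], value)
        : value;
  }
  return result;
}

// A hypothetical production override that raises the shard count.
const production = mergeSettings(defaults, { index: { number_of_shards: 5 } });
console.log(JSON.stringify(production, null, 2));
```

Only the overridden key changes. `index_concurrency` keeps its default, and the `defaults` object itself is left untouched.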

Connects pelias/schema#178

This default was buried in our private Chef configuration. Because
Pelias is generally run with a bulk indexing phase before any queries are
run, a higher refresh interval makes sense to improve indexing
performance.

Reference links:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html#bulk
https://sematext.com/blog/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/
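
For readers who want to adjust the refresh interval on an existing index rather than through defaults, the sketch below builds the request for the Elasticsearch update index settings endpoint (`PUT /<index>/_settings`). It only constructs the request description and sends nothing. The index name `pelias`, the `30s` value, and the `refreshIntervalRequest` helper are assumptions for illustration, since the commit message does not state the interval that was chosen.

```javascript
// Build the method, path, and body for an "update index settings" call.
// Nothing is sent over the network here.
function refreshIntervalRequest(indexName, interval) {
  return {
    method: 'PUT',
    path: `/${encodeURIComponent(indexName)}/_settings`,
    body: { index: { refresh_interval: interval } },
  };
}

// Hypothetical index name and intervals, chosen only for illustration.
// '1s' is the stock Elasticsearch default refresh interval.
const duringBulkImport = refreshIntervalRequest('pelias', '30s');
const afterImport = refreshIntervalRequest('pelias', '1s');

console.log(JSON.stringify(duringBulkImport));
console.log(JSON.stringify(afterImport));
```

The two request objects could be handed to whatever HTTP client is in use: one before a bulk import starts and one after it finishes.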

orangejulius merged commit 54c4c44 into master on Oct 17, 2016
orangejulius deleted the add-elasticsearch-settings branch on October 17, 2016 17:15