Synonym list + classification terms #585

Merged
merged 7 commits into from
Jan 3, 2022

lonvia
Collaborator

@lonvia lonvia commented Jun 11, 2021

This PR extends #581 and adds classification terms. Classification terms enable searches where a descriptive term for the place is given in addition to its name ('London airport', 'Louvre museum', ...). Like synonyms, classification terms are restricted to single-word terms because of a limitation in the synonym module of ES5. See docs/synonyms.md for more information on the new format of the synonym file.
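As an illustration of the single-word restriction, a list of candidate terms could be screened before it is written to a synonym file. This is only a minimal sketch: the function name and the plain-list input are hypothetical and are not taken from this PR or from docs/synonyms.md.

```python
def split_single_word_terms(terms):
    """Separate single-word terms from everything else.

    Hypothetical helper for illustration only; it is not part of this PR.
    """
    accepted, rejected = [], []
    for term in terms:
        words = term.split()
        # Only terms consisting of exactly one word are usable as
        # synonyms or classification terms under the ES5 restriction.
        if len(words) == 1:
            accepted.append(term.strip())
        else:
            rejected.append(term.strip())
    return accepted, rejected


accepted, rejected = split_single_word_terms(["airport", "train station", " museum "])
print(accepted)
print(rejected)
```

Rejected terms such as 'train station' would have to be dropped or replaced by a single-word alternative.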

Classification terms work by adding a cryptic term that is derived from the OSM key and value and usually does not interfere with any other search terms. The cryptic term is added to the collector, and synonyms are then defined from the actual classification terms to the cryptic term. This means that we can stay with a single query and let ES decide whether a term should be interpreted as a classification term or as part of a name. It also means that classification terms can be changed on an existing database, because they are just query-time synonyms.
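The mechanism can be pictured with a toy model in plain Python. This is a sketch under stated assumptions, not the implementation in this PR: the token format (`xclsx...xvalx...`), the function names and the in-memory matching are all hypothetical, and real matching is done by ES analyzers rather than set intersection.

```python
import re


def classification_token(osm_key, osm_value):
    """Build an opaque token from an OSM key/value pair (hypothetical format)."""
    def clean(text):
        return re.sub(r"[^a-z0-9]", "", text.lower())
    return "xclsx" + clean(osm_key) + "xvalx" + clean(osm_value)


# Hypothetical query-time synonym table: human-readable term -> opaque token.
QUERY_SYNONYMS = {
    "airport": classification_token("aeroway", "aerodrome"),
    "museum": classification_token("tourism", "museum"),
}


def index_terms(name, osm_key, osm_value):
    """Terms stored for a place: its name words plus the opaque token."""
    return set(name.lower().split()) | {classification_token(osm_key, osm_value)}


def expand_query(query):
    """For each query word, keep the word itself and add its synonym, if any."""
    expanded = []
    for word in query.lower().split():
        alternatives = {word}
        if word in QUERY_SYNONYMS:
            alternatives.add(QUERY_SYNONYMS[word])
        expanded.append(alternatives)
    return expanded


def matches(query, doc_terms):
    """A place matches if every query word hits either a name word or the token."""
    return all(alternatives & doc_terms for alternatives in expand_query(query))


louvre = index_terms("Musée du Louvre", "tourism", "museum")
street = index_terms("Museum Street", "highway", "residential")
print(matches("Louvre museum", louvre))
print(matches("Museum Street", street))
```

In this model 'museum' matches the Louvre through the opaque token and matches 'Museum Street' through the name, so a single query covers both interpretations. Adding an entry to `QUERY_SYNONYMS` changes search behaviour without touching the indexed terms, which mirrors why classification terms can be changed on an existing database.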

Because classification terms need to be newly indexed, a mapping schema change and a reimport are required, and the database version has been increased.

Partially implements #557. Classification terms cannot be used for searching by type ('museum in Paris', 'airport near London').

Fixes #318.

@kenseii

kenseii commented Jun 14, 2021

> See docs/synonyms.md for more information on the new format of the synonym file

Sorry, I can't find this file. Do you mind sharing its path?
It seems like the normal .txt synonym files are no longer supported.

@lonvia
Collaborator Author

lonvia commented Jun 14, 2021

Forgot to push the documentation file. Fixed now.

lonvia added 6 commits June 16, 2021 11:29
Adds OSM key and value of the main tag as a special string
to the database and makes the string searchable via the
collector.

This is an incompatible change to the index structure.

Multi-word synonyms are not supported properly in ES5. They
cause wrong word boundary indexes.

@lonvia lonvia force-pushed the classification-terms-II branch from e752625 to c273422 June 16, 2021 09:41

The index only works if the classification search terms are converted
to the special class/type terms at search time.

@lonvia
Collaborator Author

lonvia commented Jun 25, 2021

The last commit fixes a small error that prevented results that match the classification terms from being boosted.

@leonardehrenfried you might want to update your branches again to get the last commit 0ac89a5. This change only affects searches; no reimport of the index is necessary.

@lonvia lonvia merged commit 32ad992 into komoot:master Jan 3, 2022
@lonvia lonvia deleted the classification-terms-II branch January 3, 2022 10:15
Development

Successfully merging this pull request may close these issues.

Improve POI search