add documentation for synonym feature
lonvia committed Jun 16, 2021
1 parent af16f48 commit c273422
Showing 1 changed file with 83 additions and 0 deletions.
83 changes: 83 additions & 0 deletions docs/synonyms.md
@@ -0,0 +1,83 @@
# Using Synonyms and Classification Terms

Photon has built-in support for using custom query-time synonyms and
special phrases for searching a place by its type. This document explains
how to configure this feature.

## Configuration

Synonyms and classification terms are configured with a JSON file which can
be added to a Photon server instance using the command line parameter
`-synonym-file`. Synonyms are a run-time feature: handing in a synonym list
at import time has no effect. To change the list of synonyms in use, restart
the Photon server with a different synonym file. To disable the feature
again, restart the server without the parameter.

Here is a simple example of a synonym configuration file:

```
{
    "search_synonyms": [
        "first,1st",
        "second,2nd"
    ],
    "classification_terms": [
        {
            "key": "aeroway",
            "value": "aerodrome",
            "terms": ["airport", "airfield"]
        },
        {
            "key": "railway",
            "value": "station",
            "terms": ["station"]
        }
    ]
}
```

The file has two main sections: `search_synonyms` allows for simple synonym
replacements in the query, and `classification_terms` defines descriptive
terms for an OSM key/value pair.

## Synonyms

The `search_synonyms` section must contain a list of synonym replacements.
Each entry contains a comma-separated list of terms that may be replaced with
each other in the query. Only single-word terms are allowed: a term must not
contain spaces, hyphens or similar separators.[^1]

[^1]: This is a restriction of ElasticSearch 5. Synonym replacement does not
create correct term positions when multi-word synonyms are involved.
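
The single-word rule can be checked before the server is started. The
following is a minimal sketch, not part of Photon: the function name is
hypothetical, and it assumes that a "single word" means a run of letters,
digits or underscores and that whitespace around the commas is ignored.

```python
import re

# Assumption: a single-word term consists only of letters, digits or
# underscores. Spaces, hyphens and other separators make a term invalid.
_SINGLE_WORD = re.compile(r"\w+")


def invalid_synonym_terms(search_synonyms):
    """Return every term in the synonym entries that is not a single word."""
    bad = []
    for entry in search_synonyms:
        for term in entry.split(","):
            term = term.strip()  # assumption: surrounding whitespace is ignored
            if not _SINGLE_WORD.fullmatch(term):
                bad.append(term)
    return bad


if __name__ == "__main__":
    # The second and third entries break the single-word rule.
    print(invalid_synonym_terms(["first,1st", "new york,ny", "e-mail,mail"]))
```

An empty result means that all entries satisfy the rule as interpreted here.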

## Classification Terms

The second section, `classification_terms`, defines a list of OSM key/value
pairs with their descriptive terms. `place` and `building` may not be used as
keys, and neither `highway=residential` nor `highway=unclassified` will work.
There may be multiple entries for the same key/value pair (for example,
if you have extra entries for each supported language).
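
The restrictions on keys and key/value pairs can be sketched as a small
check over the `classification_terms` list. This is an illustration only:
the function and constant names are hypothetical and not part of Photon,
and the sets contain just the restrictions named in this document.

```python
# Keys and key/value pairs that this document says cannot be used.
FORBIDDEN_KEYS = {"place", "building"}
FORBIDDEN_PAIRS = {("highway", "residential"), ("highway", "unclassified")}


def unsupported_classifications(classification_terms):
    """Return the (key, value) pairs that will not work as classification terms."""
    return [
        (entry["key"], entry["value"])
        for entry in classification_terms
        if entry["key"] in FORBIDDEN_KEYS
        or (entry["key"], entry["value"]) in FORBIDDEN_PAIRS
    ]


if __name__ == "__main__":
    entries = [
        {"key": "railway", "value": "station", "terms": ["station"]},
        # A second entry for the same pair, e.g. for another language, is fine.
        {"key": "railway", "value": "station", "terms": ["bahnhof"]},
        {"key": "place", "value": "city", "terms": ["city"]},
    ]
    print(unsupported_classifications(entries))
```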

The classification terms can help improve search when the type of an object
is used in the query but does not appear in the name. For example, with the
configuration given above a query of "Berlin Station" will find a railway
station which in OpenStreetMap has the name "Berlin" and also one with
the name "Berlin Hauptbahnhof".

Classification terms do not enable searching for objects of a certain type.
"Station London" will not get you all railway stations in London but a
railway station _named_ London.

## Usage Advice

Use synonyms and classification terms sparingly and only if you can be
reasonably sure that they will target the intended part of the address.
Short or frequent terms can have unexpected side-effects and worsen the
search results. For example, it might sound like a good idea to use synonyms
to handle the abbreviation of 'Saint' as 'St'. The problem is that 'St' is
also used as an abbreviation for 'Street', so every search that involves a
'Street' will suddenly also search for places containing 'Saint'.

Do not create synonyms for terms that are used as classification terms.
Photon will not complain but again there might be unintended side effects.
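
Since Photon does not report such overlaps, a configuration can be checked
for them separately. The sketch below is one possible approach and not part
of Photon: the function name is hypothetical, and comparing terms
case-insensitively is an assumption.

```python
def synonym_classification_overlap(config):
    """Return terms that appear both as a synonym and as a classification term."""
    synonym_terms = {
        term.strip().lower()  # assumption: terms are compared case-insensitively
        for entry in config.get("search_synonyms", [])
        for term in entry.split(",")
    }
    classification_terms = {
        term.lower()
        for entry in config.get("classification_terms", [])
        for term in entry["terms"]
    }
    return sorted(synonym_terms & classification_terms)


if __name__ == "__main__":
    config = {
        "search_synonyms": ["station,stn"],
        "classification_terms": [
            {"key": "railway", "value": "station", "terms": ["station"]}
        ],
    }
    # "station" is used in both sections and should be reviewed.
    print(synonym_classification_overlap(config))
```

The parsed content of a synonym file, for example the result of
`json.load`, can be passed in as `config`.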
