Can't find address when using блок instead of бл. #3049

Closed
Dimitar5555 opened this issue May 2, 2023 · 11 comments

@Dimitar5555

What did you search for?

блок 717 ж.к. люлин 7
https://www.openstreetmap.org/search?query=%D0%B1%D0%BB%D0%BE%D0%BA%20717%20%D0%B6.%D0%BA.%20%D0%BB%D1%8E%D0%BB%D0%B8%D0%BD%207

What result did you get?

Nothing

What result did you expect?

The same as when searching for бл. 717 ж.к. люлин 7
https://www.openstreetmap.org/search?query=%D0%B1%D0%BB.%20717%20%D0%B6.%D0%BA.%20%D0%BB%D1%8E%D0%BB%D0%B8%D0%BD%207

When the result is missing completely:
https://www.openstreetmap.org/way/26667926

Further details

блок is the long form of бл. and is also listed in https://github.com/osm-search/Nominatim/blob/master/settings/icu-rules/variants-bg.yaml.
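For anyone reproducing the report, the two search links above differ only in the first token of the query. A minimal sketch of how they can be rebuilt from the plain-text queries, using only the Python standard library (the `search_url` helper is a name made up for this illustration):

```python
from urllib.parse import quote

BASE = "https://www.openstreetmap.org/search?query="

def search_url(query: str) -> str:
    """Build an openstreetmap.org search link for a plain-text query."""
    # quote() percent-encodes the UTF-8 bytes of each Cyrillic letter,
    # turns spaces into %20 and leaves "." and digits untouched.
    return BASE + quote(query)

long_form = search_url("блок 717 ж.к. люлин 7")   # query that returned nothing
short_form = search_url("бл. 717 ж.к. люлин 7")   # query that found the building

print(long_form)
print(short_form)
```

Both strings should match the links quoted in the report.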

@lonvia
Member

lonvia commented May 2, 2023

Nominatim only abbreviates names coming from OSM, not the other way around because it is a long-standing rule that OSM should not contain abbreviations. Looks like Bulgaria is mapped in violation of this rule at a large scale.

@Dimitar5555
Author

Bulgaria is indeed mapped using a lot of abbreviations but it's because we don't have cases where a single abbreviation can be expanded to 2 or more words with different meanings (like in English St can be expanded to Street or Saint).

@lonvia
Member

lonvia commented May 2, 2023

OSM is a global project. The same abbreviations can expand to different words in other languages.

@Dimitar5555
Author

I've changed addr:housenumber's value to блок 717. Both бл. 717, ж.к. люлин 7 and блок 717, ж.к. Люлин 7 do not work now. Am I missing something?

@Dimitar5555
Author

> Nominatim only abbreviates names coming from OSM, not the other way around because it is a long-standing rule that OSM should not contain abbreviations.

It seems like the wiki has conflicting information on this topic. See https://wiki.openstreetmap.org/wiki/Name_finder:Abbreviations

> Abbreviations work both ways: so if someone has tagged a street Saint Johns St, then users can also expect a match for St Johns St, St Johns Street and Saint Johns Street (and also, incidentally, St John's St or St John St etc.), and vice-versa.
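To make the "both ways" idea from the wiki concrete, here is a minimal sketch that maps both the stored name and the query to one canonical token sequence before comparing them. It is not how Nominatim is implemented. The `VARIANTS` table is a hypothetical stand-in for the real list in variants-bg.yaml, and the long forms улица and булевард are assumptions added for illustration.

```python
import re

# Hypothetical variant table (long form -> short form). The real rules live in
# variants-bg.yaml and are not reproduced here.
VARIANTS = {"блок": "бл", "улица": "ул", "булевард": "бул"}

def canonical(text: str) -> tuple[str, ...]:
    """Lowercase, drop punctuation and replace long forms with short forms."""
    tokens = re.findall(r"\w+", text.lower())  # "бл." becomes the token "бл"
    return tuple(VARIANTS.get(token, token) for token in tokens)

def matches(indexed_name: str, query: str) -> bool:
    """Compare two strings after applying the same canonical form to both."""
    return canonical(indexed_name) == canonical(query)

print(matches("бл. 717", "блок 717"))  # abbreviated in the data, long in the query
print(matches("блок 717", "бл. 717"))  # long in the data, abbreviated in the query
```

Because the same function is applied to both sides, it no longer matters which of the two forms was used when tagging.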

@lonvia
Member

lonvia commented May 3, 2023

That's because Nominatim penalizes housenumbers with lots of letters in them up to a point where it cannot find them anymore. This should get fixed with the modified search algorithm currently in the works for the Python implementation.
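As a purely illustrative toy model of that effect (the function names, the per-letter cost and the cut-off are all invented here and are not taken from Nominatim's code), a penalty that grows with the number of letters can push a value such as блок 717 past the point where it is still accepted as a housenumber:

```python
def housenumber_penalty(token: str, per_letter: float = 0.5) -> float:
    """Toy penalty: a fixed, made-up cost for every alphabetic character."""
    return per_letter * sum(ch.isalpha() for ch in token)

def is_accepted(token: str, max_penalty: float = 1.0) -> bool:
    """Toy cut-off: candidates above the assumed threshold are dropped."""
    return housenumber_penalty(token) <= max_penalty

for candidate in ("717", "717а", "блок 717"):
    print(candidate, housenumber_penalty(candidate), is_accepted(candidate))
```

With these assumed numbers, a plain number or a number with a single letter suffix passes, while the four letters of блок exceed the threshold.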

The underlying problem is that addr:housenumber contains something that is not really a housenumber. If I understand the system right, then бл. 717, ж.к. люлин 7 isn't even a full address yet. It is missing at least the entrance number for the block. So a system where the apartment building simply gets a name=блок 717 and the entrances something like addr:housenumber=1+addr:block=717+addr:... might be better to capture the addressing system. But I'm speculating here. addr:block doesn't really have a standardized use in OSM and is not yet implemented in Nominatim.

lonvia added this to the Python API milestone May 18, 2023
lonvia added the Search label May 18, 2023
@Dimitar5555
Author

> This should get fixed with the modified search algorithm currently in the works for the Python implementation.

Is there an ETA for it?

> The underlying problem is that addr:housenumber contains something that is not really a housenumber. If I understand the system right, then бл. 717, ж.к. люлин 7 isn't even a full address yet. It is missing at least the entrance number for the block. So a system where the apartment building simply gets a name=блок 717 and the entrances something like addr:housenumber=1+addr:block=717+addr:... might be better to capture the addressing system. But I'm speculating here. addr:block doesn't really have a standardized use in OSM and is not yet implemented in Nominatim.

addr:block appears to be used mainly in India as a replacement for addr:place.

For the entrances we use entrance=staircase + ref=* without duplicating addr:* information since it can be automatically copied from the building footprint.

addr:housenumber represents the building number within that neighborhood (the neighborhood acts like addr:street). блок is used to distinguish between house numbers and apartment buildings. It's not possible to remove it, since there are places where both addressing systems exist on a single street. The numbers can coincide, so without the prefix you wouldn't know which of the two number 4s is the house and which is the apartment building.

> Nominatim only abbreviates names coming from OSM, not the other way around because it is a long-standing rule that OSM should not contain abbreviations. Looks like Bulgaria is mapped in violation of this rule at a large scale.

Is it possible for Nominatim to perform a second search after the regular one, this time with abbreviated search terms, or would that add too much complexity?
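A rough sketch of what such a second pass could look like, to make the question concrete. This is not a proposal for Nominatim's internals: `search_with_fallback`, `abbreviate`, the one-entry `ABBREVIATIONS` table and the in-memory `fake_search` backend are all hypothetical names made up for this example.

```python
from typing import Callable

# Hypothetical long-form -> abbreviation table for the fallback pass.
ABBREVIATIONS = {"блок": "бл."}

def abbreviate(query: str) -> str:
    """Replace every known long form in the query with its abbreviation."""
    return " ".join(ABBREVIATIONS.get(word.lower(), word) for word in query.split())

def search_with_fallback(query: str, search: Callable[[str], list[str]]) -> list[str]:
    """Run the regular search first; retry with abbreviated terms only if it is empty."""
    results = search(query)
    if results:
        return results
    alternative = abbreviate(query)
    # Skip the second lookup when abbreviation did not change the query.
    return search(alternative) if alternative != query else []

# Stand-in backend: an exact-match index keyed by the abbreviated form.
INDEX = {"бл. 717 ж.к. люлин 7": ["way/26667926"]}

def fake_search(query: str) -> list[str]:
    return INDEX.get(query, [])

print(search_with_fallback("блок 717 ж.к. люлин 7", fake_search))
```

The extra cost in this sketch is one additional lookup, and only for queries that returned nothing on the first pass.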

Another thing to note about abbreviations in Bulgaria is that almost no modern paper or digital Bulgarian maps show the full word блок. The same applies to boulevards (shortened to бул.) and streets (usually shortened to ул. or not shown at all).

@lonvia
Member

lonvia commented Aug 8, 2023

Works with new Python frontend.

lonvia closed this as completed Aug 8, 2023
@Dimitar5555
Author

Dimitar5555 commented Oct 15, 2023

@lonvia блок 717 ж.к. Люлин 7 now works correctly on the main OSM website, but the abbreviated version doesn't seem to work: бл. 717 ж.к. Люлин 7 (https://www.openstreetmap.org/search?query=%D0%B1%D0%BB.%20717%20%D0%B6.%D0%BA.%20%D0%BB%D1%8E%D0%BB%D0%B8%D0%BD%207).

@lonvia
Member

lonvia commented Oct 25, 2023

Doesn't look like it is a Russian abbreviation known to Nominatim: https://github.com/osm-search/Nominatim/blob/master/settings/icu-rules/variants-ru.yaml

@Dimitar5555
Author

It's a Bulgarian abbreviation that should be known to Nominatim: https://github.com/osm-search/Nominatim/blob/master/settings/icu-rules/variants-bg.yaml#L4
