Many of the addresses in Vietnam in OSM are either incomplete or malformed. From a glance at overpass turbo, virtually all of them were entered using iD and JOSM, which both present the same address fields. For example:
… addr:city.
… addr:street.
… addr:city but the ward and arrondissement in addr:postcode.
… name and addr:postcode fields to encode the full housenumber (see the discussion on alleys below).

Most of the remaining addresses simply don’t bother with anything beyond the street, which makes it hard for Nominatim to give decent results in large cities where the third-level administrative boundaries haven’t been mapped and the wards aren’t perfect circles.
The problem stems from the limited number of fields available in iD’s and JOSM’s address presets. Vietnamese addresses in urban areas typically have more components than you’d find in a North American address, for example. So mappers tend to stuff the full address into one of the fields they see. iD can help stem this problem by providing better presets.
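As a rough illustration of the "everything in one field" pattern described above, here is a minimal Python sketch that flags suspicious tags. The heuristic (a comma inside a single field) and the names `SUSPECT_KEYS` and `stuffed_fields` are assumptions made for this example, not anything iD, JOSM, or Nominatim does.

```python
# Hypothetical heuristic: a comma inside one address field suggests that
# several address components were entered into that single field.
SUSPECT_KEYS = ("addr:street", "addr:city", "addr:postcode")


def stuffed_fields(tags):
    """Return the keys whose values look like more than one component."""
    return [key for key in SUSPECT_KEYS if "," in tags.get(key, "")]


# Placeholder values, not real data.
malformed = {"addr:street": "Street, Ward, Arrondissement, Municipality"}
print(stuffed_fields(malformed))
```

A check like this would only catch the comma-separated cases; addresses that stop at the street would need a different test.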
Vietnamese address formats are based on the administrative hierarchy, which varies between rural and urban areas. This org chart shows the relationships between all the officially recognized administrative units in Vietnam. Not all levels need to be included in every given street address, but the order will always go from smallest to largest in the org chart. Based on the existing (well-formed) addresses and wiki documentation, the full tagging scheme for each possible Vietnamese address format would be:
(Each component is followed by its address field in parentheses.)

housenumber Street (street), Ward (subdistrict), Arrondissement (district), Municipality (city) 123456 (postcode)
housenumber Street (street), Ward (subdistrict), Town (district), Municipality (city) 123456 (postcode)
housenumber Street (street), Commune (subdistrict), Town (district), Municipality (city) 123456 (postcode)
housenumber Street (street), Commune (subdistrict), District (district), Municipality (city) 123456 (postcode)
housenumber Street (street), Townlet (subdistrict), District (district), Municipality (city) 123456 (postcode)
housenumber Street (street), Ward (subdistrict), Town (city), Province (province) 123456 (postcode)
housenumber Street (street), Commune (subdistrict), Town (city), Province (province) 123456 (postcode)
housenumber Street (street), Commune (subdistrict), District (district), Province (province) 123456 (postcode)
housenumber Street (street), Townlet (subdistrict), District (district), Province (province) 123456 (postcode)
housenumber Street (street), Ward (subdistrict), Provincial City (city), Province (province) 123456 (postcode)
housenumber Street (street), Commune (subdistrict), Provincial City (city), Province (province) 123456 (postcode)
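To make the scheme above concrete, here is a minimal Python sketch that assembles a display address from tags, smallest unit first. It assumes the field labels map to keys of the form `addr:<label>` (for example `addr:subdistrict`); the function name `format_address` and the sample values are placeholders for illustration only.

```python
# Assumed keys: the addr:* forms of the field labels in the formats above.
ADMIN_KEYS = ("addr:subdistrict", "addr:district", "addr:city", "addr:province")


def format_address(tags):
    """Join whichever components are present, from smallest to largest."""
    # The housenumber and street share the first comma-separated segment.
    first = " ".join(
        tags[key] for key in ("addr:housenumber", "addr:street") if tags.get(key)
    )
    parts = [first] if first else []
    parts += [tags[key] for key in ADMIN_KEYS if tags.get(key)]
    line = ", ".join(parts)
    # The postcode follows the largest unit without a comma.
    return " ".join(piece for piece in (line, tags.get("addr:postcode")) if piece)


# Placeholder values mirroring the first format in the list.
example = {
    "addr:housenumber": "12",
    "addr:street": "Street",
    "addr:subdistrict": "Ward",
    "addr:district": "Arrondissement",
    "addr:city": "Municipality",
    "addr:postcode": "123456",
}
print(format_address(example))
```

Because no listed format puts a city before a district, a single fixed key order covers all eleven variants; missing levels are simply skipped.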
Therefore, I think it makes sense to lay out iD’s address preset like this (with English translations of the Vietnamese translations already in Transifex):
Everything is on its own line to ensure that the placeholder text fits at all window sizes. This is also how addresses are written on post in Vietnam.
One caveat is that street addresses on alleys in urban areas include the name of the collector street and any alleys needed to reach the address. For example:
There seems to be no consensus on whether to include the extra alley numbers in addr:housenumber or addr:street, so I wouldn’t worry about accommodating them for the time being.