Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Place areas #546

Closed
wants to merge 7 commits into from
Closed

Place areas #546

wants to merge 7 commits into from

Conversation

pnorman
Copy link
Collaborator

@pnorman pnorman commented May 20, 2014

Superceeds #538
Fixes #103
Fixes #522

This will reveal two widespread problems in the data

  • TIGER import place classifications are awful.
  • There are frequently duplicate place areas and place points

There is a very minor performance boost overall to this pull, in spite of adding planet_osm_polygon which comes from combining some of the MSS.

The MSS could probably be further simplified, but I'll leave that to others.

Previously we had one place layer for capitals and one for non-capitals
when it came to towns and cities. This combines them, reducing the number
of queries, making the SQL simpler, and making the layers more useful for
other styles.

Incidentially fixes gravitystorm#522
This is part of gravitystorm#103, but only fixes it for place=city/town

The SQL is moderately complex and difficult to document in the .mml file so it's worth documenting here

```sql
(SELECT way,place,name,capital,population
  FROM
    (SELECT
        way,place,name,NULL AS capital,way_area, -- The polygon table doesn't have the capital information with the default style
        CASE WHEN population~E'^\\d{1,9}$' THEN population::integer ELSE NULL END AS population
          -- We need to filter out population values that might cause exceptions. This is a fairly rigerous filtering as it will remove population=1,000
      FROM planet_osm_polygon
    UNION ALL -- Join the two together
    SELECT
        way,place,name,capital,NULL AS way_area, -- Points don't have an area
        CASE WHEN population~E'^\\d{1,9}$' AND length(population)<10 THEN population::integer ELSE NULL END AS population -- as above
      FROM planet_osm_point) AS p
  WHERE place IN ('city','town') -- This condition is brought inside by the query optimizer
  ORDER BY
    CASE place WHEN 'city' THEN 0 ELSE 10 END, -- We don't actually need this at moment with the current MSS
    CASE capital WHEN 'yes' THEN 0 ELSE 10 END, -- Nor this
    way_area DESC NULLS LAST, -- Put physically larger first
    population DESC NULLS LAST -- For points (no area) put larger population first
) AS placenames_medium
```

The ordering merits further discussion

We're currently not considering area or population for ordering, so anything is an improvement here. It's hard to say if area,population or population,area is better without trying one first.
Benchmarking shows \d+ and length with no NULL check is the fastest way to check if the population is safe to cast
Builds on previous work on city/town, in the same fashion.

The ordering is a bit more arbitrary here, but we didn't have a well defined order before within many of the types
@pnorman pnorman mentioned this pull request May 20, 2014
@pnorman pnorman changed the title Place small Place areas May 20, 2014
@matthijsmelissen
Copy link
Collaborator

How does this handle relations like Lisboa? Where would the name be rendered?

@pnorman
Copy link
Collaborator Author

pnorman commented May 20, 2014

Like all polygons, using Mapnik's placement. That particular relation doesn't have a label node, but see #105 for the choice to not use "label" nodes.

@matthijsmelissen
Copy link
Collaborator

Thank you for the clarification.

If this is rolled out, mappers will have three options:

  1. Tag place=city both on polygon and point
  2. Tag place=city only on point
  3. Tag place=city only on area

Option 1. leads to double rendering of place names; 2. leads to removal of the place=city polygon, which is valuable data; and 3. leads to undesirable label placement (see Lisbon). All of these options are undesirable.

Therefore, I do not support the pull request as it is now.

@pnorman
Copy link
Collaborator Author

pnorman commented May 20, 2014

Unless Andy wants to reconsider #105, undesirable label placement will occur. I could write SQL which de-duplicates, but then that's basically using the label node.

Given the constraints applied, we cannot

@gravitystorm, which constraint do you prefer to drop? Currently we're dropping the first one.

@matthijsmelissen
Copy link
Collaborator

The wiki currently states:

Add place=town as a node, and possibly also optionally as an area.

So I think technically, the current rendering is in line with the wiki.

By the way, almost all place=city / place=town tags on areas are on admin boundaries (which might not correspond to settlements). At least in Luxembourg, this is legally correct: city status can be granted to a municipality, not to a settlement. I'm not sure about other places.

@pnorman
Copy link
Collaborator Author

pnorman commented May 21, 2014

By the way, almost all place=city / place=town tags on areas are on admin boundaries (which might not correspond to settlements).

Generally when a geographic place has a clearly defined boundary it coincides with an administrative boundary. That being said, a lot of the place=city/town on admin boundaries are wrong and confuse administrative vs geographic. There's also the bad TIGER 2008 place.shp import which put place=city on all sorts of absurd stuff. There's place=city here on admin_level=8 with under 500 people that I'm working on cleaning up

My view is that if you've said a place is an area, then putting a label anywhere in that area is valid.

@gravitystorm
Copy link
Owner

I'd rather that we label polygons if available, then points if not. I'm not so worried about the label placement as I am about the bogus data - as @pnorman says, there's a lot of incorrect place polygons precisely because they aren't rendered.

@lonvia
Copy link

lonvia commented May 21, 2014

Speaking from geocoding experience, I strongly disagree that place duplication as area and node is an error. On the contrary, I'd rather see The Map encourage the use of place nodes by using only them for labels.

I think it's obvious why place areas (or admin relations) are better than just nodes. However, it has also turned out that the nodes are important because they generally represent the real economic center of the place. It is not the same as the area centroid and cannot be easily computed from the polygons, so removing the place node when adding an area looses valuable information. This economic center point (rather badly referred to as 'label' at the moment) is not just useful for rendering but is needed for pretty much any application which needs to collapse a place to a single point. One example is routing: it needs a well defined point to start with, areas are useless. There was a time when OSRM did go to some lengths in reordering Nominatim search results just to get the place nodes instead of the area centroids. I think the word 'useless' was mentioned.

@matthijsmelissen
Copy link
Collaborator

Would replacing the word 'label' by 'centre' (or something similar), and using that for rendering place names, be a solution for everyone?

Of course, that would need to be discussed on the tagging mailing list as well.

@dieterdreist
Copy link

2014-05-21 15:10 GMT+02:00 Sarah Hoffmann [email protected]:

Speaking from geocoding experience, I strongly disagree that place
duplication as area and node is an error. On the contrary, I'd rather see
The Map encourage the use of place nodes by using only them for labels.

I think it's obvious why place areas (or admin relations) are better than
just nodes. However, it has also turned out that the nodes are important
because they generally represent the real economic center of the place. It
is not the same as the area centroid and cannot be easily computed from the
polygons, so removing the place node when adding an area looses valuable
information. This economic center point (rather badly referred to as
'label' at the moment) is not just useful for rendering but is needed for
pretty much any application which needs to collapse a place to a single
point. One example is routing: it needs a well defined point to start with,
areas are useless. There was a time when OSRM did go to some lengths in
reordering Nominatim search results just to get the place nodes instead of
the area centroids. I think the word 'useless' was mentioned.

+1

I'd also like to see area places rendered (as replacement for the built-up
shapefile from natural earth) to encourage people to map the place
extensions as well, and to put the place tags on the "right" objects (and
not on the administrative boundaries where they comprise non-built-up
space).

@matthijsmelissen
Copy link
Collaborator

Any ideas on how to progress with this?

@pnorman
Copy link
Collaborator Author

pnorman commented Jun 1, 2014

Any ideas on how to progress with this?

From my perspective, it's ready to go.

I'm not so worried about the label placement as I am about the bogus data - as @pnorman says, there's a lot of incorrect place polygons precisely because they aren't rendered.

@gravitystorm, how do you feel about a duplicate label showing up when there's both a POINT and a POLYGON with the same name and place values? If you feel that we can't do that, then we're stuck not rendering points, not rendering polygons, or doing fancy data manipulation. In which case, please pick which one, because I can code any of them.

@Komzpa
Copy link

Komzpa commented Jun 3, 2014

There's extremely simple approach in rendering places that I've been using for ages:

  • render polygon as fill;
  • render node as text.

This lets us:

  • have nodes as description of economic/cultural centre;
  • have polygons as visible extent estimation of a place;
  • avoids label duplication.

@daganzdaanda
Copy link

Would replacing the word 'label' by 'centre' (or something similar), and using that for rendering place names, be a solution for everyone?

Sounds like the existing "admin_centre" role of http://wiki.openstreetmap.org/wiki/Relation:boundary But "centre" fits better, because not every village has an administrative centre, but it will have some kind of town centre.

@daganzdaanda
Copy link

@Komzpa: That sounds good. The polygon would need to be an outline of the whole built-up area (residential and industrial etc) of a town. Usually, the administrative boundary will be larger and will include "empty" land. I don't know if many admin-boundaries are tagged with place=* right now, but these would look silly with a filled polygon.

@daganzdaanda
Copy link

OK, now I've seen some relations of type=boundary with place=* . IMHO, this is bad tagging most of the time, because the two things are not the same, as @dieterdreist has said. So it should not be encouraged, and we should not render a name inside an area if the place=* is tagged on a boundary relation. And we should definitely not render the area fill if its an boundary at the same time as an place.

As for

a duplicate label showing up when there's both a POINT and a POLYGON with the same name and place values

Hm. The node needs to be rendered. There are 2'830k place nodes, against 281k ways and 57k relations.

I don't know if there are many place-polygons without a place node in them. A mapper would always have added a node, to see a name. Only if there have been imports I can imagine place-polygons without nodes.
So if that is right, then we can ignore names on place-polygons as long as we start rendering the fill (or the outline, come to think of it). The name tag on the area will still be good for searching.

Relations would be the most logical thing to bring the area and the label together, and reduce the doubling of names and possible other tags. But we can't force people to use them.

@matthijsmelissen
Copy link
Collaborator

I'd rather that we label polygons if available, then points if not. I'm not so worried about the label placement as I am about the bogus data

I still strongly disagree with this. Label placement is very important.

@pnorman for the other discussions, I found the rendering example on http://bl.ocks.org very helpful. Maybe you could set one up for this issue as well?

@pnorman
Copy link
Collaborator Author

pnorman commented Jun 8, 2014

Maybe you could set one up for this issue as well?

Potentially, but I'll have to think about if the changes mainly being at expensive to render zooms is an issue.

The rendering isn't at bl.ocks.org, it's just an easy way to share a single HTML page.

@matthijsmelissen
Copy link
Collaborator

Makes sense. Could you instead do a screenshot of Brighton before and after?

@pnorman
Copy link
Collaborator Author

pnorman commented Jun 8, 2014

Makes sense. Could you instead do a screenshot of Brighton before and after?

Not a great area to test in as the only place=city/town in the region are

Of these, Cuckfield is the only one that doesn't duplicate another object (i.e. the others are double-mapped). Brighton will probably have a label rendered for each object.

Without more labels, you can't really test label collisions. The nodes don't have population either, so can't test the collisions along the coast

@dieterdreist
Copy link

2014-06-10 19:33 GMT+02:00 math1985 [email protected]:

Another issue: this relies on the population tag for sorting purposes.

However, most countries only count population per admin area, not per
place. This is certainly the case for Germany, France, the United Kingdom,
Netherlands, and Belgium.

The best known and published values do indeed refer to administrative
entities, but you could also get numbers for the places if you try to (e.g.
by adding the numbers of appropriate sub-entities). E.g. in Germany and
Italy (but my guess is in all developed countries) you can get numbers down
to a "block" of houses (from the statistics agency). "only count" is not
true.

@hlaw
Copy link

hlaw commented Jun 11, 2014

I have no preference about needing to label a town over the top of its town centre point. Town centres tend to, by definition, be full of interesting things so if the label appears elsewhere (e.g. over the sea) that's fine by me.

Here is a dilemma - for search or routing, some nodes to indicate the (whatever) center is useful for serving as the default location (rather than using some random spot that happens to be the centroid). Similar considerations apply for rendering the label, except that, if the place label is put exactly on the most "interesting" place, we may end up masking important things underneath, eg train station names, which is not desirable. (This is especially true for lower level place tags such as suburb, which would appear at higher zoom).

It may be common for place=* nodes to be used now, whether consciously or not, for positioning labels to avoid the above, say around the center but keeping a little distance from important features (e.g. not over a train station or town hall, but at a nearby parks or residential area). This is essentially the same as a "label" role. These place nodes may or may not be appropriately renamed as admin centre / centre point.

And unless there are some automatic means to place labels intelligently given the place polygon and / or 'the' town center (assuming non-rendering based criteria could be agreed upon locally), I bet that the "positioning-for-the-renderer-at-least-no-label-at-inappropriate-places" practice would stay.

screenshot from 2014-06-11 20 11 23
[United Denver States - another example of things to be avoided for label placement]

@dieterdreist
Copy link

2014-06-11 14:26 GMT+02:00 hlaw [email protected]:

Here is a dilemma - for search or routing, some nodes to indicate the
(whatever) center is useful for serving as the default location

+1, no doubt, but maybe the tagging should also follow this (i.e. not
"place" but a tag that says that it is the centre of the place). Which also
opens up another can of worms: what about polycentric places (with more
than one centre)? My suggestion is to send this back to "talk" or "tagging"
mailing lists.

This is essentially the same as a "label" role.

it is not, it'a tag for the center

These place nodes may or may not be appropriately renamed as admin centre
/ centre point.

not "admin_centre" as we shouldn't limit this to administration (again:
tagging ML)

@matthijsmelissen
Copy link
Collaborator

| My suggestion is to send this back to "talk" or "tagging" mailing lists.

I agree. Would anyone here be able to coordinate that? I could do it myself, but I think my time is better spent wroting code.

@chrisfleming
Copy link
Contributor

I think that this is definitely the right way to go. Having had a few complaints that the reverse geocoding was producing some very wrong results in Edinburgh, I went through and divided the city into areas. I'm currently waiting for this to go through before I go through and remove the nodes.

It strikes me that there is still a place for routing nodes, ie. the place that you might get routed to if asking for that place. But for labelling purposes the area is definitely better.

So I'm now in Limbo where I have both area's and node place labels. Both of which are rendered on the cycle map layer but until this is resolved I can't really fix it.

@matthijsmelissen
Copy link
Collaborator

@pnorman Is this ready to be merged? Assigned to you for now.

@pnorman
Copy link
Collaborator Author

pnorman commented Jul 23, 2014

@pnorman Is this ready to be merged? Assigned to you for now.

I need to merge in master, but I also want to address this issue:

we can probably just switch off labels when the centroid is likely to be significantly different from the 'town centre'. Nobody needs 'London' written over charing cross at z14 anyway - the city is about 40 times larger than any screen at that scale.

@daganzdaanda
Copy link

Just FYI: rendering the names of place=neighbourhood on closed ways would make the map much more useful in Mannheim: In the city centre, there are named "blocks" which are used for orientation and addresses. Many streets don't even have names there.
http://www.openstreetmap.org/way/193633602
http://de.wikipedia.org/wiki/Quadratestadt

@joakimfors
Copy link

Any progress on this issue? The local community recently updated/added place areas here in Lund, Sweden (http://www.openstreetmap.org/#map=13/55.7093/13.1990). At the same time most of the now superflous place=neighbourhood and place=suburb nodes where removed to avoid duplicate data. The downside is that now place names are not visible on the standard osm.org map anymore. The town node was kept as otherwise the whole town would "disappear" from the map. It is, however, referenced from the relation, describing the town, as role "label". Also, using the centroid to place the town label in this case would put it way off the real center.

@dieterdreist
Copy link

2014-09-24 10:38 GMT+02:00 Joakim Fors [email protected]:

At the same time most of the now superflous place=neighbourhood and
place=suburb nodes where removed to avoid duplicate data.

the role "label" doesn't sound nice to me, these are not "labels" (labels
are what a renderer draws, based on its own considerations and often in
interference with other nearby features), these are points that indicate
the centre / a central spot. That's also why I don't think these nodes are
"superfluous" or duplicate data (maybe the tagging could be improved to
something more explicit like "central_point"). This centre also isn't
determinable automatically, as it isn't strictly mathematic / geometric,
but it is cultural / social.

@joakimfors
Copy link

Yeah, the naming of the role is perhaps not the best, but so is a lot of other tag-names in OSM. ;) If someone comes up with a better name and it is accepted as The Way to Tag Things™ then fine by me. What I don't like is having a relation and a separate node both with the same data.

In the case of Lund, the neighbourhoods don't really have a "center" except for a few cases. The town, however, has a well defined center and a well defines boundary. It would, as I said earlier, be nice to be able to put all relevant data in the town relation and have it show up on the map.

@pnorman
Copy link
Collaborator Author

pnorman commented Sep 24, 2014

Any progress on this issue?

I haven't rebased the commit and don't have it on my immediate schedule.

maybe the tagging could be improved to
something more explicit like "central_point"

Maybe, but such a discussion isn't within scope for here, particularly since we do nothing with the role, and can't change that, so we aren't impacted by the changes. Other consumers would of course be negatively impacted by a tag change, but we don't have the information in the rendering tables.

@mboeringa
Copy link

In all this long discussion, I am missing one vital item: as it seems, almost all of you assume that places tagged as areas should conform to the actual built-up area of a "place". This is exemplified by the the rejection of administrative boundary relations tagged with place...

However, I have been sifting through the Key:place Wiki page (http://wiki.openstreetmap.org/wiki/Key:place) , and nowhere do I see listed that "place" as areas should cover built-up area only, it just says:

"You can also define a boundary for a place by creating an Area and tagging it with the place tag."

and nothing more than that... (well, it even contains a reference to the word "boundary", which may be mis-interpreted as "boundary relation" in this context).

While I basically agree that drawing a (multi-)polygon around the total coverage of built-up (residential, industrial, commercial, retail, garages, railway?) is the desirable way to tag a place as an area object, unless the Wiki starts stating this as well, I do think some of the basic assumptions made here in the thread may lack backup in tagging practice and Wiki documentation.

There's extremely simple approach in rendering places that I've been using for ages:

render polygon as fill;
render node as text.

This lets us:

have nodes as description of economic/cultural centre;
have polygons as visible extent estimation of a place;
avoids label duplication.

++ 1

I think this is the most sensible option for now, assuming there is a desire to have the place-as-areas rendered, potentially as a kind of "quality control". It is close to what is happening now, but will allow more awareness of place-areas, and how they have been used up to now (leaving aside the question whether that is considered good or bad practice, which is to a large extent a personal opinion in the absence of a documented Wiki guideline).

@mboeringa
Copy link

Never ever do anything with anything tagged as a "label", that way madness lies. Mapping a 'town centre point' or whatever is fine if all the mappers can agree on where that node should go, but label positioning is font-size and font-face and cartographic-choice specific so should never ever ever be in the OSM database. Anyone adding 'label' roles to relations should remove them.

The OpenStreetMap "Rails" application supporting the display of all styles, actually seems to support "label" roles in its relation display, as I noticed for the first time today... (Queensland state in Australia):

http://www.openstreetmap.org/relation/2316595

Admittedly, I don't see much use for the "label" node in this particular example, or more general on relations, either... Labelling and label position, can indeed be left to the renderer. In this case, it just mirrors the information on the relation, especially the name of the state. The information about multi-lingual names for Queensland should have been put on the multipolygon relation, not on this label node, which makes the node redundant.

But I think a wider discussion may be needed if you wish to have label roles abolished from relations with such support embedded in the base web application...

@pnorman
Copy link
Collaborator Author

pnorman commented Dec 28, 2014

The OpenStreetMap "Rails" application supporting the display of all styles, actually seems to support "label" roles in its relation display, as I noticed for the first time today... (Queensland state in Australia):

No - it supports no roles. The data view would look the same without any roles set for any members.

@dieterdreist
Copy link

2014-12-25 12:18 GMT+01:00 mboeringa [email protected]:

In all this long discussion, I am missing one vital item: as it seems,
almost all of you assume that places tagged as areas should conform to the
actual built-up area of a "place". This is exemplified by the the
rejection of administrative boundary relations tagged with place...

However, I have been sifting through the Key:place Wiki page (
http://wiki.openstreetmap.org/wiki/Key:place) , and nowhere do I see
listed that "place" areas should cover built-up area only, it just says:

I don't think the place-area is necessarily about built-up areas only, I
see it as the settlement geographic way of looking at stuff (i.e. through
the lense of anthropology, architecture, sociology, geography, ...) as
opposed to admin (political borders). You look at the settlement and
interpret its shape (morphology), history, population structure, etc. Often
you'll find hard borders (e.g. railways, arterial roads, rivers, airports,
military areas, parks, cemeteries, ...) between distinct (sub-)areas.

@jojo4u
Copy link

jojo4u commented Feb 16, 2015

With the routing now on the main page there is even more need for placement of the city centre.

@pnorman
Copy link
Collaborator Author

pnorman commented Mar 14, 2015

this would need complete rewriting against the current code and yaml. closing.

@lukaszkruk
Copy link

@gravitystorm

Never ever do anything with anything tagged as a "label", that way madness lies. Mapping a 'town centre point' or whatever is fine if all the mappers can agree on where that node should go, but label positioning is font-size and font-face and cartographic-choice specific so should never ever ever be in the OSM database. Anyone adding 'label' roles to relations should remove them.

Sorry to dig out the old thread, but https://wiki.openstreetmap.org/wiki/Relation:boundary#Relation_members permits (and has been like this for a long time) the label role for relations, recognizing it is not actually used by Mapnik. Doesn't this override carto rendering choices?

Was there ever an attempt to repeal this?

Thanks for clearing this up for me.

@kocio-pl
Copy link
Collaborator

I would say there's a spirit of autonomy here. No wiki rules, nor the usage numbers or any other kind of agreement between the rest of OSM community has direct impact on this style.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-capital=yes values Render place names on areas the same way as on nodes