-
-
Notifications
You must be signed in to change notification settings - Fork 474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New "flex" backend #1036
Comments
The complex interaction between Lua and C++ code (multiple phases, storing/retrieving data, marking objects for later processing) made me wonder if you had planned any kind of debugging/tracing/troubleshooting option for people writing Lua scripts? |
@mmd-osm You can use the full power of Lua to do that, in the simplest case just print out intermediate values etc. But you can also open a log file and write stuff there etc. If anything this becomes easier, because more of the magic isn't hidden in C++ code but plain to see in Lua code. |
On Thu, Dec 19, 2019 at 5:45 AM Jochen Topf ***@***.***> wrote:
In the past weeks I have started working on a new output backend. I am
calling this the "flex" backend, because it is more flexible than any of
the other backends and should eventually be able to support all use cases
currently supported by the other backends and many more. This flexibility
is mostly due to an increase in the use of Lua scripts which are used for
configuration as well as more powerfull callback functions.
This work has mostly been triggered by #230
<#230>, but it also
touches many other issues like #901
<#901>
This is really, really welcome news! Thanks for tackling it.
Given the discussion of #230
<#230>, could I beg you to
contemplate the possibility of allowing the designer to represent relations
as database rows without geometry, but with auxiliary tables to represent
the relation membership? I suppose the members could instead be tracked as
arrays in the main row. I'm an old man, and an old-school DBA, and have
rather a preference for breaking fields of variable cardinality off into
their own tables, perhaps clustering with the main table.
https://github.com/kennykb/osm-shields/wiki/Proposal:-add-route-tables-to-osm2pgsql
describes
the particular use case that I have in mind. My current implementation of
the rendering works by dredging the appropriate information out of the slim
tables - which I recognize is deprecated, and I'd like to move away from.
If I have enough hooks in the code that I can populate tables like these
two and keep them up to date when osmosis runs, I can return to working on
making osm-shields capable of coping with minutely updates and scaling to a
larger map area than I currently attempt to render.
I'm not asking to have tables like this on the main OSM server. I do not
intend to have the main render chain depend on them. I'm simply trying to
find a way to support my experimental renderings with osm2pgsql, in a
manner that isn't deprecated and doesn't require me to fork the project.
The project <https://github.com/kennykb/osm-shields> for rendering US
shaped highway shields and route concurrencies has been on hold for a
little over a year, largely because I haven't had the time and energy to
pursue retooling all the rest of my rendering chain to work with imposm3
instead of osm2pgsql. I had started an effort on adding support for this
kind of table structure to osm2pgsql, but was informed that if I did
develop it and submit a PR, the PR would be rejected.That response
convinced me that I needed to switch the osm-shields project over to
imposm3, but I've not been able to muster the time and energy to work
through the impact that would have on the rest of my rendering chain and my
other data consumers. You offer me a ray of hope that I won't have to
tackle that rework.
Anticipating similar objections to the ones I've heard in the past:
'It can't perform well.' I'm looking to deploy this on my own server, not
in the OSM main render chain. If I'm willing to accept the performance
penalty (I am) and the burden of maintaining the code that supports it (I
am, if the developers will have me), and the feature will not degrade the
performance of code that does not use it (I treat that as a requirement),
why is this such an issue.
'Implementing it will require a database reload.' It's my database. I'm not
proposing to deploy this on the main servers, or anywhere else that a user
isn't accepting it willingly.
'You can do what you want to with highway shields without this sort of
support.' Please show me how. I've heard this assertion from several
members of the osm2pgsql team. Alas, I haven't been smart enough to put the
hints that they've given about how to implement such a thing into anything
that resembles a coherent design that I can implement. Note that the messy
North American highway network needs to worry about route concurrencies
(very common!) and such things as having the state highway of one state
briefly cross over into another state without changing its numbering or
signage (About half of New York Route 120A
https://www.openstreetmap.org/relation/407958 is in Connecticut, but it is
signed and maintained by New York.) Moreover, the 'ref' tag does not
contain sufficient information to choose the sign to use - a ref such as
'CR 104' identifies only that it is a county road, and the choice of sign
needs to know which county. That information, as we just saw, cannot be
reliably inferred by intersection with administrative boundaries.
|
@openstreetmap-tiles The whole point of this effort is that you can do things like proper highway shields efficiently without having to have extra tables in your database. So I think if this all pans out the way I envision it now, this should be able to solve your problem. Allowing tables without any geometry was a small change, so I have added this to the code. I have added an example |
@ImreSamu Please don't put everything and the kitchen sink into this issue. This is all not really related to the issue at hand and not useful to have everything in one huge issue. |
@joto : sorry ; I have removed the "noise" .. |
@joto Could you please add an example with an additional geometry column and calculating a centroid (i.e. for |
@nextstopsun Multiple geometry columns and centroids don't work yet. Supporting multiple geometries is difficult, especially because a geometry might be split into several geometries (multi-geometries or splitting long lines), and I don't know yet how exactly to do this. |
@joto So in theory with the new flex backend one can set up a table with two or more columns of type geometry, right? Which lua library could be used to perform operations on geometry (i.e. calculation of a centroid)? |
So far this is all a rough design and we don't know yet what the code will look like. I would like to support several geometry columns, because this is basically a generalization of a "normal" geometry plus the I am pretty sure we are not going to give Lua access to the actual geometries and are not using any Lua libraries for this, because it would be quite a lot of overhead "shipping" geometries from C++ to Lua code and back etc. All this should be done in C++ code for performance, but this would, of course, put some limits on what can be done. I'd be interested in hearing about use cases and ideas people might have, this might influence some design decisions. |
The current version of the PR is now in pretty good shape. It doesn't do everything we might ever want (no multiple geometry columns yet, for instance), but it is getting towards something that might me mergeable. There is even some documentation and lots of examples in the same directory. There are still things to do and to decide though. The main points are:
If you are interested in this issue, now would be a good time to test #1037 and give me feedback. |
👍
I don't see this as an issue. With osm-carto and styles based on it using Lua, the C transforms are not essential as long as we have reasonable default options. I recall a mention of geometryless tables but can't find it again, had you discussed this? |
@joto could you please add an example where all tags are sent to jsonb column? Is it possible to define postgresql db schema name for a table in lua config? (i.e. when you want to put tables to different schemas and tablespaces) |
@pnorman You can do geometryless tables simply by not defining a geometry. The @nextstopsun Filling a json(b) columns requires some Lua json library to create that json. You can then define a column with type There are some provisions in the flex backend code to set a schema for a table. But it doesn't work yet, because there are other changes that need to be done to the rest of the osm2pgsql code to make this work. I didn't want to do these changes before the flex backend is actually merged. So this is something for a bit later. Different tablespaces work as before, ie you can set a tablespace for all tables, but not individually. |
@nextstopsun I have added an example for how to deal with json here: |
Hi @joto A couple of thoughts about a more "flexible" backend that are not directly related to having multiple tables, but other options that might benefit usage:
<CUSTOM_PREFIX><osm_key_name>, e.g. "<MY_PREFIX>highway" or "aaa_highway" Why would I want this? A classic example of a key that is worth maintaining both as an original value and as tag converted one, is the OSM direction key. It may contain both angles specified in degrees (0-359) and cardinal directions (NWSE). You might then have two columns in the output table, one with the original values: And the other a tag converted one: The tag converted column "bbb_direction" could then be used directly in applications to show e.g. the direction of viewpoints, while the original column "aaa_direction" still shows the original user input.
E.g. "addr<CUSTOM_COLON_REPLACEMENT>housenumber". Of course, you could default to a single underscore, but allowing a custom string potentially allows setting a special marker and converting back to the original key. E.g., you might insert double underscore to signify the position of the colon versus the single underscore also used in openstreetmap keys. E.g. "addr__housenumber" with double underscore could then be easily recognized as being "addr:housenumber" as key. This allows all kind of nice wizardry and automation of conversion between openstreetmap key names and database column names.
|
@mboeringa With the flex backend, all names of tables and columns are set in your Lua code, so this is all just a question of writing a little bit of Lua code that does whatever naming you want. There is no need for any special prefixes or colon replacement characters. I suggest you give it a try and if there is something you can't do with Lua code, open an issue for that. |
Interesting. To be honest I have never messed with the Lua code up to now, so am unfamiliar with it yet. I do see references to what I presume are example table definitions in the pull's "files" view as linked below, e.g. in "simple.lua", that also include OpenStreetMap key names, but I don't see an example of a column being "renamed". That is of course understandable, because the current system also simply uses the key names as column names since PostgreSQL doesn't have a an issue with things like colon in the field's name, but it leaves me wondering where of how to specify an "alias" or alternative name for the output column name. https://github.com/openstreetmap/osm2pgsql/pull/1037/files |
Is it possible to setup "planarized" version of administrative borders table? @joto Does flex backend have the power to solve this case? |
@mboeringa You simply name the columns whatever you want and later fill them:
|
@nextstopsun Yes, in theory that should be possible. You'd have to store the |
@joto ,thanks. That explains it well and would work. |
The PR has now been merged and the flex backend is available. Please try it out and report problems or post ideas in new issues. Closing here. |
In the past weeks I have started working on a new output backend. I am calling this the "flex" backend, because it is more flexible than any of the other backends and should eventually be able to support all use cases currently supported by the other backends and many more. This flexibility is mostly due to an increase in the use of Lua scripts which are used for configuration as well as more powerfull callback functions.
This work has mostly been triggered by #230, but it also touches many other issues like #901 .
Here are some design thoughts:
Osm2pgsql design for a more flexible backend and Lua integration
We write a new backend according to the ideas presented in this document. The
new backend should be able to do everything the current backends can do and
would replace them in the long-term. (This also means that the C++-only
transforms without Lua will not be supported forever.)
Table configuration
The new setup can work with any number of database tables. We provide Lua
functions define tables and their columns. Different tables can have different
column setups, for instance some tables might have lots of columns for tags
("old-style"), some have only an HSTORE or JSONB column for tags.
Lua callbacks
For every OSM object (node, way, or relation) that is processed by osm2pgsql, a
Lua callback is called with the data from this object as parameter. The Lua
callback can process the data of the OSM object in any way it likes and emits
zero, one, or more table entries (for any table) that should be added by
osm2pgsql. This is done by calling an
add_row()
function on those tables.Processing in a Lua callback includes but is not limited to:
put several tags into an HSTORE or JSON(B) column, etc.)
calculating centroids.
Part of this functionality would be supported by C++ functions callable from
Lua (something like geom = MakeCentroid(geom)).
Getting data of related objects
For some use cases we need data from related objects that we want to add to the
currently processed object. We might, for instance, want the roles and tags
from all relations a way is a member of to be added to the way data in some
form.
This is solved by allowing Lua functions to
mark
the object)So in the mentioned use case the following would happen:
highway
tag, the Lua script tellsosm2pgsql: Please find all relations having this way as member and call
a Lua callback function later to re-process this way.
the input data and/or it gets relations out of the existing database) and
calls the Lua callback function with the data of the way from step 1 and
the relation data.
any table. In the mentioned use case it could add a PostgreSQL ARRAY
with relation tags to the way, or merge specific tags in a comma-separated
string, or whatever is needed.
Note that this design means that finding related OSM objects is restricted to
one level of relatedness. So you can bring together ways with their nodes and
relations with their member nodes, ways, or relations. But further "nesting"
of relations can't be resolved. This simplifies the design here considerably
and should support enough uses cases. More complicated processing has to be
done somewhere else.
Drawbacks of the design
The design has its problems:
If this is seen as a problem, we could always keep the C++ transforms and
users of that just don't have all possibilities then.
have performance impacts. From the experience with current osm2pgsql it
seems that this isn't too bad and we just have to be careful when
implementing things. The bottleneck is probably always going to be the
database anyway.
done in Lua, there might be more cases where you need a full re-import
after some changes. On the other hand: We want pre-processing where it
is cheap and easy instead of relying on complex SQL queries at rendering
time. This is a tradeoff that every user has to make themselves. We allow
lots of flexibility by having support for HSTORE and JSONB, so that's
always an option.
Advanced Issues
Extra data
For some uses cases it might be useful to allow extra data to be added to
the processing step. For instance we might want to have height data from
somewhere outside OSM and add it to geometries. Or we want to add to each
highway the information whether it is in a country with left- or
right-hand-traffic.
The design of the Lua API should keep this in mind and allow this extra data
to be made available somehow, but this is for a future step.
Prefiltering in C++
It might be useful for better performance to allow some kind of optional
pre-filtering in C++ before Lua functions are called. So we might want to
be able to define that a Lua function is only called for all OSM objects with
a
highway
tag. This way we can save us the expensive Lua calls for everythingthat's not a highway.
We can see how the performance looks before deciding whether we need this, but
it would change the configuration needed pretty fundamentally, so we have to do
this before finalizing the implementation.
Import vs. Update Mode and Separation of Passes
The Lua scripts should have the information whether we are in "create" or
in "append" mode. Or we have separate scripts for the modes.
We might also want to tell the Lua scripts whether they are run in pass 1
(when reading the data) or in pass 2 (when working on "marked" data).
It is currently unclear how the two passes needed for append mode and the
two passed needed for the more complex relationship processing will interact.
The text was updated successfully, but these errors were encountered: