Define standard: segments & intersections #66
Comments
To clarify what we've done thus far: a 'segment' is either an intersection segment or a non-intersection segment. At the moment they are stored separately:
The current setup doesn't need to be the final say on how things are stored!
With regard to connecting back to OpenStreetMap: it's not currently done, but I agree that it needs to be. I've been scoping this out, and here's what I've found for non-intersection segments. OpenStreetMap has a whole bunch of extraneous nodes that we don't care about, so we're using the osmnx package to simplify the road network. Osmnx has two main ways to simplify the network:
There are a number of reasons osm ids can change along a road even where there is no obvious intersection:
It seems to me that only the first case, and possibly only when there's an actual traffic light, should result in two segments separated by an intersection. But it might be interesting to eventually look at some of the other nodes: for example, is the likelihood of a crash higher where the number of lanes changes? Thus, a non-intersection segment (as this project has historically defined them) can have more than one osm way id. So I think we need to decide whether we just want to store a list of osm ids for each non-intersection segment, or whether we want to redefine non-intersection segments to correspond 1-to-1 to an osmid. The latter would mean that we'd have an additional set of nodes represented somewhere that are not intersections. I can see pluses and minuses to redefining them:
For intersection segments, we will likely always want each intersection to correspond to one or more osm nodes. This is because an intersection is by definition all the portions of ways that fall inside a bounding box around a set of close-together intersection nodes.
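To make the "list of osm ids per segment" option concrete, here is a minimal sketch of how the links back to OSM might be stored. The class names, field names, and ids below are hypothetical and not taken from the project's code; the only assumptions drawn from the discussion are that a non-intersection segment can reference several osm way ids and that an intersection references one or more osm nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NonIntersectionSegment:
    # Hypothetical record: one segment may span several OSM ways.
    segment_id: str
    osm_way_ids: List[int] = field(default_factory=list)


@dataclass
class IntersectionSegment:
    # Hypothetical record: one intersection covers one or more OSM nodes.
    segment_id: str
    osm_node_ids: List[int] = field(default_factory=list)


def index_by_osm_way(segments: List[NonIntersectionSegment]) -> Dict[int, List[str]]:
    """Build a reverse lookup from OSM way id to the segments built from it."""
    index: Dict[int, List[str]] = {}
    for seg in segments:
        for way_id in seg.osm_way_ids:
            index.setdefault(way_id, []).append(seg.segment_id)
    return index


# Made-up ids, purely for illustration.
segments = [
    NonIntersectionSegment("001", [111, 112]),
    NonIntersectionSegment("002", [112]),
]
intersection = IntersectionSegment("i1", [9001, 9002])
way_index = index_by_osm_way(segments)
```

With a reverse index like this, a change to a single OSM way could be traced to every segment built from it, which is one way to approach the "detect and respond to changes in the source" question.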
Here are the features I'm currently seeing that have different names for the same thing:
As discussed in the channel: if we're going to implement a segments data standard to make viz development much simpler and more consistent, it makes sense for it to be done when the segment data is being created, which is quite early on in the pipeline process (data_generation). This phase retrieves OSM segments, with optional overloading taking place if a city-supplied map is available (@j-t-t have I got this right?). It writes the segment data to file as part of make_canon_dataset, which in turn is used to train the model and generate predictions. I'm happy to have a go at standardizing the segments and following the pipeline through to update any references/uses of the current feature names, but I wanted to check first whether those who actually wrote these scripts (@bpben @j-t-t @andhint @alicefeng) had any concerns about this issue or wanted to handle specific parts? Thanks.
This is an interesting question. The reason we did the Boston-specific features was that the model performed better with them than with the osm features. I understand the desire to standardize, certainly from a viz perspective, but also so that cities can share information. But I do think there's value in cities being able to use their own features, and that's certainly something we told Boston we could do when we met with them in July. So if your plan is just to map the features that we know about (speed limit and F_F_Class) to the osm features, that seems fine. But I don't think we should drop existing Boston features, or prevent other cities from adding their features. There's not really any overloading happening. Right now, the only reason Boston doesn't use osm features is that we specify which features to use in config_boston.yml. We could use both easily enough. If we think that Boston's speed limit/F_F_Class info is more useful or informative than osm's info, we might want to switch to overriding the osm values, but it seems worth looking into before doing so. If we do choose to override osm data, we should probably make this configurable, so that cities can specify not just which features to use, but which features override other features.
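As a rough sketch of the configurable-override idea, assuming the override rules are available as a simple mapping (for instance loaded from a city config file): the function name, the feature keys (`osm_speed`, `lanes`, `SPEEDLIMIT`) and the values are hypothetical, and this is not how the pipeline currently works. City features are kept alongside the osm ones, and a city value replaces an osm value only where an override is explicitly listed.

```python
def resolve_features(osm_features, city_features, overrides):
    """Combine osm and city features for one segment.

    overrides maps an osm feature name to the city feature name whose
    value should replace it (hypothetical config shape).
    """
    resolved = dict(osm_features)
    # City features extend the feature set; they never silently replace
    # an osm feature that happens to share a name.
    for name, value in city_features.items():
        resolved.setdefault(name, value)
    # Explicit overrides: use the city value where one is present.
    for osm_name, city_name in overrides.items():
        value = city_features.get(city_name)
        if value is not None:
            resolved[osm_name] = value
    return resolved


# Made-up values, purely for illustration.
osm = {"osm_speed": 30, "lanes": 2}
city = {"SPEEDLIMIT": 25, "F_F_Class": 4}

both = resolve_features(osm, city, overrides={})
overridden = resolve_features(osm, city, overrides={"osm_speed": "SPEEDLIMIT"})
```

With an empty override mapping both sets of features are kept untouched, so "use both" and "city overrides osm" become two settings of the same config rather than two code paths.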
I agree on having a configurable setup where each city can optionally provide its own features and, where necessary, override OSM features. I like the idea of saying to cities "bring your own features", but I wonder if actual use of those features needs to wait until a release after 2.0. In the short term at least, when generating predictions I can't see that we're going to make use of features beyond a fairly common set: speed, signals, lighting, lanes etc. (@bpben please let us know if this isn't the case). These are all likely to be available via OSM, but if they aren't for a city and the city has them, we can use those. Either way, we can map their feature names to a common vocabulary, which has immediate pay-off for viz development and should only require minimal work to integrate into the existing modelling work (the longer we leave it, the worse this could get).
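Mapping feature names to a common vocabulary could be as small as a rename step applied when the segment data is written. The sketch below is only an illustration: the mapping, the standard names (`speed_limit`, `functional_class`) and the helper are hypothetical, and unknown keys are passed through unchanged so that city-specific features survive.

```python
def standardize_names(record, name_map):
    """Rename known feature keys to the shared vocabulary.

    Keys without an entry in name_map are kept as-is. Two source keys
    mapping to the same standard name is treated as an error rather than
    letting one silently overwrite the other.
    """
    out = {}
    for key, value in record.items():
        new_key = name_map.get(key, key)
        if new_key in out:
            raise ValueError(f"duplicate feature after renaming: {new_key}")
        out[new_key] = value
    return out


# Hypothetical mapping from a city's column names to standard names.
BOSTON_NAME_MAP = {"SPEEDLIMIT": "speed_limit", "F_F_Class": "functional_class"}

segment = {"SPEEDLIMIT": 25, "F_F_Class": 4, "custom_city_feature": 1}
standardized = standardize_names(segment, BOSTON_NAME_MAP)
```

Raising on a collision is a deliberate choice here: if a city feature and an osm feature both claim the same standard name, that is exactly the case where the override question needs an explicit decision.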
Oh definitely cities should be able to add in whatever custom data they have. But I think that should extend the list of features rather than, say, drop all of the osm features and only use city-provided data. If cities can entirely pick and choose what they want in their model that's going to limit how much the viz can display in a human-friendly manner. At any rate, these are the segment features the viz is currently using:
Great, I think we have the beginnings of a segment standard then. I'll try and get something together before the next meeting, where we can discuss this further and get Ben's input too.
Okay, as far as I see it there are two straightforward things to do:
Echoing @j-t-t and @alicefeng on this: we definitely need to allow cities to add their own custom features. I think that's what @j-t-t is working on with the point-based feature additions. I think it would also help me and other people working on the modeling to test out new features. Also: I don't see a problem with including both in the model. Generally, features that carry the same information are an issue for parametric models (e.g. logistic regression), but our logistic model testing at the moment doesn't really take that into account. You could imagine that a segment's speed according to Boston and its speed according to OSM both provide some kind of interesting information for the model. So, adding a third option to @j-t-t's: just include everything. And maybe do smarter feature selection.
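One simple way to see how much two versions of the same feature overlap, before deciding whether to keep both, is to look at their correlation across segments. This is a minimal sketch with made-up speed values; it is not part of the project's model testing, and Pearson correlation is just one possible check, not a full feature-selection method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up speed limits for the same five segments from two sources.
city_speed = [25, 30, 30, 45, 55]
osm_speed = [25, 30, 35, 45, 50]

r = pearson(city_speed, osm_speed)
```

A value of `r` close to 1 would suggest the two columns are nearly redundant for a logistic regression, while a noticeably lower value would suggest each source carries some information the other lacks.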
#318 should address this by allowing customization of features based on points, and we have other capabilities for pulling in other maps. I do still think we need some standardization here, but I will remove "help wanted", as it's more a job for a core group of volunteers.
Segments & intersections are now to be based off OSM data (nodes, ways and relations) to allow any city with OSM coverage to onboard into the project easily.
We need to define an initial standard for how the segments & intersections will be stored once built. Some questions that have been raised and may still be open:
what are the required & optional characteristics of a segment/intersection, including data typing?
are segments/intersections stored separately, or together? If stored together, how do we differentiate?
are we storing foreign key data for segments/intersections that connects them back to the OSM nodes/ways/relations from which they were built, so that we can detect & respond to changes in the source and possibly pass information back to OSM at some point?
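As a starting point for discussion, the questions above could be answered with a small typed schema plus a validator. Everything in this sketch is an assumption rather than an agreed standard: the field names, the required/optional split, and the `kind` field used to tell intersections from non-intersection segments when they are stored together.

```python
# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"id": str, "kind": str, "osm_ids": list}
OPTIONAL_FIELDS = {"speed_limit": int, "lanes": int}
# Discriminator values used when both record types share one store.
KINDS = {"intersection", "non_intersection"}


def validate_segment(record):
    """Return a list of problems found in one segment record."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing required field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"{name} should be of type {expected.__name__}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in record and not isinstance(record[name], expected):
            errors.append(f"{name} should be of type {expected.__name__}")
    kind = record.get("kind")
    if isinstance(kind, str) and kind not in KINDS:
        errors.append(f"unknown kind: {kind}")
    return errors


# Made-up records, purely for illustration.
good = {"id": "001", "kind": "non_intersection", "osm_ids": [111, 112], "lanes": 2}
bad = {"id": "i1", "kind": "roundabout", "lanes": "two"}

good_errors = validate_segment(good)
bad_errors = validate_segment(bad)
```

Here `osm_ids` doubles as the foreign key back to the OSM ways or nodes a record was built from, so a single record shape covers both storage questions.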