OTP gives "INVALID_STOP_SEQUENCE" error for "overcomplete" GTFS-RT data #4702
Comments
Hi and welcome to OTP. I feel that this is definitely a data error and should be fixed in the GTFS-RT feed. Why does the feed contain stops that the train does not stop at? SKIPPED means that there was a plan to stop at this stop but for some reason that won't happen anymore. How is OTP supposed to tell the difference between a genuinely skipped stop and your example? Whilst the preferred option is to fix the RT input, dealing with messy feeds is just a fact of life that we as OTP developers often have to deal with. We decided a while ago that we are OK with fixing the data in OTP itself as long as we make it explicit and visible rather than automagic, perhaps through some kind of opt-in clean-up pipeline. If you're willing to help out with this, we would love you to join our dev meetings twice a week or the Gitter chat. In any case, even if you can't work on it, you can join as it would be good to have someone from the Dutch OV project to talk to. |
@leonardehrenfried this is not a data error. The intermediate stops are available in the realtime data, but are not part of the regular GTFS feed, as they are not passenger data. The case here is an intercity train that passes stops without calling at them. This has been the design since 2013, as part of the original OTP realtime implementation. From my point of view OTP should not judge the data other than to guard that time is increasing, and should just apply the data for the stop times it knows about. The original implementation in OTP1 actually replaced the entire journey pattern with the new sequence, which gives the same effect. |
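To make the "guard that time is increasing" idea concrete, here is a minimal sketch in Python. It is not OTP code: the function name and the representation of stop times as (arrival, departure) pairs in seconds since midnight are assumptions made for illustration only.

```python
def times_are_non_decreasing(stop_times):
    """Return True if no arrival or departure goes back in time.

    stop_times: list of (arrival, departure) tuples in seconds since
    midnight, ordered by position in the trip (hypothetical layout).
    """
    previous = None
    for arrival, departure in stop_times:
        for t in (arrival, departure):
            if previous is not None and t < previous:
                return False  # "time travel": this event precedes the last one
            previous = t
    return True


# A trip whose second arrival (30) lies before the first departure (60)
# would be rejected; everything else would be applied as given.
print(times_are_non_decreasing([(0, 60), (120, 180)]))  # True
print(times_are_non_decreasing([(0, 60), (30, 90)]))    # False
```

Under this view the check is the only validation performed; deciding which stops belong to the trip is left to the feed.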
Did it do that for trips of schedule relationship SCHEDULED? The code that produces this error was actually written by Jordan Verwer back in the day. We tried to preserve the original behaviour when refactoring all of this but maybe this case didn't have a test. I don't think it was changed on purpose. One thing to consider: the statistics about whether the update was successful are a recent addition. Is it possible that this never worked and you just didn't notice? In any case, we are very happy to talk about all of this and would love to have OV contribute to OTP2. |
Yes. At the time we basically defined the entire OTP Realtime interface to allow deviations to work natively.
Let's say I don't know if it ever worked with OTP2. It certainly did with OTP1, as having trip deviations was part of the funding, and a test set was created: https://github.com/plannerstack/testset You will recognise the person that made the last commit :-)
I'll add @sven4all to this discussion too. To add some rationale for why the times are added in the first place: some applications draw vehicle positions on a map by interpolating between stations (since not all trains in The Netherlands report locations). Having the intermediate times available allows linear interpolation at a finer granularity. In addition, you can do cool things with network analysis. So yes, it is a use case outside the realm of "travel information", but it is certainly not "inconsistent" with the generic GTFS feed, if only these stop_times would be updated. |
Hi Leonard, Thanks for the openness and quick response! I would definitely like to join one of those meetings, however, as you have probably noticed, @skinkie is much more well-versed in the Dutch public transport data world, so I'm not sure if I could add anything that he hasn't mentioned yet. Anyhow, I'll see if I can join some day so that you could ask some more questions! |
This is pure speculation, but you could try the following: In #4424 I introduced these typed errors to replace the flood of logs. You could try to find out if it was this PR that broke your use case. This might give you an indication of how to revert it. |
@leonardehrenfried I just tried reverting to 2.1.0 (which according to GitHub didn't have the commit you mentioned), and although it doesn't neatly specify the "success percentage", I still get flooded with quite a large number of errors. It's difficult to say if it's fewer or more, but seeing that the OTP Webapp doesn't specify "too late" or "on time" for any trains, I feel like it did not work as expected. So I guess there's something more behind it than that commit. |
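As an illustration of the interpolation use case, here is a minimal Python sketch. The function name, the (time, (lat, lon)) tuple layout and the coordinates are all hypothetical; it only shows why extra passing times between stations give a finer-grained position estimate.

```python
def interpolate_position(t, passages):
    """Linearly interpolate a (lat, lon) position at time t.

    passages: list of (time_seconds, (lat, lon)) sorted by time, one entry
    per station or passing point (hypothetical layout).
    """
    if t <= passages[0][0]:
        return passages[0][1]
    for (t0, p0), (t1, p1) in zip(passages, passages[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return p1
            f = (t - t0) / (t1 - t0)  # fraction of the segment covered
            return (p0[0] + f * (p1[0] - p0[0]),
                    p0[1] + f * (p1[1] - p0[1]))
    return passages[-1][1]  # after the last known passage


# Made-up coordinates: halfway in time means halfway along the segment.
passages = [(0, (52.0, 4.0)), (100, (52.2, 4.4))]
lat, lon = interpolate_position(50, passages)
```

With only the passenger stops available, each segment spans a whole station-to-station leg; every intermediate passing time splits that leg into shorter, more accurate segments.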
After some further research, it seems like some of the errors come from the following situation: Say I've got Stop A, which has track 1 (which is subdivided into 1a and 1b). In my scheduled data (stop_times) this trip is designated to start at track 1. However, the GTFS-RT stream gives the more specific track that is now known, track 1a. This results in OTP immediately not being able to find the first stop, as it now has a different ID than the originally planned stop. This makes OTP throw a STOP_SEQUENCE_ERROR. The problem is that the match at
is not found. This is because the matching only looks at the stop sequence or the stop_ids; without a stop sequence, in the situation described above, the stop_id matching at
fails. However, this kind of update to a new, more specific stop as the first stop is within the GTFS-RT spec. See: As long as the new stop (platform) is under the same "parentstation", it is still valid. It seems like OTP does not handle this case. EDIT: |
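The parent-station rule described above can be sketched as follows. This is a hypothetical Python illustration, not OTP's matching code: the stop ids, the `parent_station` dictionary and both function names are invented for the example.

```python
def same_station(scheduled_stop_id, realtime_stop_id, parent_station):
    """True if both ids are identical or share a parent station."""
    if scheduled_stop_id == realtime_stop_id:
        return True
    p_sched = parent_station.get(scheduled_stop_id)
    p_rt = parent_station.get(realtime_stop_id)
    return p_sched is not None and p_sched == p_rt


def match_updates(scheduled_stop_ids, realtime_stop_ids, parent_station):
    """Walk both lists in order; return {scheduled_index: realtime_stop_id}."""
    matches = {}
    position = 0
    for rt_id in realtime_stop_ids:
        for i in range(position, len(scheduled_stop_ids)):
            if same_station(scheduled_stop_ids[i], rt_id, parent_station):
                matches[i] = rt_id
                position = i + 1  # never match backwards
                break
    return matches


# Hypothetical ids: track 1 and track 1a both belong to station "A".
parent_station = {"A:1": "A", "A:1a": "A", "A:1b": "A", "B:2": "B"}
print(match_updates(["A:1", "B:2"], ["A:1a", "B:2"], parent_station))
# {0: 'A:1a', 1: 'B:2'}
```

A strict id comparison would fail on the first stop here; comparing parent stations lets the update through while still rejecting a platform at a different station.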
I think the way how tripUpdates are applied should be configurable per feed. Andrew and Jorden agreed before we need to establish some kind of quality statement per feed like "this feed is complete and allows trips to be replaced" versus "this feed only makes next stops delays available which should then be propagated for all stops to make them coherent again without time travel". |
After doing some more digging, it seems like the "Replacement" "ScheduleRelationship" case comes very close, as it just cancels the old trip and dumps all new stops into the graph. However, it seems like "REPLACEMENT" is deprecated in the GTFS spec? |
The deprecation is a political move that forces a (complex) specification change. |
Even though it's deprecated, the It's worth trying if it still works. |
Hi Leonard, I tried customizing our GTFS-RT feed to make all trips "MODIFIED" instead of "SCHEDULED". However, although there were far fewer import errors, it seemed like OTP was actually removing most of the trips from the graph. As soon as OTP processed the pb file, the routing engine wasn't able to find any train trips anymore. Something seems to have gone wrong there, but I can't put my finger on what. Anyhow, for my personal implementation, I have now just created my own custom Dutch GTFS-RT feed for trains, which works almost perfectly with 95% import success. I noticed the other 5 percent is all international trains with a different stop sequence, so that is still something that would be worth figuring out. Anyhow, I can confirm that the problem with the official Dutch GTFS-RT feed is both the lack of sequence numbers and the many "unofficial" (skipped) stops. I've removed these from my own feed and, as stated before, get 95% import success. I guess we have some digging to do on our side as well to make the official feed better! |
Good to hear that you managed to get the errors down to a low number. BTW, you're welcome to ask even very detailed questions in our Gitter room: https://gitter.im/opentripplanner/OpenTripPlanner |
@koch-t ^^^ |
For the remaining errors, could it be due to stops that aren't part of the original feed (with the same id) and not in the graph as a result? That would prevent the update from being applied. |
The biggest problem (for now) seems to be the following: |
For a high-quality feed, is there ever a reason not to overwrite everything? |
@Arilith BTW, you don't need the sequence number. You can also just use the stop id. (Of course if you have a circular route this will not work, but these are pretty rare, particularly for trains.)
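A short Python sketch of why matching by stop id alone breaks down on circular routes. The helper names and stop ids are hypothetical and only illustrate the ambiguity; they do not reflect how OTP resolves stops internally.

```python
def index_by_stop_id(scheduled_stop_ids):
    """Map each stop id to every position where it occurs in the trip."""
    positions = {}
    for i, stop_id in enumerate(scheduled_stop_ids):
        positions.setdefault(stop_id, []).append(i)
    return positions


def resolve(stop_id, positions):
    """Return the single matching position, or None if unknown or ambiguous."""
    hits = positions.get(stop_id, [])
    return hits[0] if len(hits) == 1 else None


linear = index_by_stop_id(["A", "B", "C"])
loop = index_by_stop_id(["A", "B", "C", "A"])  # trip returns to its origin

print(resolve("B", linear))  # 1
print(resolve("A", loop))    # None: "A" occurs at positions 0 and 3
```

On a linear trip every stop id resolves to exactly one position, so the sequence number adds nothing; only when a stop is visited twice is it needed to disambiguate.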
If your input data is good, I don't see a reason why you should not. |
Is it an idea to have a configuration option to set the default behavior? |
I think it is. We decided a while ago that OTP is now a sort of enterprise software and we don't shy away from having lots of configuration options. We also have automatic generation of the documentation for these config options. We prefer that over everybody having their own fork and implementing their custom logic downstream. You could also test/fix the REPLACEMENT schedule relationship again which does what you want. |
Also, the way to define platform changes, such as from 1 to 1a, was recently adopted in the official GTFS-RT protobuf schema in google/transit#219. It is not currently implemented in OTP, but should be quite straightforward for you to implement. |
I added the sequence number as the current way of dealing with platform changes does not work with only stop ids. For example, when we have a train that is planned from Amsterdam Centraal platform 4 and is later updated to depart from platform 4b after more information has become available, that trip would instantly fail to be inserted, as the stop ids don't match anymore (I debugged OTP step by step and found this out, not sure if that's correct behaviour). The same problem applies to stops that are on the route and have changed platforms. This creates a massive issue, as platform changes are quite common. I've only re-created our trainUpdates.pb until now, and I feel like the 10% of errors in the tripUpdates.pb from other vehicles is mostly due to the same problem described above. I could try removing the stop sequence for international trains, but it also sometimes happens (very rarely) with local trains that the planned sequence doesn't line up with the actual sequence. (Think of a train that was planned to go through station X, but there was a switch issue, which made it go through stations Y and Z, which were not even in the original planning.) Another problem @skinkie and I actively discussed is how to deal with "TVV" (Train Replacing Transport). Currently, if I have a trip that is not in the pre-defined GTFS definition, but my realtime data has it, it cannot be inserted into the OTP graph as the tripId (and/or serviceId) won't be found, as even the handleAddedTrip method checks for these properties. This is quite a big problem for scenarios where a big disruption happens and non-planned buses/replacement transport are activated. What is your view on this? Is there a "by the book" solution for this? About changing everything to "Modified", this would (I think) only solve the stop sequence errors for the international train problem I described earlier (though, as described, there's no clear "I'm wrong" sign in our dataset). 
But for all new trips, this also won't work due to the problems with missing trip information. It would be nice to have a way of adding a whole new trip (with headsign, serviceId, etc) with a linked "originalTripId" so that it could still be referenced to the original (now cancelled trip). Now I'm very new to GTFS-RT, so maybe this is already in the works, but this is what I experienced over the last week. IMO, GTFS(RT) is quite limited, especially in comparison to the (realtime)data we work with nationally. |
For adding a completely new route you will be interested in this PR: #4667 I'm using it for dynamically adding carpool "routes" but technically that is the same as an emergency rail replacement service. It adds the ability through a protobuf extension, but if more than a single organization is using it, it would be worth getting it into the official spec. Also, @skinkie has lots of experience doing just that. |
Oh that PR looks good! Thanks for the heads up. As soon as it is accepted, I'll definitely build that version and try it out. |
Experimental feature. Would not be required if data is 'just' replaced. |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days |
@Arilith did you have a patch? |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days |
Keep open. |
I have not patched OTP directly (yet). I have been looking into #5230, to fix the current issue of there not being an "official" way to have platform changes under the same station using the SCHEDULED relationship, even though according to the spec this should be possible. However, this seems to be a little more complex than I have had time to develop, so that'll take some time. Currently I'm using a modified GTFS-RT stream that makes use of the (sadly) deprecated MODIFIED schedule relationship. For "ritbeeldmatchende (replacement / extended)" trains I simply use the CANCEL / ADDED combination. This generally results in a fairly stable way to make OTP consume our Dutch InfoPlus system, with about 99-100% import success, except for very specific cases. I can always try to take a look at where exactly our official (openOV/OVAPI) GTFS-RT streams go wrong and try to patch that out, but it seems like that would require a substantial change in the way OTP deals with SCHEDULED trips, as that currently does not allow any kind of stop modification (as far as I've seen). I'm having a discussion with Leonard today, so if I get any more information about this, I'll update it here. |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days |
Keep open. |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days |
Closure is stupid. |
With that kind of language you're not helping your cause. Resources are limited and without developers being funded to look into the code, there is no point in pretending that it will be magically fixed. |
@leonardehrenfried there is no intelligence in github-actions, it cannot be offended. Let's not forget this used to work in OTP1. Closure does not make issues disappear. That is a really stupid management paradigm. |
The good news is that you can keep using this non-standard feature in OTP1 forever. |
Expected behavior
The specified trip updates are inserted correctly. Maybe give a warning about not-found stops.
Observed behavior
An INVALID_STOP_SEQUENCE error is thrown for many trips, which causes OTP to fully ignore the message, even though there is valid data.
Version of OTP used (exact commit hash or JAR name)
otp-2.2.0-shaded.jar
Data sets in use (links to GTFS and OSM PBF files)
http://gtfs.ovapi.nl/nl/gtfs-nl.zip
Stripped version of netherlands-latest.osm.pbf
https://download.geofabrik.de/europe/netherlands-latest.osm.pbf
Command line used to start OTP
java -Xmx8G -jar otp-2.2.0-shaded.jar --load .
Router config and graph build config JSON
Router config:
Steps to reproduce the problem
Import the specified GTFS and use the given router config. Errors like these will be thrown:
I have already contacted our public transport authority and spoke with a contact. The "problem" is that the trip updates for our trains overspecify stops (they even include stops which are not in the stop_times file).
Let's say we have a train from point A to point E. Along the way the train passes points B, C and D. The train stops at point C, but not at points B and D. Our GTFS-RT feed will still include points B and D, marked as "skipped" or to be ignored. This is probably the cause of the above errors.
Now, the GTFS-RT specification says the stops should be in order. The order is correct here; there are just "too many" stops. It would be nice if OTP could ignore these errors and still update the graph, as currently over 50% of the updates are ignored.
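The behaviour I have in mind could look roughly like the Python sketch below. It is only an illustration of the A to E example, not a patch for OTP: the function name and the dictionary layout of the updates are assumptions, and the stop ids are placeholders.

```python
def split_updates(scheduled_stop_ids, stop_time_updates):
    """Separate updates for scheduled stops from "extra" passing points.

    scheduled_stop_ids: stop ids from stop_times.txt for this trip.
    stop_time_updates: list of dicts with at least a "stop_id" key
    (hypothetical layout), in feed order.
    Returns (applicable_updates, ignored_stop_ids).
    """
    known = set(scheduled_stop_ids)
    applicable = [u for u in stop_time_updates if u["stop_id"] in known]
    ignored = [u["stop_id"] for u in stop_time_updates
               if u["stop_id"] not in known]
    return applicable, ignored


scheduled = ["A", "C", "E"]  # the train only calls at A, C and E
updates = [
    {"stop_id": "A", "delay": 0},
    {"stop_id": "B", "relationship": "SKIPPED"},  # passing point
    {"stop_id": "C", "delay": 60},
    {"stop_id": "D", "relationship": "SKIPPED"},  # passing point
    {"stop_id": "E", "delay": 120},
]
applicable, ignored = split_updates(scheduled, updates)
print([u["stop_id"] for u in applicable])  # ['A', 'C', 'E']
print(ignored)                             # ['B', 'D'], candidates for a warning
```

The relative order of the remaining updates is unchanged, so the "stops must be in order" requirement still holds after filtering.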
Example trip update with error:
Looking up these stops in stop_times.txt gives:
So the specified stop of "2423251" in the GTFS-RT is not in the stop_times file as the train won't actually stop there, but simply passes it.
Kind regards,
Tristan
See attached files for some more data.
OTP Error.txt
TrainUpdatesErrors.csv
TrainUpdates.txt