multiple trains in/waiting at spmutex enter region after server restart #492
Comments
Yeah, I expect this, as the mutex locking state isn't saved to disk on shutdown. Maybe I should just do that and restore it after reload. |
Yeh, it's probably only really an issue because my server is set up to turn off overnight and when no players are connected for however many minutes, so it tends to turn on and off several times a day. |
As this is a much broader issue, I'll make fixing this part of a partial 'trains.groupdata' rewrite. Trains need to remember which signs they had activated when they unloaded or the server restarted. They should also remember any actions they had running at that time (launch actions). It's kind of an age-old bug that none of that works. I'll make saving the mutex zones the train has occupied or is waiting for a part of that metadata saving. |
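To make the idea above concrete, here is a minimal sketch of saving which mutex zones a train occupies or is waiting for, and reading that back after a restart. This is not the actual 'trains.groupdata' format: the class `MutexStateSnapshot`, its fields, and the line-based text encoding are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical snapshot of one train's mutex state; not a TrainCarts class.
final class MutexStateSnapshot {
    final String trainName;
    final List<String> occupiedZones; // zones the train has already entered
    final List<String> waitingZones;  // zones the train is queued for

    MutexStateSnapshot(String trainName, List<String> occupied, List<String> waiting) {
        this.trainName = trainName;
        this.occupiedZones = new ArrayList<>(occupied);
        this.waitingZones = new ArrayList<>(waiting);
    }

    // Encode as "name|zoneA,zoneB|zoneC".
    // Assumption: names never contain '|' or ','.
    String serialize() {
        return trainName + "|" + String.join(",", occupiedZones)
                + "|" + String.join(",", waitingZones);
    }

    static MutexStateSnapshot deserialize(String line) {
        // A limit of -1 keeps trailing empty fields, so empty lists round-trip.
        String[] parts = line.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        return new MutexStateSnapshot(parts[0], splitList(parts[1]), splitList(parts[2]));
    }

    private static List<String> splitList(String field) {
        if (field.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(field.split(","));
    }
}
```

On shutdown one such line would be written per train, and on load a train restored with a non-empty `occupiedZones` would reclaim those zones before any waiting train is allowed to proceed.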
traincarts-persistent-mutex-test-build.zip
I did some testing of my own and it seems to work fine. However, it's possible something isn't working right when tested on a larger scale. If this build for some reason does not work, don't revert to a Spigot build, as that will probably cause all trains to break. Downgrade instead to this build, which can handle the new group data format: |
It mostly appears to work for me with that test build, however I did have one train near an spmutex split apart on a point, leaving one cart on the wrong branch of the point:
I'll have a further play later to see if I can reliably reproduce it. |
You mean it split at a junction? So the junction toggled when it spawned in? |
It did, though I haven't seen that again. The issue I'm seeing regularly now is that a train that is already well into an spmutex zone, waiting at a station, thinks it's approaching the zone rather than being in it, with coordinates where it would have entered the zone. As a result, another train enters the zone in front of it. |
To clarify, the split train was at a junction, but it's unclear how the junction toggled. |
Are all these trains on the main "world" world, or is this a separate world that is loaded in by a plugin like Multiverse? I want to make sure, as worlds being loaded in later could have an effect. |
This is a Multiverse world that isn't called "world".
Perhaps OfflineWorld refers to a world that isn't loaded? It's unclear from the log when the worlds are loaded, presumably not until somebody joins. There is a The train at a station in the zone doesn't show any mutex zone in
I can also confirm that after a load with a train waiting on an spmutex over a junction (switcher sign), the train following it didn't switch the points the other way as it should have, and went the wrong way. This is the train that later may get split.
Something to note here is that it first goes over a conditional skip (skipping a destination, station, and animate doors, if the destination isn't the one that would be skipped). So it shouldn't be skipping them at all, but I'm wondering if it's re-evaluating the skip on reload (after hitting the destination sign and switching to the next destination on the route, but before moving fully past the skip sign, which would then make it skip when it shouldn't), and then it's weirdly skipping the switcher even though that's the next sign after the 3-sign skip. I just saw a train come out of the skip area, having apparently done the door animation, but not having stopped at the station, and been switched correctly after the skip (or perhaps the junction was already pointing the right way).
Sorry, that's all a bit hard to describe; maybe it gives some useful pointers. If it's easier for you to join again, just let me know. |
OfflineWorld is a BKCommonLib API that keeps persistent knowledge of worlds even when they unload and load again. It shows name=club, indicating the world is loaded at the time, so that's fine and not a concern. Does the specific sign whose mutex didn't persist show up in the logging with a Load path mutex? Skip state is another thing I've attempted to make persistent, so yeah, it sounds like that part isn't working right either. |
Found some issues, am working on fixing those first. They explain the switcher sign behavior you describe |
traincarts-test-persistence-v2.zip Unsure if it impacts the pathing mutex situation. Let's start here. Note that I did turn off that logging as it didn't add much value; I might add more specific logging if I need to dive deeper to fix the pathing mutex situation. On my server I didn't have this issue of the smart mutex zone disappearing on restart. One thing I did notice is that if a train is stopped inside a mutex and the server restarts, it briefly forgets what the forward moving direction was, and so starts scanning the track in reverse. This causes the pathing mutex to extend back in the opposite direction, "behind" the sign. I can probably add some sort of blocker at the sign to prevent it going too far, as that's annoying. |
Thanks, things I've noticed running this version:
|
Confirmed the station thing, the previous fix must have re-broken that. Let me fix that. |
traincarts-persistence-test3.zip The only issue that remains is the spmutex zone problem. You mentioned it doesn't happen immediately after restart. Do you mean you rule out that either train was inside the spmutex zone at the time the server restarted? Is this happening a long time after the server has booted, and all the time? Because if so, I have to look at something other than serialization/deserialization of the state. It's also worth checking whether that same issue occurs with https://ci.mg-dev.eu/job/TrainCarts/1537/ |
Thanks, station resume & skip appear to be fixed, I think. The spmutex zone problem: yes, it happens fully after server start. After it happened I destroyed one of the trains, it freed up, then it happened again soon after. I just saw it happen a few more times, and think I may have a clue:
It doesn't happen every time; I guess it depends on the timing of the wait polling while the train is stopped at the routing station. (I haven't tried 1537 yet.) |
I'll add the following (probably irrelevant, mutex already broken at that point): |
And one more observation: I've also seen when two trains are waiting at an spmutex, the zone becomes available, then one of them moves about a block towards the zone, then stops, and the other one proceeds into the zone instead. |
But are these issues that existed before (so in the Jenkins build I linked) too, or only since these changes? I did do a minor rewrite of some logic to integrate "unloaded" trains as part of the mutex logic, and it's possible this broke something. If the Jenkins build https://ci.mg-dev.eu/job/TrainCarts/1537/ is fine, then I know to review the commits... |
I thought it was fairly solid previously while the server was running, but I have just reproduced it on 1537 so I guess it must be a pre-existing issue, probably exposed by me increasing the frequency of trains to help reproduce the persistence issue. |
If it's pre-existing then I'll have to figure out what triggers it. Does this issue exist at all for Smart Mutex signs (not pathing)? Does a non-smart pathing mutex have the same issue? |
Decided to push the changes as official as it looks stable enough. Besides a minor change of tracking unloaded trains by properties instead of name, there is no difference with the version you tested before. https://ci.mg-dev.eu/job/TrainCarts/1538/ I don't see these problems with mutex zones myself. If you have time, maybe I can hop on the server again to see for myself what is happening and where. |
As it saves me some time, can you /tcc export this track and send the link? Just select all the nodes of this test track and do /tcc export. I can place the spawner signs myself. That way I know for sure its 100% replicated. |
sure, here it is: |
There was a sign mistake: at one position it set a destination "test2" but the destination sign says "test_2". Is this something you fixed yourself as well? Idk if it matters |
Whoops, yeh, that should be test_2; I don't think it makes much difference though. Between screenshot & export I had fiddled about with those signs to change storage carts from a route to just a plain destination, to see if the (note the reason I had different spawn rates was to avoid the situation where normal minecarts would never progress into the spmutex, maybe that's the same thing you're seeing). Trying just now, pretty soon it got stuck with both storage and normal train cart entering the zone... once... so yeh, I haven't nailed the reproducibility. I'll have another try later to nail it down. |
I can see some issues that occur in your setup, some of them related to my issue of one lane never progressing. It's a little difficult to fix but I'm working on it. |
One problem is that the storage carts (in my case) end up losing mutex lock once they drive from the spawn sign to the station. It seems like somehow they exit the mutex too soon. This can be verified by only spawning a single cart there and spamming /train debug mutex - the pathing mutex disappears once it stops on the station |
Yes, I've seen that a number of times, usually by the train already in the mutex zone saying that it's approaching that same mutex zone, since another train got in during the gap and says it has already entered it. |
Found one bug: https://ci.mg-dev.eu/job/TrainCarts/1540/ However, I'm not quite satisfied with the current 'forAllBlocks' logic, as it sees all vanilla rails as having two or more actual blocks due to them going from 0.0 to 1.0. So I need to fix that too. |
With this build vanilla track follows proper pathing mutex zone rules again. https://ci.mg-dev.eu/job/TrainCarts/1541/ Please let me know if at least the issue of two trains entering at the same time is fixed. The issue I found, where it seems one lane is always allowed to progress while the other isn't, isn't fixed yet. I have a fix, but as it involved a large rewrite of lots of things I want to test that more first. |
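One plausible reading of the 0.0 to 1.0 problem, reduced to a single axis: if both endpoints of a rail are floored inclusively, a rail spanning exactly one block also claims the neighbouring block it merely touches. The sketch below is an assumption about the cause, not the real 'forAllBlocks' code; `RailBlockCoverage` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical 1D illustration; not the TrainCarts 'forAllBlocks' implementation.
final class RailBlockCoverage {
    private static final double EPSILON = 1e-6;

    // Naive: floor both endpoints inclusively. A rail from 0.0 to 1.0 then
    // reports blocks 0 and 1, although it only touches block 1 at its boundary.
    static List<Integer> naiveBlocks(double from, double to) {
        double lo = Math.min(from, to);
        double hi = Math.max(from, to);
        return range((int) Math.floor(lo), (int) Math.floor(hi));
    }

    // Alternative: shrink the interval slightly so that touching a block
    // boundary does not count as occupying the block beyond it.
    static List<Integer> interiorBlocks(double from, double to) {
        double lo = Math.min(from, to);
        double hi = Math.max(from, to);
        if (hi - lo <= 2 * EPSILON) {
            // Degenerate (near zero length) segment: report a single block.
            int block = (int) Math.floor(lo);
            return range(block, block);
        }
        return range((int) Math.floor(lo + EPSILON), (int) Math.floor(hi - EPSILON));
    }

    private static List<Integer> range(int first, int last) {
        List<Integer> out = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            out.add(i);
        }
        return out;
    }
}
```

With this, `naiveBlocks(0.0, 1.0)` yields two blocks while `interiorBlocks(0.0, 1.0)` yields one; a segment that genuinely crosses a boundary, such as 0.5 to 1.5, still yields two blocks either way.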
Thanks! I noticed one case of two real trains in the same mutex region, but didn't spot how it happened. I haven't seen other issues despite watching the usual place where it happens with loads of trains spawning for quite a while. I'll keep them spawning and report back if there are still issues. |
Yeh, it's still happening, basically as before: the trains stop momentarily at a station sign to reverse direction, and apparently release the mutex for long enough for another train to enter the region from a different branch.
and the trains have routes that direct them back out of the station the way they came. |
The path creation only occurs the moment the train rolls over the pathing mutex sign. Once it has exited, that particular path is locked in and won't change with future rerouting. I guess the only way around this would be to take destination signs into account, but simulating all of that gets far too complex really quickly. Take skip signs and conditional switcher power signs for example. Would it be possible to set the next destination it goes to before it stops on the station? It might also help to add another pathing mutex sign near the station maybe? Or just use a traditional mutex zone with smart routing so it covers the entire area regardless of path the train takes. |
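The behaviour described above can be modelled roughly as a snapshot taken once at the sign. This is only a sketch of the concept: `PathingMutexSketch`, its methods, and the block names used with it are hypothetical and do not reflect the plugin's internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a pathing mutex; not the TrainCarts implementation.
final class PathingMutexSketch {
    private Set<String> lockedBlocks = Collections.emptySet();

    // Called once, at the moment the train rolls over the pathing mutex sign.
    // The path predicted from the routing at that instant is copied and kept;
    // nothing in this class updates it when the destination changes later.
    void onSignPassed(List<String> predictedPath) {
        lockedBlocks = Collections.unmodifiableSet(new LinkedHashSet<>(predictedPath));
    }

    // Blocks the train actually drives over that the snapshot never locked.
    List<String> unprotectedBlocks(List<String> actualPath) {
        List<String> result = new ArrayList<>();
        for (String block : actualPath) {
            if (!lockedBlocks.contains(block)) {
                result.add(block);
            }
        }
        return result;
    }
}
```

If the snapshot covers "junction" and "platform" but the train reverses at the station and leaves via a third block, that third block is reported as unprotected, which is where a second train could enter. A traditional mutex zone sidesteps this because it covers the whole area regardless of the path taken.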
OK, fair enough. I'm trying to work around it, but the tricky thing is that the route into the platform is the same as the route out, so there's no destination before the spmutex that would make it route into the platform rather than driving straight past it. I've tried putting a destination after the spmutex, and changing the destination sign with the station to also set the destination to route it out of the mutex zone, but it doesn't seem to make a difference to the zone shown by /train debug mutex. I was going to try explicitly setting the destination on the way into the platform and doing spmutex again, but I've noticed that the spmutex state (in /train status) now seems to consistently clear once it hits that first station pause to turn around, so I'm not sure that would be safe. I'll attempt to change it around a bit so that the exit from the platform doesn't pass over the same track where the train reverses direction, so that it can avoid a change of destination within the mutex zone... |
Info
Bug
Description
When the server is restarted, trains waiting on an spmutex are permitted to enter the mutex region even if there is already a train in the region.
Expected behaviour
Waiting trains should continue to wait as if the server hadn't been restarted.
Actual behaviour
Trains can enter an spmutex region already occupied by other trains.
Steps to reproduce
Queue up trains at an spmutex and /stop the server; on startup, watch the trains enter the spmutex region.
Here's an example that reproduces it:
Click the button twice: one cart stops at the station, one waits. Then /stop, restart the server, and the 2nd cart has entered the region and connected to the first.