Unified Variable Rate Service Queue #15
I added DeadlineQueue.lua as a gist to illustrate the approach.
I was going to try something similar once I had a base big enough to encounter performance issues. My idea is based on this thread from the item network mod. The gist is that you have a fast queue and a slow queue (in the case of ARR these would be a fast and slow queue for each of the existing queues) and move entities between them based on whether or not they are active. I think I got as far as having the handlers return true if they did some work. This would of course have the issue of entities taking a while to "wake up" after being placed in the slow queue, as well as requiring lots of tuning to figure out the optimal rates for the queues. The deadline queue would probably solve this, as the entity handlers would control when their next service happens.
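For context, the fast/slow idea I had in mind was roughly this (a hypothetical sketch, not actual mod code; the names are made up):

```lua
-- Hypothetical sketch of the fast/slow queue idea: a handler returns true if it
-- did some work, and idle entities get demoted to the slow queue after a while.
local IDLE_DEMOTE_THRESHOLD = 5  -- demote after this many services with no work done

local function service_entity(entry)
  local did_work = entry.handler(entry.entity)
  if did_work then
    entry.idle_count = 0
    return "fast"
  end
  entry.idle_count = (entry.idle_count or 0) + 1
  if entry.idle_count >= IDLE_DEMOTE_THRESHOLD then
    return "slow"
  end
  return "fast"
end
```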
It does: the queues are constant-latency queues, meaning the service rate will increase if there are more entities, such that every entity is serviced at a constant interval (see factorio-auto-resource-redux/auto-resource-redux/src/EntityManager.lua, lines 124 to 130 at 28cb234).
This produces something akin to ordered dithering, so servicing 10 entities with an interval of 20 ticks produces an evenly spread update pattern (see the rough sketch at the end of this comment).

I'm not sure how your deadline queue idea would deal with spreading services across updates. I think we'd need to limit the max number of updates per tick based on how many entities there are in the queue - the idea being that if there were, for example, 1000 entities that want to update now, we delay some of them to prevent lag spikes, and over time the update rate would end up somewhat consistent.

Anyways, I'd be happy to merge your changes if they bring a measurable performance improvement to the mod.
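One simple way to get that kind of dithered spreading (purely illustrative, not the actual EntityManager.lua logic) is to give each entity a fixed slot within its interval:

```lua
-- Illustrative only: spread services by assigning each entity a slot in its interval.
local INTERVAL = 20  -- ticks between services for this queue

local function entities_due(entities, tick)
  local due = {}
  for i, entity in ipairs(entities) do
    -- each entity is serviced on the ticks matching its slot, so the work
    -- spreads out across the interval instead of bunching up
    if (tick % INTERVAL) == (i % INTERVAL) then
      due[#due + 1] = entity
    end
  end
  return due
end
```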
The number of entities serviced per tick is limited, so the load will naturally spread out. Let's go with a real example. Assume that the entities-per-tick limit is set to 100 and I have a map with 1000 assemblers. The max entities to process per tick needs to be user adjustable to match the performance of the machine. On my machine, the UPS starts to drop around 100 entities per tick; I'd expect most machines to start lagging a bit around 50 entities per tick. The default for Item Network was 20 per tick -- a bit conservative, but that was plenty with my variable timing modifications. With ARR, an assembler has a desired service period of 120 ticks. There are two ways to support more entities -- either increase the per-tick limit (at the cost of UPS) or extend the service period beyond 120 ticks. This scheme doesn't really have benefits until the service period is extended. Or rather, it has a consistent service rate until overload.
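To spell out the capacity implied by those numbers (my arithmetic): with a limit of 100 entities per tick and a 120-tick service period, the queue can keep up with 100 * 120 = 12,000 entities before deadlines start slipping, and 1000 assemblers only need about 1000 / 120 ≈ 8.3 services per tick on average, so the period would only need to stretch once the entity count grows far beyond that.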
I see what you mean about the dithering. That solution is quite clever. I'll admit that I had to plug it into a spreadsheet before seeing how it worked. I don't know if that could be used if the service rate is variable. I agree that the number of serviced entities per tick should scale with the total number of entities.
It could be a running average, but that would involve keeping a table of the previous N per-tick service counts; for a similar computational cost it could instead calculate the average of the next N buckets in the queue. It is hard to tell how such an algorithm would work in practice, though, so I built a crude simulation to see how it would look when started with 1000 entities scheduled at tick 120. These all use my adaptive limit idea (with the weight of the new term being 1/120), which is plotted alongside the average of the previous 120 ticks. It seems to follow the average reasonably well for being a single calculation.
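For reference, the adaptive limit above amounts to an exponential moving average along these lines (a minimal sketch; the names are illustrative):

```lua
-- Minimal sketch of the adaptive limit: an exponential moving average where
-- the newest per-tick sample has weight 1/120.
local WEIGHT = 1 / 120

local function update_adaptive_limit(current_limit, serviced_this_tick)
  -- blend the latest per-tick count into the running estimate
  return current_limit + (serviced_this_tick - current_limit) * WEIGHT
end
```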
regular_short.mp4
overflow_to_avg.mp4
This will take quite a while to stabilise, and I don't really like how it constantly has to run behind by a couple of ticks (because the only smoothing mechanism is delaying entities) - so what if I tried the opposite?
extra_to_avg_bug.mp4
My fix for this was to reduce the number of extra entities that can be serviced by how far in the future the next set of entities is: if we have extra capacity for 50 services, but the next set of services is due in 10 ticks, then we can only service 40 extra entities (sketched at the end of this comment). It seems to solve the issue reasonably well.
extra_to_avg.mp4
Now the only problem is that because it only focuses on increasing the number of items in a queue to meet the average, small spikes above the average take a while to flatten out. Which brings me to the last version of what I came up with.
combined.mp4
The last two do have the downside of always running a couple of ticks earlier than they should, but it's more desirable than being late.

These are pretty data visualisations, but how long do they actually take to stabilise? I plotted the standard deviation of the number of entities processed by each of the algorithms over time (a sliding window of 120, for 3600 ticks). The combination algorithm does cause faster smoothing but doesn't achieve a better final result, so it's probably better to stick with something similar to 3 (process extra items to meet the average). Not sure how it could be tweaked to get rid of the small hump that can be seen at the end of the video - maybe incorporate the maximum value into the average somehow? Or perhaps it's good enough for now.

This is only a simulation of one type of entity, though, so I don't know how it will fare with multiple types with changing delays in a closer-to-reality simulation. I also had the idea to plot each entity separately in the colours of the rainbow so that any shuffling would be visible, but didn't have the time to try that out.

I put the code used to generate these in a gist: https://gist.github.com/udf/8b156d8c9fc6f7ae0778a2194aad9108
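The extra-capacity clamp mentioned above boils down to something like this (a minimal sketch with made-up names):

```lua
-- Sketch of the extra-capacity clamp: the number of entities pulled forward is
-- reduced by how close the next scheduled services are.
local function extra_services_allowed(spare_capacity, next_due_tick, now)
  local ticks_until_due = math.max(next_due_tick - now, 0)
  return math.max(spare_capacity - ticks_until_due, 0)
end

-- e.g. spare capacity of 50 with the next services due in 10 ticks
-- allows only 40 entities to be serviced early
```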
Really interesting visualization. Super-helpful.

The 'entities-per-tick' is the true average over the last 180 cycles plus a factor based on how far behind we are.

```lua
function EntityManager.on_tick()
  local ra = global.deadline_average
  if ra == nil then
    ra = RunningAverage.new(180)
    global.deadline_average = ra
  end
  local now = game.tick
  local dlq = global.deadline_queue
  -- entities per tick = running average plus a catch-up term, clamped to sane bounds
  local ept = Util.clamp(
    math.ceil(RunningAverage.get_average(ra) + TICK_FACTOR * (now - dlq.deadline)),
    SERVICE_PER_TICK_MIN, SERVICE_PER_TICK_MAX)
  local processed_cnt = 0
  for _ = 1, ept do
    local entity_id, _ = DeadlineQueue.next(dlq)
    -- ... handle entity and re-queue, increment processed_cnt ...
  end
  RunningAverage.add_sample(ra, processed_cnt)
end
```

The average length and tick factor can be adjusted, but average length=180 and TICK_FACTOR=1 seem to work well. My DeadlineQueue implementation pulls up to that many entries off the queue for each tick. It 'falls behind' when the deadline for the current queue is in the past. The tick factor helps it 'catch up' without messing up the running average too much.

The running average is computed using the usual circular queue implementation, where the CPU cost is constant regardless of the number of samples:

```lua
local RunningAverage = {}
function RunningAverage.new(size)
  if size < 1 then
    size = 1
  end
  -- samples live in the array part (1..size); sum and idx are hash keys,
  -- so #inst still returns the window size
  local inst = {
    sum = 0,
    idx = 1
  }
  for i = 1, size do
    inst[i] = 0
  end
  return inst
end

function RunningAverage.reset(self)
  -- zero the stored samples as well, otherwise later add_sample calls
  -- would subtract stale values from a zeroed sum
  for i = 1, #self do
    self[i] = 0
  end
  self.sum = 0
  self.idx = 1
end

function RunningAverage.add_sample(self, value)
  -- subtract the old sample and add the new sample
  self.sum = self.sum - self[self.idx] + value
  self[self.idx] = value
  -- point to the next sample
  self.idx = 1 + (self.idx % #self)
end

function RunningAverage.get_sum(self)
  return self.sum
end

function RunningAverage.get_average(self)
  return self.sum / #self
end

return RunningAverage
```

I'll see if I can clean up the script and post it tonight.
That running average code is really clever; I never thought of doing it like that. I like the queue processing algorithm too.
I uploaded the files here:
Servicing items only when they need it can save a bunch of CPU time.
The Item Network has a unified service queue. It would pull off and service 20 items per tick.
That scales really well, but the service rate gets a bit slow: 2000 entities at 120 entities per second => 16.6 seconds between services.
It was observed that most items don't need to be serviced that often. We considered using multiple queues to separate out high-priority items from low-priority.
My fork added a priority-based queue, with 32 queues that span 20 ticks each. Servicing 20 entities per tick means 400 items per queue, or 12,800 entities, before it can no longer keep up. In my game runs, I didn't get near that limit, so I deemed that 'good enough'.
The max time between services was ~10.6 seconds (32 * 20 / 60).
Most items don't need to be serviced more often than every 10 seconds.
Some could go even longer. Some need to be serviced every few seconds.
The entity queue is a simple circular queue of tables, each with a 'deadline'. Each entity has a priority and the bucket is determined by the priority. If the current table index is 1, then priority=0 would put it at index 2, priority=1 is at index 3, etc.
Entities are pulled from the current queue until empty AND the deadline has been reached.
Then it goes to the next queue index.
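In code, the bucket selection and advance rule I'm describing would look roughly like this (my own illustration; the names are made up, not my fork's actual code):

```lua
-- Hypothetical sketch of the priority-bucket queue described above.
local NUM_QUEUES = 32  -- circular array of buckets, each spanning 20 ticks

-- priority 0 lands in the bucket right after the current one, priority 1 in
-- the one after that, and so on (wrapping around the circular array)
local function bucket_for(cur_index, priority)
  return ((cur_index - 1 + priority + 1) % NUM_QUEUES) + 1
end

-- the current bucket is drained until it is empty AND its deadline has been
-- reached, then processing moves on to the next bucket
local function should_advance(q, now)
  return #q.buckets[q.cur_index] == 0 and now >= q.deadlines[q.cur_index]
end
```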
Pretty simple stuff, but I had trouble fine-tuning the priorities.
I ended up adjusting the priority up or down by 1 depending on the state. It would also peg the priority to the max in some instances.
For example, if a provider chest was full at service time and was empty after service, the priority would be increased (-1), as it is filling faster than it could be emptied. If the chest was over half-full and then emptied, then the priority would stay the same. If it was less than half-full, then the priority would be lowered (+1). If it couldn't be emptied (global item limit), then the priority would go to the lowest value.
Assemblers were treated in a similar manner. Ingredients were calculated based on the elapsed time since the last service. The priority was only increased if the input ingredients were maxed out and still ran out before the next service. Likewise with the output: if it was full, the priority would increase so that the assembler is serviced more often.
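As a rough illustration of the provider-chest rule above (hypothetical helper; the constants and field names are assumptions, not my fork's actual code):

```lua
-- Hypothetical sketch of the provider-chest priority adjustment.
local PRIORITY_MIN = 0   -- serviced soonest
local PRIORITY_MAX = 31  -- serviced latest

local function adjust_chest_priority(priority, was_full, fill_ratio, hit_item_limit)
  if hit_item_limit then
    -- couldn't be emptied at all (global item limit): push to the slowest bucket
    return PRIORITY_MAX
  elseif was_full then
    -- filling faster than it can be emptied: service sooner
    return math.max(PRIORITY_MIN, priority - 1)
  elseif fill_ratio < 0.5 then
    -- plenty of headroom: service later
    return math.min(PRIORITY_MAX, priority + 1)
  end
  -- over half full but not full: keep the current rate
  return priority
end
```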
Since then, I changed the Service Queue again to be a tick-based deadline queue. When an entity is queued, it will be serviced on the requested tick, or later if the queue is overloaded.
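Queueing into such a tick-based deadline queue might look roughly like this (a sketch under my own naming assumptions, not the gist's actual API):

```lua
-- Hypothetical sketch of enqueueing into a tick-based deadline queue; the
-- bucket layout and names (TICKS_PER_BUCKET, dlq.buckets, dlq.cur_index) are assumptions.
local TICKS_PER_BUCKET = 20

local function queue_entity(dlq, entity_id, data, deadline_tick)
  -- never schedule into the bucket currently being drained
  local ticks_out = math.max(deadline_tick - game.tick, TICKS_PER_BUCKET)
  local offset = math.floor(ticks_out / TICKS_PER_BUCKET)
  -- clamp so a very distant deadline doesn't wrap into an earlier bucket
  offset = math.min(offset, #dlq.buckets - 1)
  local index = ((dlq.cur_index - 1 + offset) % #dlq.buckets) + 1
  dlq.buckets[index][entity_id] = data
end
```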
Anyway, this is getting a bit long, but ARR is using a bunch of independent service queues. It doesn't appear to vary the service rate. The net effect is that this will run into performance issues when there are too many entities.
I'll probably end up ripping out the current queues and replacing them with a unified queue at some point in my fork. That change will break any hope of keeping my fork in sync. Same thing happened with my fork of Item Network. My changes were too much to merge back upstream.
Anyway, so my question is: would a unified deadline-based service queue be something of interest for this mod?
We'd have to:
If interested, I'll try to implement it in small reviewable chunks. The various queues will be merged one at a time.