bulk queue with timer - how to construct? cargo? timeout? #1490
Comments
I guess I could implement it by using …
The idea of a "throttled queue" has been discussed many times before (#1314), but we have punted on it. It is a surprisingly hard problem -- usually you use something like Redis or a message queue to manage it, and it's hard to do properly in Node. We've decided we don't want to build or support throttling in Async; instead we just offer the concurrency-limited methods. You can approximate a throttle rate by dividing the concurrency by the average I/O response time.
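For example, a concurrency of 10 with an average response time of 100 ms works out to roughly 10 / 0.1 s = 100 tasks per second. A minimal sketch (the 100 ms `setTimeout` stands in for real I/O):

```js
const async = require('async');

// At most 10 workers run at once and each takes ~100 ms,
// so throughput settles around 10 / 0.1 s = 100 tasks/second.
const q = async.queue((task, callback) => {
  setTimeout(() => callback(null), 100); // stand-in for real I/O
}, 10);

for (let i = 0; i < 1000; i++) {
  q.push({ id: i });
}
```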
Also, we are missing a …
I was actually thinking the reverse. I don't want my cargo/queue to process one at a time, but instead batch them up. I am not so much throttling as I am providing a minimum batch size. I am perfectly fine if I pass through 1MM per sec, as long as each `handler` call either gets at least that minimum batch, or fires after a maximum wait with whatever has accumulated.
It sort of looks like this (if ugly):

```js
const tmpQ = [], maxTime = 5000, minCargo = 10;
let lastRun = 0;

const handler = (cargo, callback) => {
  tmpQ.push(...cargo);
  const now = Date.now();
  if (tmpQ.length >= minCargo || now - lastRun > maxTime) {
    // process function here
    // empty the queue
    tmpQ.splice(0);
    // reset time since last run
    lastRun = now;
  }
  callback();
};
```
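This `handler` is then used directly as a cargo worker; a usage sketch (the task shapes are made up):

```js
const async = require('async');

const c = async.cargo(handler);
c.push({ msg: 'hello' });
c.push({ msg: 'world' });
```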
Interesting. So normally …
This animation from the docs explains it best: a cargo with a payload of … There is only ever one worker, but internally it is implemented so there could be many workers. (A cargo is essentially a queue with a concurrency of 1 and a configurable payload size.) But it seems like I misunderstood what you needed -- a buffering cargo with a time limit. A little bit different than a throttled queue. Do you really need the buffering? It seems like a basic cargo would accomplish most of what you need.
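For reference, the two documented shapes (a runnable sketch; the worker bodies are placeholders):

```js
const async = require('async');

// queue: up to `concurrency` workers, each handed ONE task at a time
const q = async.queue((task, callback) => {
  callback(null);
}, 4);

// cargo: ONE worker, handed an ARRAY of up to `payload` tasks at a time
const c = async.cargo((tasks, callback) => {
  console.log(`got a batch of ${tasks.length}`);
  callback(null);
}, 100);
```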
It would, mostly. For now, that is exactly what I used. The difference, as you pointed out, is the buffering. If the worker is ready and only one task is in the queue, then I want it to wait, up to the specified time limit, for more tasks before running. Essentially: flush when the minimum batch size is reached, or when the time limit expires, whichever comes first.
Now that I think of it, you probably could implement this as a wrapper around the worker function.
Back at this. We went with it, and when it went from the test environment to production load, the downstream i/o became an issue. The particular case in point has about 100 tasks/second. The downstream receiver is far more efficient processing a batch of 100 than 5 batches of 20. When I just use a plain cargo, the worker gets whatever has accumulated as soon as it is free, so the batches stay small. Is there any way to create a "buffering cargo" (I like your term) with a minimum buffer size and a max wait to flush?
In the end, my code looks something like this:

```js
const async = require('async');

// Wrap a cargo worker so it only runs once `minBuffer` tasks have
// accumulated, or `timeout` ms have passed, whichever comes first.
const bufferedHandler = (minBuffer, timeout, handler) => {
  let buffer = [];
  let timeoutFn = null;
  return (tasks, callback) => {
    const processor = () => {
      if (timeoutFn !== null) {
        clearTimeout(timeoutFn);
        timeoutFn = null;
      }
      handler(buffer, (err) => {
        buffer = [];
        callback(err);
      });
    };
    buffer = buffer.concat(tasks);
    if (minBuffer < 1 || buffer.length >= minBuffer) {
      processor();
    } else if (timeoutFn === null) {
      timeoutFn = setTimeout(processor, timeout);
    }
  };
};

// set bufferSize and bufferTimeout separately
const cargo = async.cargo(bufferedHandler(bufferSize, bufferTimeout, (tasks, callback) => {
  // do processing here as if a regular queue/cargo handler
  callback();
}));
```

Would like it to be more elegant, but this is what I have.
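Pushing works the same as with any cargo; for instance, assuming `bufferSize` and `bufferTimeout` were set to hypothetical values of 100 and 1000 above:

```js
cargo.push({ record: 1 });
cargo.push({ record: 2 });
// Only two tasks are buffered (fewer than 100), so the inner handler
// fires after ~1 second with both of them.
```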
I would like to 👍 adding the payload option to `queue`. In my use case I'm streaming in data from a file very rapidly and then sending it into a Kinesis stream. The Kinesis stream API prefers bulk item inserts, but I also need more than one request in flight at a time. So I'm nabbing the internal queue and doing:
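Presumably something like the following (a sketch against async v2's internal module layout; `internal/queue` is not a public API, and the exact require path and export shape can change between versions):

```js
// NOT public API: async's internal queue takes both a concurrency
// and a payload. The worker receives an ARRAY of up to `payload`
// tasks, and up to `concurrency` batches run at once.
const queue = require('async/internal/queue').default;

const q = queue((tasks, callback) => {
  console.log(`sending batch of ${tasks.length}`); // stand-in for a bulk PutRecords call
  callback(null);
}, 4 /* concurrency */, 25 /* payload */);

q.push({ data: 'record' });
```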
The implementation is already there; it just needs to be exposed publicly, plus docs and tests.
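For what it's worth, this is what async v3 eventually exposed as `cargoQueue`, which takes both a concurrency and a payload size:

```js
const async = require('async');

// Up to 2 concurrent worker calls, each receiving up to 100 tasks.
const cq = async.cargoQueue((tasks, callback) => {
  console.log(`processing ${tasks.length} tasks`);
  callback(null);
}, 2, 100);

cq.push({ id: 1 });
```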
Closing in favor of #1555
I am trying to figure out how to use `async` to create a "cargo-timeout-queue". Essentially, I am trying to cache and throttle downstream i/o behaviour. Think of it as "cargo with minimum batch size and timeout to override."
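In pseudo-API form (`timeoutCargo` is a hypothetical name for the constructor being asked for, and `maxWaitMs` a hypothetical parameter; neither exists in async):

```js
// handler: (tasks, callback), the batched worker
// 10: minimum batch size
// maxWaitMs: maximum time to wait before flushing a partial batch
const q = timeoutCargo(handler, 10, maxWaitMs);
```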
The above would:

- not call `handler` (first arg) until it has 10 (second arg) items to send together
- if the maximum wait passes first, call `handler` with what we have

This allows me to combine input objects for efficient downstream i/o processing by `handler`, but ensures that it doesn't wait too long. There is a tension between batch size (larger can be better) and delay (I don't want to wait too long).

Is there some way to combine `async` functions to get this behaviour? Or do I need to build it?

Thanks. Once again, `async` is a mainstay in my JS work. :-)