
Right Way to End a Process in Sandbox #1432

Closed
it-fm opened this issue Aug 19, 2019 · 9 comments

it-fm commented Aug 19, 2019

I would like to know how to terminate the process of an in-progress job from another server. I have an Express route that receives the id of a job in progress; is there any official (or even temporary) way to stop the job?
I really need this capability, because my jobs are resource-intensive, and I have no reason to keep a 5-10 minute job running after the client cancels it.

Server 1 - API Client:

let job = await queue.getJob(8)

job.discard();
await job.moveToFailed(new Error('cancelled'), true)

Server 2 - Jobs:

queue.process('Job', 2, __dirname + '/job.js')

queue.on('global:failed', function (jobId, err) {
  queue.getJob(jobId).then(function (job) {
    // how to kill?
    console.log(job.queue.childPool)
  })
})

I have some questions:

1 - Is using a global event to capture the cancelled job correct? Won't it cause failures when multiple job instances are running?
2 - How do I cancel the job using job.queue.childPool? I tried using process.kill, but it kills the whole task list.

Contributor

stansv commented Aug 20, 2019

I think the best solution is to use ioredis directly to check a flag stored in Redis under your own key from inside the job itself. For example, if you have a long-running loop in your job, query Redis on each iteration.

UPD: if moveToFailed really works even when the job is active, you can try to create a Queue instance in job.js and listen for the global failed event there. Then update a global variable which you periodically check in your job code.

UPD2: I just checked the moveToFailed implementation; indeed, it should work.

Author

it-fm commented Aug 20, 2019

And if I have multiple processes running job.js, how can I end a specific process by listening to a global event?
If I try child.kill or process.kill, I get the following error: Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed

Contributor

stansv commented Aug 20, 2019

I mean you should not kill any processes. Instead, check the status from inside your job code and complete the job by returning from the job callback (or calling done()).

Author

it-fm commented Aug 20, 2019

But how can I check the status of the job from inside job.js? Even after I fail it, the status looks the same from inside, so I have no way of knowing that the job has already been cancelled.

job.js

module.exports = async function (job, done) {
  try {
    let file = job.data.file
    let result = await task3d.run(file) // renamed: "3d" is not a valid JS identifier
    return Promise.resolve(result)
  } catch (err) {
    return Promise.reject(err)
  }
}

Another thing: I tried a test calling done() from a setTimeout, but nothing happens; only done(new Error('canceled')) has any effect, and then the job is restarted, even though moveToFailed was called earlier.

job.js

module.exports = async function (job, done) {
  try {
    let file = job.data.file

    setTimeout(function () {
      done(new Error('error'));
    }, 3000)

    let result = await task3d.run(file) // renamed: "3d" is not a valid JS identifier
    return Promise.resolve(result)
  } catch (err) {
    return Promise.reject(err)
  }
}

Contributor

stansv commented Aug 20, 2019

Did you try to import Bull and create a Queue instance in job.js? Then, as I suggested, you can try listening for the global events. I know it takes more Redis connections, but this solution should be easy to implement.
You should complete the job, not fail it, if you want to cancel execution (Bull allows an undefined result).

Author

it-fm commented Aug 20, 2019

Thanks for the help. Is there any way to capture the queue event without re-creating the queue?

Contributor

stansv commented Aug 20, 2019

I have an idea, but I'm not sure it will work.

Node.js provides event-driven IPC via the child_process module, which Bull uses. You can try to get all retained subprocesses from queue.childPool (an object defined in child-pool.js) and send a command to each, as is done in sandbox.js. In your job code, attach a listener to the process object, as done in master.js. And again, set a global flag to finish the job gracefully; do not try to kill the process, since child processes are reused by Bull.
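A sketch of that IPC route. `queue.childPool` and its `retained` map are Bull internals rather than documented API, so verify their shape against your Bull version; the `cmd: 'cancel'` message is our own convention, assumed to be ignored by Bull's own message handler:

```javascript
// Parent side: forward a custom cancel message to every retained sandbox child.
const buildCancelMsg = (jobId) => ({ cmd: 'cancel', jobId: String(jobId) });

function broadcastCancel(queue, jobId) {
  const pool = queue.childPool || {};
  for (const child of Object.values(pool.retained || {})) {
    if (child.connected) child.send(buildCancelMsg(jobId)); // skip closed IPC channels
  }
}
// e.g. queue.on('global:failed', (jobId) => broadcastCancel(queue, jobId));

// Child side (inside job.js): set a flag instead of exiting, since Bull
// reuses sandbox processes; the job loop checks cancelledJobId periodically.
let cancelledJobId = null;
process.on('message', (msg) => {
  if (msg && msg.cmd === 'cancel') cancelledJobId = msg.jobId;
});
```

Guarding on `child.connected` avoids the ERR_IPC_CHANNEL_CLOSED error mentioned above when a channel has already shut down.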

Member

manast commented Aug 20, 2019

One thing: although child processes are reused, if you kill one of them, a new one will be created automatically when needed, so it just incurs a slight performance hit (due to the time needed to spawn a new process).


rivernews commented Mar 7, 2020

@stansv

Did you try to import Bull and create a Queue instance in job.js? Then, as I suggested, you can try listening for the global events. I know it takes more Redis connections, but this solution should be easy to implement.
You should complete the job, not fail it, if you want to cancel execution (Bull allows an undefined result).

I see you mentioned that re-creating the queue instance in the sandbox takes more Redis connections. One thing I want to clarify: is re-creating a queue of the same name in the sandbox a common practice when you need access to the queue in the sandbox process? Or is there another way Bull officially recommends?

A use case is dispatching (adding) jobs from the sandbox process, like in this issue. I think #714 ends up doing the same thing, since sandbox processes load queues.js for themselves and thus re-create the queues there.

I guess it surprised me a bit that queues need to be re-created, but after thinking of the sandbox process as an entirely new process separate from the master process, everything makes sense now.

There's also a caveat when using a sandboxed process: if you try to access job.queue in the sandbox process to look up the queue information, the .queue property will not be available.

For "it would take more Redis connections", I ran an experiment and verified the additional Redis connections created by re-creating queues in the sandbox process. I have three queues in the master process (my Express server), and when I start with no jobs running yet, I can see 6 client connections via redis-cli client list:

8179
8180
8176
8177
8178
8181

When I hit my endpoint, which adds a job, the client list count adds up to 12: the original 6, plus 4 (from the 2 queues re-created in the job), plus 2 (from the 2 pub/sub clients created in the job):

8185
8177
8178
8181
8179
8186
8187
8180
8188
8189
8176
8184

Which matches the expected result.

One remaining problem is with the bull-board UI I'm using. Before I started my job, the UI showed 15 connected clients and 3 blocked clients. After the job activated, where the queues are re-created, the UI still shows 15 connected / 3 blocked. But I think that's a bull-board issue, so I won't go further into it here.
