Shortcircuit workflows #65

Open
linkyndy opened this issue Feb 8, 2019 · 16 comments

Comments

@linkyndy

linkyndy commented Feb 8, 2019

Let's say we have the following, simple, workflow:

A -> B -> C

Job A checks a value and, if it is false, the whole workflow should stop immediately. Is this currently possible with Gush?

As an example, let's say we process a new user signup with the above workflow. Job A retrieves the newest user from the database. If the user has already been processed, we'd like to call the workflow complete; the subsequent steps are not necessary anymore, since the user has already been processed.

@pokonski
Contributor

pokonski commented Feb 8, 2019

Hey @linkyndy! Currently it's not possible. There was a suggestion about branching workflows at run time depending on their result, but I haven't had much time to tinker with a solution for it.

Right now you could spawn a new Workflow inside A if conditions are met, but it's not a nice solution IMO
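
A minimal sketch of the workaround mentioned above, for readers who want to see its shape. It uses plain-Ruby stand-ins rather than Gush itself, so it runs without Redis: `User`, `FollowUpWorkflow` and `JobA` are hypothetical names, and `create` / `start!` only imitate the usual way of creating and starting a workflow. The idea is that the original workflow contains only A, and A starts the B -> C part when it is actually needed.

```ruby
# Plain-Ruby stand-ins so the sketch runs on its own; this is NOT Gush.
User = Struct.new(:id, :processed)

# Hypothetical follow-up workflow holding the "B -> C" part of the pipeline.
class FollowUpWorkflow
  attr_reader :ran

  def self.create(user)
    new(user)
  end

  def initialize(user)
    @user = user
    @ran = []
  end

  # Stands in for enqueuing job B and then job C.
  def start!
    @ran = %i[job_b job_c]
    self
  end
end

# Job A decides whether the rest of the pipeline is needed at all.
class JobA
  def initialize(user)
    @user = user
  end

  def perform
    # Short circuit: an already processed user spawns nothing.
    return nil if @user.processed

    FollowUpWorkflow.create(@user).start!
  end
end

fresh_result     = JobA.new(User.new(1, false)).perform
processed_result = JobA.new(User.new(2, true)).perform
```

The drawback is the one already noted: the pipeline is split across two workflows, so no single workflow shows the whole A -> B -> C picture.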

@linkyndy
Author

linkyndy commented Feb 8, 2019

Thanks for your fast answer! Curious to see how this will develop.

@ace-subido

ace-subido commented May 10, 2019

EDIT: I just re-read OP's short-circuit scenario; my original comment is irrelevant and doesn't help at all 😄

@ace-subido

ace-subido commented May 14, 2019

> As an example, let's say we process a new user signup with the above workflow. Job A retrieves the newest user from the database. If the user has already been processed, we'd like to call the workflow complete; the subsequent steps are not necessary anymore, since the user has already been processed.

Hi @linkyndy, I've just reimplemented something along those lines. This is how I went about it:

# example_workflow.rb
class ExampleWorkflow < Gush::Workflow

  def configure(user_id)
    run(FirstJob, params: { user_id: user_id })
    run(SecondJob, {
      params: { user_id: user_id },
      after: FirstJob,
    })
  end

end

# first_job.rb
class FirstJob < Gush::Job

  def perform
    user = User.find params[:user_id]

    if user.processed?
      self.fail!
      return
    end

    # continue code ...
  end

end

However, this marks flow.status as :failed. It does keep SecondJob from running once I call self.fail!, though (I found this after snooping around the source).

@pokonski is this how .fail! could be used? I could put this in the README if you like, would be glad to contribute a PR! 😄

@pokonski
Contributor

I don't think this can be used, because fail! will stop the whole workflow 🤔

@ace-subido

ace-subido commented May 14, 2019

Yep! That's what I was aiming to achieve 😄 The subsequent actions would be skipped. I based this on what @linkyndy said:

> If the user has already been processed, we'd like to call the workflow complete; the subsequent steps are not necessary anymore, since the user has already been processed.

I figured that's what they're aiming for too.

> there was a suggestion about branching workflows at run time depending on their result, but I haven't had much time to tinker with a solution for it.

What would this look like? Could I take a stab at this, with just pseudo-code and what the usage would look like? I'll think something up later today, some sort of proposal for this.

@pokonski
Contributor

pokonski commented May 14, 2019

One of the good suggestions was to allow providing two paths, like so:

    run SomeJob,
        before_success: MainJob,
        before_failure: AlternativeJob

Though the naming is rather unfortunate because it suggests that MainJob will run before SomeJob is successful, so it needs better naming 💃
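
To make the two-path idea concrete, here is a small plain-Ruby sketch of outcome-based branching. It is a toy runner, not Gush's API: `BranchingRunner` and the option names `on_success:` / `on_failure:` are assumptions chosen only to avoid the "before" ambiguity, and a block returning true or false stands in for a job succeeding or failing.

```ruby
# Toy runner illustrating outcome-based branching; not part of Gush.
class BranchingRunner
  attr_reader :log

  def initialize
    @steps = {}
    @log = []
  end

  # Registers a step. on_success / on_failure are hypothetical option names
  # naming the step to run next for each outcome.
  def run(name, on_success: nil, on_failure: nil, &body)
    @steps[name] = { body: body, on_success: on_success, on_failure: on_failure }
  end

  # Executes `name`, then keeps following the branch that matches each outcome.
  def start(name)
    while name
      step = @steps.fetch(name)
      ok = step[:body].call
      @log << [name, ok ? :succeeded : :failed]
      name = ok ? step[:on_success] : step[:on_failure]
    end
    self
  end
end

runner = BranchingRunner.new
runner.run(:some_job, on_success: :main_job, on_failure: :alternative_job) { false }
runner.run(:main_job) { true }
runner.run(:alternative_job) { true }
runner.start(:some_job)
# runner.log is now [[:some_job, :failed], [:alternative_job, :succeeded]]
```

A real implementation would also have to decide what happens to the branch that was not taken, which is where a skipped state becomes useful.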

@linkyndy
Author

@ace-subido Marking the whole workflow as failed may be too harsh; @pokonski Indeed, it's not really clear from the DSL 😊

@pokonski
Contributor

But coming back to the problem from the original issue by @linkyndy, one way would be to introduce a skipped state in jobs, similar to how GitLab CI works. I would be fine with accepting such a resolution.

@ace-subido

ace-subido commented May 14, 2019

@pokonski so something like this:

class SkippedJob < Gush::Job
  def perform
    # marks the job as 'skipped', this would also do a 'return'
    self.skip! 
  end
end

We could also do something like skip_remaining! which skips all of the other jobs too, marking the entire workflow as :skipped.

class ExampleWorkflow < Gush::Workflow
  def configure(user_id)
    run SkipRemainingJob
    run SecondJob, after: SkipRemainingJob
  end
end

class SkipRemainingJob < Gush::Job
  def perform
    # marks the job as 'skipped' and all other jobs after, marks the workflow itself as 'skipped' too
    self.skip_remaining! 
  end
end

@pokonski What do you think?

@pokonski
Contributor

Yeah this sounds good! One clarification here:

> which skips all of the other jobs too

I assume you mean skip all the jobs that were supposed to run after the job skip_remaining! is called from, right?

@ace-subido

Yeah, probably cascade through job.outgoing in Gush::Worker#enqueue_outgoing_jobs and all its descendants.
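
The cascade described here amounts to a graph traversal over outgoing edges. A rough plain-Ruby sketch follows, assuming the job graph can be viewed as a hash from job name to the names of its outgoing jobs. The `outgoing` hash and the `descendants` helper are illustrative and are not taken from the Gush source.

```ruby
require "set"

# Job graph as name => outgoing job names (an assumed, simplified view).
outgoing = {
  "A" => %w[B],
  "B" => %w[C D],
  "C" => [],
  "D" => [],
  "E" => [] # independent job, not downstream of A
}

# Returns every job reachable from `start` through outgoing edges,
# excluding `start` itself, in breadth-first order.
def descendants(graph, start)
  seen = Set.new
  queue = graph.fetch(start).dup
  until queue.empty?
    name = queue.shift
    next unless seen.add?(name) # add? returns nil if already visited
    queue.concat(graph.fetch(name))
  end
  seen.to_a
end

# Mark the calling job and everything downstream of it as skipped.
states = outgoing.keys.to_h { |name| [name, :pending] }
states["A"] = :skipped
descendants(outgoing, "A").each { |name| states[name] = :skipped }
```

In this sketch job E stays :pending because it is not reachable from A, which matches the clarification that only jobs meant to run after the calling job are affected.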

@linkyndy
Author

linkyndy commented Jun 2, 2019

I don't really see any value in this job.skip!. If I want to "skip" the current job and go to the next one, I can simply return.

My initial question was related to cancelling the entire workflow from that moment on, and dealing somehow with the remaining jobs.

@ace-subido

ace-subido commented Jun 2, 2019

If that's the case, I'll add more to the PR: something like skip_remaining!, which would tag everything from that point on as skipped.

@rajaravivarma-r

This particular feature looks very useful for an application we are developing. I would rather be very explicit and name the methods skip_workflow! for skipping the whole workflow and skip_descendants! for skipping the jobs which should run after the current job.
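
The difference between the two proposed names shows up once a workflow has more than one branch. A toy in-memory sketch, with the caveat that `ToyFlow` is hypothetical and neither method exists in Gush; the semantics below are one possible reading of the suggestion:

```ruby
# Toy in-memory workflow used only to contrast the two proposed methods.
class ToyFlow
  attr_reader :states

  # outgoing: hash of job name => names of jobs that run after it.
  def initialize(outgoing)
    @outgoing = outgoing
    @states = outgoing.keys.to_h { |name| [name, :pending] }
  end

  # Skips only the jobs that depend, directly or transitively, on `name`.
  def skip_descendants!(name)
    @outgoing.fetch(name).each do |child|
      next if @states[child] == :skipped

      @states[child] = :skipped
      skip_descendants!(child)
    end
  end

  # Skips every job that has not finished yet, whether related or not.
  def skip_workflow!
    @states.transform_values! { |state| state == :pending ? :skipped : state }
  end
end

graph = { "A" => ["B"], "B" => [], "X" => ["Y"], "Y" => [] }

partial = ToyFlow.new(graph)
partial.skip_descendants!("A") # only B is skipped; X and Y still run

whole = ToyFlow.new(graph)
whole.states["A"] = :succeeded
whole.skip_workflow!           # B, X and Y are all skipped
```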

@natemontgomery

natemontgomery commented Jul 24, 2024

I have been using this pattern in my own workflows for a bit now and I was thinking it would be good to go over this discussion again and clear up the goals.

For me, having a state change that occurs when you want to stop executing a job and move on in the workflow is useful on its own, i.e. the 'skip' call on a job marks that job 'skipped'.
That state represents a job halting but not failing. Since it could happen anywhere in a job, the state does end up being a bit ambiguous in meaning, depending on your use case.

I use it as an indicator that the job intentionally halted without completing, due to some conditional checks on the state of a set of records, e.g. 'Invoices'. So when a record (e.g. an Invoice) whose processing job is 'skipped' is reprocessed anywhere other than the normal workflow path, that process can tell whether the job finished or was halted intentionally.

This lets me avoid needing to track such a state within an Invoice record, isolating it to the job. This seems to lead to good data management and semantics in my opinion. The job state is tracked enough that there is no need to track additional state in a record processed within that job.

I think this is meaningfully distinct from 'return' in a job, as returning early does not maintain any state information that another process could look at.
Really, there could be cases where you skip a job in one conditional branch and, in another, return early and continue the workflow without any state change.
I haven't used such a pattern but I think it is a reasonable idea.
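
A small plain-Ruby sketch of the distinction drawn above, under stated assumptions: `ToyJob` is a hypothetical wrapper, and its `skip!` is only one way such a method could behave (record a state, then stop the job body). The point is what remains inspectable afterwards.

```ruby
# Toy job wrapper that records a final state other code could inspect.
class ToyJob
  attr_reader :state

  def initialize(&body)
    @body = body
    @state = :pending
  end

  # Hypothetical skip!: records :skipped and stops the job body right away.
  def skip!
    @state = :skipped
    throw :halt
  end

  def run
    catch(:halt) do
      instance_exec(&@body)
      @state = :succeeded # reached only if the body was not halted
    end
    self
  end
end

invoice_already_handled = true

# Halts on purpose and leaves a trace in the job state.
skipped_job = ToyJob.new { skip! if invoice_already_handled }.run

# Returns early from the body; indistinguishable from a normal success.
returned_job = ToyJob.new { next if invoice_already_handled }.run
```

Here `skipped_job.state` ends up as `:skipped` while `returned_job.state` is `:succeeded`, so only the first leaves evidence that the work was deliberately not done.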

I think for this use case a 'skip_remaining' method would not accomplish my needs, but I do think there is still room to do both:
skipping an individual job without halting the whole workflow, and also being able to mark the entirety of the remaining workflow jobs as skipped. I would be glad to push forward some ideas on that also, but this is already a long post.

If that is too long for anyone:
TL;DR There is room for both skipping an individual job and also skipping the rest of a workflow.

Look forward to having some fresh takes on these ideas!


5 participants