Add ability to "pause" or disable tasks (recurring) #27

Open
kagkarlsson opened this issue Apr 20, 2018 · 15 comments

Comments

@kagkarlsson
Owner

kagkarlsson commented Apr 20, 2018

Add a method SchedulerClient.pause(..) to enable temporarily pausing the execution, i.e. not run when due.

Consider adding status/state field to facilitate this. Candidates:

  • active
  • stopped/paused/inactive
  • dead-letter (for manual followups or delayed error-handling)
  • done (for history, a form of "recently ran"-log)

Additional use-cases this enables:

  • Keeping a history of completed executions (i.e. do not delete on completion)
  • DLQ (end-state after a certain number of retries)
  • Log of recent work
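
As a rough illustration of the state idea, the candidates above could be modelled as an enum with one rule that decides whether a due execution is actually run. This is only a sketch: `ExecutionState` and `shouldRun` are hypothetical names, not part of db-scheduler.

```java
import java.time.Instant;

/** Hypothetical state values taken from the candidate list above; not db-scheduler API. */
enum ExecutionState {
    ACTIVE,       // eligible to run when due
    PAUSED,       // temporarily skipped, but kept in the table
    DEAD_LETTER,  // parked for manual follow-up after repeated failures
    DONE;         // kept only as history

    /** Only ACTIVE executions run, and only once their execution time has been reached. */
    boolean shouldRun(Instant executionTime, Instant now) {
        return this == ACTIVE && !executionTime.isAfter(now);
    }
}
```

With a single state column, pausing becomes a state change rather than a delete, so the row and its history stay in place.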
@codewithrajesh

Hi @kagkarlsson, how is a recurring task currently stopped?

@kagkarlsson
Owner Author

If you deploy a new version of the code where the Scheduler is not started with knowledge of that task-type, it will be removed after 14 days (default value), and it will not be executed in the meantime.

Or are you talking about more temporary solutions?

@codewithrajesh

Apologies for the late response.
Actually, I have two questions now.

  1. Say the scheduler is started with the task-name & implementation mapping, but I want to stop future occurrences of the schedule. How can we achieve that?
  2. In another scenario, say I have scheduled a job (one-time or recurring) and the task-name & implementation mapping is also in place (before scheduler start), but after some time the scheduler instance is killed because of some issue. On restart, will the scheduler be able to pick up the task that was scheduled to run in the future?

@kagkarlsson
Owner Author

kagkarlsson commented May 19, 2020

  1. If you don't want to redeploy a version of your app where the task is removed (the easy way, as mentioned above), you can use Scheduler.cancel(TaskInstanceId.of("your-task-name", "recurring")) for recurring tasks. For non-recurring tasks, you need to use the correct instance-id. Note that a cancelled recurring task may be added again if an instance that still has it under startTasks(...) is restarted.

  2. Yes, the future executions are persistent; they live in the database table scheduled_tasks. However, if the scheduler is killed (non-gracefully) while it is running an execution, the execution will not be unlocked until some scheduler instance considers it to be "dead". You can read about that under Dead executions. The default behavior is that "dead" executions get rescheduled to a new time, but that can be overridden in the builder.
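
To make the "dead" notion in point 2 a bit more concrete, here is a simplified model of heartbeat-based detection. It is not db-scheduler's actual code: `DeadExecutionCheck` is a hypothetical name, and the `maxSilence` threshold is an assumed parameter rather than a documented default.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative only: a simplified model of heartbeat-based dead-execution detection. */
final class DeadExecutionCheck {
    private final Duration maxSilence; // assumed threshold, not a db-scheduler default

    DeadExecutionCheck(Duration maxSilence) {
        this.maxSilence = maxSilence;
    }

    /** A picked execution counts as dead once its last heartbeat is older than the threshold. */
    boolean isDead(boolean picked, Instant lastHeartbeat, Instant now) {
        if (!picked || lastHeartbeat == null) {
            return false; // not currently locked, or no heartbeat recorded to compare against
        }
        return Duration.between(lastHeartbeat, now).compareTo(maxSilence) > 0;
    }
}
```

An execution that was being run when the process was killed stops receiving heartbeats, so after the threshold some other instance can treat it as dead and reschedule it.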

@codewithrajesh

Thanks for the response.
For us, the task is an I/O-bound operation. Basically, it does a POST request at a given time to a given endpoint. I don't see that information (the endpoint) stored in the "scheduled_tasks" table.
Maybe we need to make use of the "task_data" field?

@kagkarlsson
Owner Author

Yes, if you have some context data at scheduling-time that you want the task to have access to at execution-time, you need to either encode it in the instance-id string or use custom task-data.
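
A minimal sketch of what such custom task-data could look like for the POST use-case, assuming the data is stored with plain Java serialization (check which serializer your db-scheduler version uses). `PostRequestData` and `TaskDataCodec` are hypothetical names used only to show the round trip; they are not part of the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical payload: everything the task needs at execution time. */
final class PostRequestData implements Serializable {
    private static final long serialVersionUID = 1L;

    final String endpoint;
    final String body;

    PostRequestData(String endpoint, String body) {
        this.endpoint = endpoint;
        this.body = body;
    }
}

/** Plain Java serialization round trip, standing in for what happens to task_data. */
final class TaskDataCodec {
    static byte[] serialize(Serializable data) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
            out.flush(); // push buffered object data into the byte array before reading it
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static <T> T deserialize(byte[] raw, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return type.cast(in.readObject());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Task data class is missing", e);
        }
    }
}
```

The point is that the endpoint travels with the scheduled execution, so whichever instance picks it up later can read it back without any extra lookup.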

@setraj

setraj commented May 22, 2020

Thanks, I was able to persist the task data.
But I am still not clear on how the scheduler will pick up the future tasks on startup. I keep getting this error:
"Failed to find implementation for task with name "task-name". Execution will be excluded from due. Either delete the execution from the database, or add an implementation for it. The scheduler may be configured to automatically delete unresolved tasks after a certain period of time"
I see here you say that we need to provide the mapping with .create(dataSource, myAdhocTask). Do I need to recreate the myAdhocTask object again somehow and map it to the task-name? Or am I missing something here?

@runeflobakk
Contributor

runeflobakk commented Feb 17, 2021

I was thinking about something like this:
Support a "kill switch" in the scheduled_tasks table using a column called, say, enabled, of boolean type. The value of this column would then be inspected to decide whether a task should actually be executed at the time it is due. This avoids any interference with mechanisms for handling dead executions and such.

Setting enabled = false would stop any future executions of the specific task from taking place. Executions that are ongoing when the flag is set to false would finish as usual, and rescheduling (if applicable) would happen as usual. But the task would not be executed again until enabled = true.

This way, your application and task implementations would not necessarily have to be aware of a "pause" feature (but they could be, if they needed to, either through support in SchedulerClient or by updating the scheduled_tasks table directly), and you would still have the opportunity to pause further task execution by manually updating the table using your favourite RDBMS client.

In addition (and this is obviously further expanding the scope), db-scheduler could perhaps inspect the scheduled_tasks table on startup to determine whether the enabled column is present, and adapt its queries accordingly. If the column is not present, the feature will not be supported, and tasks are executed as they are today without evaluating an enabled flag. More importantly, it would eliminate the need for a new major version to add support for such a "persistent kill-switch".

Edit: maybe a paused-column is more appropriate, depending on how you prefer to talk about it as a feature.
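
The startup inspection could be sketched roughly as follows, using only standard JDBC metadata. This is an illustration under assumptions, not db-scheduler code: `EnabledColumnSupport` is a hypothetical name, and the "due" query shown is a simplified stand-in for whatever the library really runs.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper: detect the optional "enabled" column and adapt the due-query. */
final class EnabledColumnSupport {

    /** Looks for scheduled_tasks.enabled through standard JDBC metadata. */
    static boolean hasEnabledColumn(Connection connection) throws SQLException {
        DatabaseMetaData metaData = connection.getMetaData();
        // Unquoted identifiers are stored in a database-specific case, so try both common forms.
        String[][] candidates = {{"scheduled_tasks", "enabled"}, {"SCHEDULED_TASKS", "ENABLED"}};
        for (String[] names : candidates) {
            try (ResultSet columns = metaData.getColumns(null, null, names[0], names[1])) {
                if (columns.next()) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Simplified query for due executions; the caller binds true to the extra parameter. */
    static String dueQuery(boolean enabledColumnPresent) {
        String sql = "SELECT * FROM scheduled_tasks WHERE picked = ? AND execution_time <= ?";
        if (enabledColumnPresent) {
            sql += " AND enabled = ?";
        }
        return sql + " ORDER BY execution_time ASC";
    }
}
```

Because the filter is only appended when the column exists, existing installations without the column would keep running the same query as before.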

@kagkarlsson
Owner Author

This enhancement would likely also enable this feature-request: #216

@kagkarlsson
Owner Author

We might want to consider using something like state = (active, stopped, dead-letter) or something like that to enable more features without requiring new fields.

@prem-cse

prem-cse commented Jan 7, 2024

@kagkarlsson Are there any plans to support the pause/resume feature?

@kagkarlsson
Owner Author

@prem-cse Yes, but I cannot say when

@prem-cse

@kagkarlsson Is anybody working on it? If not, then I would like to implement this as explained by @runeflobakk.

@runeflobakk
Contributor

runeflobakk commented Jan 16, 2024

My original suggestion may be too specific for implementing as-is in db-scheduler, as I primarily use a certain subset of its features.

@prem-cse It is possible to implement such a facility "outside db-scheduler", and I am happy to share how I have done it. The implementation does not change anything in db-scheduler, is fully separate from db-scheduler internals, and only uses its official API.

We currently use the VoidExecutionHandler for most, if not all, of our tasks, which are configured to run recurring at given intervals, but it should be possible to adapt the approach outlined below to the other execution handler types as well.

I have a separate table which acts as an "extension" to the standard scheduled_tasks table (which is unchanged). You can "connect" its rows to scheduled_tasks however you want; we just use "task_name" to link each row to a db-scheduler task:

         Table "scheduled_tasks_ext"
┌───────────┬─────────┬───────────┬──────────┬─────────┐
│  Column   │  Type   │ Collation │ Nullable │ Default │
├───────────┼─────────┼───────────┼──────────┼─────────┤
│ task_name │ text    │           │ not null │         │
│ enabled   │ boolean │           │ not null │ true    │
└───────────┴─────────┴───────────┴──────────┴─────────┘

So, to make each execution triggered by db-scheduler "aware" of this added "enabled" flag, I have created an implementation of VoidExecutionHandler which wraps and delegates to another VoidExecutionHandler. The wrapping handler looks like this:

public class EnableSupportedVoidExecutionHandler<T> implements VoidExecutionHandler<T> {

    private static final Logger LOG = LoggerFactory.getLogger(EnableSupportedVoidExecutionHandler.class);

    private final TaskExecutionEnabler enabler;
    private final VoidExecutionHandler<T> delegatedHandler;
    private final AtomicBoolean loggedSkippedExecution = new AtomicBoolean(false);

    public EnableSupportedVoidExecutionHandler(TaskExecutionEnabler enabler, VoidExecutionHandler<T> delegatedHandler) {
        this.enabler = enabler;
        this.delegatedHandler = delegatedHandler;
    }

    @Override
    public void execute(TaskInstance<T> taskInstance, ExecutionContext executionContext) {
        if (enabler.isExecutionEnabled()) {
            loggedSkippedExecution.set(false);
            delegatedHandler.execute(taskInstance, executionContext);
        } else if (loggedSkippedExecution.compareAndSet(false, true)) {
            LOG.info("Skipping execution of task '{}' ({}), because it is disabled", taskInstance.getTaskName(), taskInstance.getId());
        }
    }
}

It contains the delegated execution handler and an instance of TaskExecutionEnabler, whose only responsibility is to decide on each execution whether to proceed or not. It also has some facilities to log a skipped execution only once per disabled period (the flag is reset as soon as the task runs again), but that is not important.

The TaskExecutionEnabler is a functional interface:

@FunctionalInterface
public interface TaskExecutionEnabler {
    boolean isExecutionEnabled();
}

We have a class to interface with the scheduled_tasks_ext table:

public class ScheduledTasksExtension {

    private final DataSource dataSource;

    public ScheduledTasksExtension(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public boolean isTaskEnabled(String taskName) {
        String sql =
            """
            SELECT enabled FROM scheduled_tasks_ext
            WHERE task_name = ?
            """;
        try (Connection connection = dataSource.getConnection(); PreparedStatement stmt = connection.prepareStatement(sql)) {
            stmt.setString(1, taskName);
            try (ResultSet resultSet = stmt.executeQuery()) {
                if (resultSet.next()) {
                    return resultSet.getBoolean("enabled");
                } else {
                    return true; // nothing explicitly configured, defaults to enabled: true
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException(
                    "Unable to determine if task '" + taskName + "' is enabled, " +
                    "because " + e.getClass().getSimpleName() + ": '" + e.getMessage() + "'", e);
        }
    }

    public TaskExecutionEnabler createTaskEnablerFor(String taskName) {
        return () -> isTaskEnabled(taskName);
    }
}

So to tie all this together, here is an example for a made-up execution handler that sends emails which are due:

// typically a singleton, e.g. a Spring bean
ScheduledTasksExtension scheduledTasksExtension = new ScheduledTasksExtension(dataSource); 

// per task
SendEmailsExecutionHandler sendEmails = new SendEmailsExecutionHandler();
String taskName = "send-emails-task";
VoidExecutionHandler<Void> sendEmailsWithEnableSupport = new EnableSupportedVoidExecutionHandler<>(
    () -> scheduledTasksExtension.isTaskEnabled(taskName), sendEmails);
// the schedule is passed to Tasks.recurring(..), and the handler to execute(..) on the returned builder
RecurringTask<?> sendEmailsTask = Tasks.recurring(taskName, Schedules.fixedDelay(Duration.ofMinutes(1)))
    .execute(sendEmailsWithEnableSupport);

If we want to disable a recurring task, we just insert a new row:

INSERT INTO scheduled_tasks_ext (task_name, enabled) VALUES ('send-emails-task', false);

To reenable it, we can flip the flag to true, or just remove the row, as the default is always that the task is enabled.

Example query (PostgreSQL) to list scheduled tasks together with their enabled flag (NULL means there is no row for the task, i.e. it is enabled by default):

SELECT task_name, execution_time, enabled FROM scheduled_tasks
LEFT JOIN scheduled_tasks_ext USING (task_name);

So to sum up, this is realized with a small database table, two classes, and one functional interface, and every execution handler of ours needs to be wrapped in the EnableSupportedVoidExecutionHandler when instantiated.

It is a bit of code, so there may be a typo here and there, and perhaps some details left out, but this may get you and anyone else going, should you want to build support for this yourselves outside db-scheduler.

The extra query which is performed for each execution triggered by db-scheduler obviously comes with a small performance hit, but for our cases this is negligible. So this is obviously not a suggestion on how first-class support for this could be implemented in db-scheduler, but to show that it is possible to implement this separately using only the APIs already offered by db-scheduler. We have had this mechanism running in production for 3 years with no problems. The way it is implemented should also make it easy to remove, should support for this be part of db-scheduler at a later time.
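
If the extra query per execution ever did become a concern, one option would be to cache the flag for a short time. The sketch below is an assumption on top of the approach above, not something we run: `CachingTaskExecutionEnabler` is a hypothetical decorator, the interface is redeclared only to keep the snippet self-contained, and the clock is injected so the behaviour can be tested.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Same shape as the functional interface shown earlier, redeclared for self-containment. */
@FunctionalInterface
interface TaskExecutionEnabler {
    boolean isExecutionEnabled();
}

/** Hypothetical decorator that asks the delegate at most once per time-to-live. */
final class CachingTaskExecutionEnabler implements TaskExecutionEnabler {
    private final TaskExecutionEnabler delegate;
    private final long ttlNanos;
    private final LongSupplier nanoClock; // e.g. System::nanoTime in production
    private boolean hasValue;
    private boolean cachedValue;
    private long fetchedAtNanos;

    CachingTaskExecutionEnabler(TaskExecutionEnabler delegate, Duration ttl, LongSupplier nanoClock) {
        this.delegate = delegate;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized boolean isExecutionEnabled() {
        long now = nanoClock.getAsLong();
        // Refresh on first use, or when the cached answer is older than the time-to-live.
        if (!hasValue || now - fetchedAtNanos >= ttlNanos) {
            cachedValue = delegate.isExecutionEnabled();
            fetchedAtNanos = now;
            hasValue = true;
        }
        return cachedValue;
    }
}
```

The trade-off is that flipping the flag in the database can take up to one time-to-live to be noticed, so this only makes sense for tasks that run far more often than you pause them.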

@KangoV

KangoV commented Apr 12, 2024

Just wondering on the status of this issue. For our purposes, we just need to stop a task instance from being executed at the normal time (disabled) and then to enable it again. I'm using the scheduler to pull in cyber security feeds. If we see an issue with a feed, we would need to temporarily pause to check it and then enable it again. This way all the task history is kept intact.
