Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduler is not triggered or triggered 2 times randomly with native image #17724

Closed
jmpdev34 opened this issue Jun 7, 2021 · 6 comments · Fixed by #17763
Closed

Scheduler is not triggered or triggered 2 times randomly with native image #17724

jmpdev34 opened this issue Jun 7, 2021 · 6 comments · Fixed by #17763
Assignees
Labels
area/scheduler kind/bug Something isn't working
Milestone

Comments

@jmpdev34
Copy link

jmpdev34 commented Jun 7, 2021

Describe the bug

Randomly, scheduler is not triggered or triggered 2 times with native image
This behavior is not observed in "jvm mode"

Expected behavior

Scheduler should trigger 1 time and doesn't miss any trigger event with native image

To Reproduce

https://github.com/jmpdev34/reproducer-scheduler-native

Steps to reproduce the behavior:
The reproducer should schedule every second based on a cron expression.
When the scheduler method is fired, a log with the datetime is printed
If the scheduler triggers 2 times, a log is printed : 'Already triggered ..."

  1. Build a native image of the reproducer project
  2. Start runner
  3. Wait for a "Already triggered' log and verify if a second hasn't been triggered

Note : if none "Already triggered" log appears, try to shutdown et re-start the runner and/or start another activity on the computer (IT tests on a project for example)

Configuration

# Add your application.properties here, if applicable.
reproducer.scheduler.cron=0/1 * * ? * *

Environment (please complete the following information):

Output of uname -a or ver

Linux 344910zx2006 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Output of java -version

openjdk version "11.0.9" 2020-10-20
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.9+11, mixed mode)

GraalVM version (if different from Java)

graalvm-ce-java11-20.1.0

Quarkus version or git rev

1.11.5.Final

Build tool (ie. output of mvnw --version or gradlew --version)

maven version 3.6.3

@jmpdev34 jmpdev34 added the kind/bug Something isn't working label Jun 7, 2021
@quarkus-bot
Copy link

quarkus-bot bot commented Jun 7, 2021

/cc @mkouba

@gsmet
Copy link
Member

gsmet commented Jun 7, 2021

Are you sure it's not just the event being slightly delayed and you have 2 events in the same second because the one before was a bit delayed?

That being said, a one second delay looks like a lot.

@mkouba mkouba self-assigned this Jun 7, 2021
@jmpdev34
Copy link
Author

jmpdev34 commented Jun 7, 2021 via email

@mkouba
Copy link
Contributor

mkouba commented Jun 7, 2021

Well, I was able to reproduce the Already triggered ... behavior, by slowing down my machine significantly. But it's not that the triggers would be fired at once. If you modify this line to something like System.out.println("Triggered at : " + now + " but already triggered : " + maps.get(time)); you'll see that the second trigger is fired > 900ms after the first one:

Triggered at : 2021-06-07T16:40:11.004974
Triggered at : 2021-06-07T16:40:11.999259 but already triggered : 2021-06-07T16:40:11.004974 executor-thread-1 4

That's far from being ideal of course but at least it's fired "roughly every second".

We will need to analyze the scheduling logic and find out if we can do better...

@mkouba
Copy link
Contributor

mkouba commented Jun 7, 2021

Yes I am sure because in production, we observe the same behavior with scheduler witch fire every 15 minutes

That's a more serious problem though. I think that I know where the problem is. I'll try to find a fix tomorrow...

@mkouba
Copy link
Contributor

mkouba commented Jun 8, 2021

TLDR>

This PR should fix the problem, I've replaced the logic used in the CronTrigger. With this fix I was not able to reproduce the problem with interval > 1s (tested the 0/5 * * ? * * cron expression). For 0/1 * * ? * * you may still observe an Already triggered message but that's all right because the timestamp used as the key in the map is the time the ReproducerScheduler.schedule(ScheduledExecution) method was executed which is not necessarily the time the trigger was checked against. In other words, it's OK because the trigger is fired correctly but there's some delay before the @Scheduled method is executed; and there's not much we can do about this.

What caused the problem?

The previous algorithm checked the difference between the last execution time truncated to seconds (e.g. 2021-06-08T07:01:20) and the time the trigger was checked against. And if the diff was < 1s the trigger was fired. However, in some cases () the check passed for two subsequent checks, e.g. 021-06-08T07:01:20.004 and 2021-06-08T07:01:20.999.

The algorithm was changed:

  1. the CronTrigger is now stateful and holds the lastFireTime (just as the IntervalTrigger)
  2. the following condition must be met:
    a. the time the trigger is checked against must be after the previous time extracted from the cron expression
    b. the lastFireTime stored on the trigger must be before the previous time extracted from the cron expression

I believe that this change should be robust enough. Comments and feedback is appreciated...

@mkouba mkouba added this to the 2.1 - main milestone Jun 8, 2021
@gsmet gsmet modified the milestones: 2.1 - main, 1.13.7.Final Jun 8, 2021
gsmet pushed a commit to gsmet/quarkus that referenced this issue Jun 8, 2021
gsmet pushed a commit to gsmet/quarkus that referenced this issue Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/scheduler kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants