Cluster missing executions #71
Comments
Weird one. Could you please enable the debug flag and send me the logs from the three nodes? Thanks!
I found an interesting part in the logs: the firewall configuration for that cluster had no rule allowing connections on port 6868 (RPC). After opening that port, I get reports for all runs so far. However, there are still jobs assigned to a machine that has left the cluster (status
Logs from the current leader after its takeover from the node that left the cluster at 13:37:
Logs from 15:00:
Both job executions vanished because they were sent to machines that had already left the cluster (replaced by the autoscaling group). I think this may be the case because in
@Luzifer both things make sense.
This is necessary for the clients to send back execution reports of a tolerable size, as serf doesn't support large messages (yet). I'll add a note to the docs. Regarding the executions that vanished due to autoscaling, this is a good catch! It is indeed a bug: I missed the status when getting the cluster members. Expect a hotfix release in the following days.
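For anyone hitting the same firewall problem, a quick reachability check can be sketched as below. This is only a minimal sketch, assuming the RPC port (6868 in this thread) accepts plain TCP connections; the function name `rpcReachable` and the two-second timeout are made up for illustration and are not part of dkron.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// dialTimeout is an arbitrary illustrative value, not a dkron setting.
const dialTimeout = 2 * time.Second

// rpcReachable reports whether a TCP connection to addr ("host:port")
// can be opened within dialTimeout. Any error (blocked port, refused
// connection, malformed address) counts as unreachable.
func rpcReachable(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, dialTimeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Addresses are taken from the command line, one "host:port" per argument.
	for _, addr := range os.Args[1:] {
		fmt.Printf("%s reachable: %v\n", addr, rpcReachable(addr))
	}
}
```

Run it from each node against the other two, passing their addresses with port 6868. A `false` result for a node that is otherwise up points at a firewall or security-group rule.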
Currently the node filtering includes nodes that could potentially be gone. Fixes #71
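The idea behind the fix can be sketched roughly as follows. This is not dkron's actual code: `Member`, `MemberStatus` and `eligibleNodes` are simplified, hypothetical stand-ins for the member list a gossip library such as serf exposes, and the sketch only shows that the member status is checked in addition to the tags.

```go
package main

import "fmt"

// MemberStatus is a hypothetical stand-in for a cluster member's state.
type MemberStatus int

const (
	StatusNone MemberStatus = iota
	StatusAlive
	StatusLeaving
	StatusLeft
	StatusFailed
)

// Member is a simplified, hypothetical view of a cluster member.
type Member struct {
	Name   string
	Status MemberStatus
	Tags   map[string]string
}

// hasTags reports whether m carries every key/value pair in wanted.
func hasTags(m Member, wanted map[string]string) bool {
	for k, v := range wanted {
		if got, ok := m.Tags[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// eligibleNodes returns only members that are alive AND match the tags.
// Skipping the status check is what lets jobs land on departed nodes.
func eligibleNodes(members []Member, wanted map[string]string) []Member {
	var out []Member
	for _, m := range members {
		if m.Status != StatusAlive {
			continue // left, leaving or failed: never schedule here
		}
		if hasTags(m, wanted) {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	members := []Member{
		{Name: "node-a", Status: StatusAlive, Tags: map[string]string{"role": "executor"}},
		{Name: "node-b", Status: StatusLeft, Tags: map[string]string{"role": "executor"}},
		{Name: "node-c", Status: StatusAlive, Tags: map[string]string{"role": "other"}},
	}
	for _, m := range eligibleNodes(members, map[string]string{"role": "executor"}) {
		fmt.Println(m.Name)
	}
}
```

In this example only node-a is selected: node-b has the right tag but has left the cluster, and node-c is alive but carries a different role.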
My setup:
3 AWS EC2 instances running CoreOS stable, all three part of an etcd cluster and all three running dkron 0.6.3 as a cluster. All machines have the role: executor tag and the following cron:
Expected result:
The cron is executed every 5 minutes and succeeds every single time, as it just calls /usr/bin/date (which is the location of the date utility on CoreOS).
Observed result:
The task is sent to one of the machines every 5 minutes and eventually gets executed. However, the cron is missing several executions: they have no end date, there is no running date utility on the machine, and they end up in a failed state.
Question
Can you help me with this / tell me what I'm doing wrong? Is there anything I can do to prevent this?