Delay between fleet service submit and actual systemd service start #1362

paveltiunov · 2015-09-26T09:56:07Z

Hi guys!
Great job on CoreOS!
We're using fleet to run high frequency services: there are many starts and stops and they should be performed as fast as possible.
We're experiencing significant delays about 2-3 seconds between submit of a service to fleet and it's actual start with systemd.
Tried to tweak engine_reconcile_interval and no effect so far.
What I see in logs with verbosity=10 and journalctl -f -u fleet.service | grep Reconciler:

Sep 26 11:15:48 core-01 fleetd[2225]: INFO reconciler.go:161: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: allcountjs-app-allcount-tasks.service, MachineID: 22c5d548565e4643b76d2119ab925394, Reason: "target state launched and unit not scheduled"}
Sep 26 11:15:49 core-01 fleetd[2225]: DEBUG reconcile.go:84: Reconciler tick
Sep 26 11:15:50 core-01 fleetd[2225]: DEBUG reconcile.go:84: Reconciler tick
Sep 26 11:15:51 core-01 fleetd[2225]: DEBUG reconcile.go:84: Reconciler tick
Sep 26 11:15:51 core-01 fleetd[2225]: DEBUG reconcile.go:321: AgentReconciler attempting tasks [{LoadUnit unit scheduled here but not loaded %!s...
Sep 26 11:15:51 core-01 fleetd[2225]: INFO reconcile.go:330: AgentReconciler completed task: type=LoadUnit job=allcountjs-app-allcount-tasks.service reason="unit scheduled here but not loaded"
Sep 26 11:15:51 core-01 fleetd[2225]: INFO reconcile.go:330: AgentReconciler completed task: type=ReloadUnitFiles job=N/A reason="always reload unit files"
Sep 26 11:15:51 core-01 fleetd[2225]: INFO reconcile.go:330: AgentReconciler completed task: type=StartUnit job=allcountjs-app-allcount-tasks.service reason="unit currently loaded but desired state is launched"
Sep 26 11:15:51 core-01 fleetd[2225]: DEBUG reconcile.go:61: AgentReconciler completed reconciliation in 158.214102ms

Looks like there is lack of "Reconciler triggered" message for AgentReconciler.
Do you have any suggestions we can start from to solve this problem?
Fleet version is 0.10.2.

The text was updated successfully, but these errors were encountered:

AfterIndex watcher option usage ensures registry event watcher won't miss events that trigger Agent reconciliation. Fixes coreos#1362

paveltiunov · 2015-09-26T23:12:47Z

Fixed problem for us just by adding AfterIndex watcher option.

paveltiunov · 2015-11-21T17:29:21Z

Hi guys!
Any thoughts on it? Would you fix it or it seems to be not critical?

jonboulle · 2016-02-10T14:02:22Z

Could you explain in a little more detail how the afterindex change helped? (how were the events getting missed?). thanks!

jonboulle · 2016-02-10T14:06:34Z

history: https://github.com/coreos/fleet/pull/751/commits /cc @tixxdz

jonboulle · 2016-02-10T14:14:57Z

The concern with this approach is running into #408 again. I think we'd need to add a mitigation, similar to #411 , but could be much simpler: rather than having to "catch up", we can just go straight to 0 again (i.e. latest)

jonboulle · 2016-02-10T14:15:35Z

Would love to know more about how you're missing events, though

paveltiunov · 2016-02-10T14:58:56Z

Hi @jonboulle !
Looks like when 0 is used as AfterIndex then when multiple events arrived at same time you could miss some of events in case event processing time is big enough. Particularly there's a situation where "Reconciler triggered" event is missed and reconciliation is delayed for couple of seconds.

paveltiunov · 2016-02-10T15:54:33Z

Very tricky thing here is that const in agent/reconcile.go

const (
    // time between triggering reconciliation routine
    reconcileInterval = 5 * time.Second
)

^ Update is tied to that time. So no matter what engine_reconcile_interval you have you won't be able to update more frequently than once per 5 second.

paveltiunov · 2016-02-10T16:21:19Z

This commit bcwaldon@bdf5f72 states

Since the content of events is no longer used to drive the Engine or the Agent, handling all events in order no longer matters.

Actually content of events is used here:

    switch path.Base(res.Node.Key) {
    case "target-state":
        ev = JobTargetStateChangeEvent
        ok = true
    case "target":
        ev = JobTargetChangeEvent
        ok = true
    }

So 0 AfterIndex can be used if this code will return always ok = true -- it's where events are missed I believe.

kayrus · 2016-02-18T14:07:11Z

@paveltiunov which etcd version do you use for fleet?

paveltiunov · 2016-02-22T21:47:45Z

etcd version 0.4.9

kayrus · 2016-02-24T13:55:08Z

@paveltiunov etcd 0.4.x has lots of bugs and architectural flaws. can you upgrade your etcd cluster to the latest 2.2?

Please note that you have to upgrade 0.4.x to 2.0.x, then 2.0.x to 2.1.x, and then 2.1.x to 2.2.x.

kayrus · 2016-02-26T11:07:20Z

Relates to #1363

kayrus · 2016-02-29T08:18:35Z

@paveltiunov were you able to upgrade etcd cluster and test fleet with the new one?

jonboulle · 2016-03-01T14:30:51Z

Bumping milestone for now.

@paveltiunov As @kayrus mentioned, we'd love to know if you can replicate the problem with a recent (2.2+) version of etcd

paveltiunov · 2016-03-01T14:38:54Z

@kayrus is there some automatic way to upgrade? Upgrading production environment with proposed method doesn't seem reasonable for us. What release version of CoreOS is using etcd 2.2?

kayrus · 2016-03-01T15:33:51Z

@paveltiunov latest coreos stable release ships etcd 2.2.0. and I would like to inform you that etcd v0.4.x binary will be removed in future coreos releases.

kayrus · 2016-03-01T15:49:25Z

@paveltiunov, automatic upgrades work in 2.x branch only. but I strongly recommend you to create a backup before each upgrade.

paveltiunov · 2016-03-01T17:32:55Z

@kayrus Sorry. My bad. I forgot etcd 2 is etcd2 binary. Thanks for pointing this out. Our version is

etcd Version: 2.2.0
Git SHA: e4561dd
Go Version: go1.5.1
Go OS/Arch: linux/amd64

So we're up to date here.

kayrus · 2016-03-01T18:00:44Z

@paveltiunov great. so we have to get more info at this point.

how many units are scheduled in your cluster?
how many etcd members do you have?
how many fleet machines do you have?
do you use custom configuration for etcd and fleet? can you provide your cloud-config you use to configure your coreos nodes?
which coreos release do you use?

dongsupark · 2016-10-31T17:26:04Z

I had a closer look into this issue.

Even if etcd.WatcherOptions.AfterIndex becomes non-zero, it doesn't mean that it could prevent events from being missed. The WatcherOptions is passed to kAPI.Watcher in etcd, and httpWatcher.Next() always updates hw.nextWait.WaitIndex the ModifiedIndex, regardless of the initial AfterIndex.

Also it's true that fleet doesn't make use of content of the returned event, in agent or engine, as noted in bcwaldon@bdf5f72. I think it's hard to find a logical relation between AfterIndex and the missing events.

The only scenario I can imagine is etcd being overloaded due to a well-known issue, which results in temporary event misses. That could be eventually fixed by using etcd v3 and fleet's recent versions.

Even if it's worth setting AfterIndex to non-zero, I think the long term plan should be just to change fleet to use new API from etcd v3, based on protobuf.

So I would suggest closing this issue and #1363, unless the original author could give another feedback.

paveltiunov added a commit to paveltiunov/fleet that referenced this issue Sep 26, 2015

registry: add AfterIndex event watcher option

f22f1aa

AfterIndex watcher option usage ensures registry event watcher won't miss events that trigger Agent reconciliation. Fixes coreos#1362

paveltiunov mentioned this issue Sep 26, 2015

registry: add AfterIndex event watcher option #1363

Closed

jonboulle added the kind/support label Feb 10, 2016

jonboulle added this to the v0.12.0 milestone Feb 10, 2016

jonboulle modified the milestones: v1.0.0, v0.12.0 Mar 1, 2016

dongsupark closed this as completed Nov 1, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Delay between fleet service submit and actual systemd service start #1362

Delay between fleet service submit and actual systemd service start #1362

paveltiunov commented Sep 26, 2015

paveltiunov commented Sep 26, 2015

paveltiunov commented Nov 21, 2015

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

kayrus commented Feb 18, 2016

paveltiunov commented Feb 22, 2016

kayrus commented Feb 24, 2016

kayrus commented Feb 26, 2016

kayrus commented Feb 29, 2016

jonboulle commented Mar 1, 2016

paveltiunov commented Mar 1, 2016

kayrus commented Mar 1, 2016

kayrus commented Mar 1, 2016

paveltiunov commented Mar 1, 2016

kayrus commented Mar 1, 2016

dongsupark commented Oct 31, 2016

Delay between fleet service submit and actual systemd service start #1362

Delay between fleet service submit and actual systemd service start #1362

Comments

paveltiunov commented Sep 26, 2015

paveltiunov commented Sep 26, 2015

paveltiunov commented Nov 21, 2015

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

jonboulle commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

paveltiunov commented Feb 10, 2016

kayrus commented Feb 18, 2016

paveltiunov commented Feb 22, 2016

kayrus commented Feb 24, 2016

kayrus commented Feb 26, 2016

kayrus commented Feb 29, 2016

jonboulle commented Mar 1, 2016

paveltiunov commented Mar 1, 2016

kayrus commented Mar 1, 2016

kayrus commented Mar 1, 2016

paveltiunov commented Mar 1, 2016

kayrus commented Mar 1, 2016

dongsupark commented Oct 31, 2016