[CCR] Add total fetch time took stat #34577

martijnvg · 2018-10-18T07:40:52Z

This new stat keeps track of how much time was spent on fetches, as measured from the leader cluster's perspective.

The current name of this new stat is total_fetch_took_time_millis, which doesn't describe well what the stat represents. I'm not sure what a good name would be. Maybe total_fetch_leader_time_millis or total_fetch_remote_time_millis, to indicate that it is the same as total_fetch_time_millis but from a different perspective? cc: @bleskes / @jasontedor

elasticmachine · 2018-10-18T07:40:54Z

Pinging @elastic/es-distributed

bleskes · 2018-10-18T14:38:16Z

I think I'm good with the following:

1. Changes response has a took time field
2. Shard Follow Node Task sums the value of that field under total_fetch_leader_time_millis
3. We have a total_fetch_time_millis as we do today.
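
To make the three points above concrete, here is a minimal sketch of how a follower-side task might keep the two counters apart. The class and method names (`ChangesResponseSketch`, `FetchStatsSketch`, `onFetchCompleted`) are hypothetical and are not taken from the PR; only the stat names come from this thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the shard changes response; only the took time matters here.
final class ChangesResponseSketch {
    private final long tookTimeMillis;

    ChangesResponseSketch(long tookTimeMillis) {
        this.tookTimeMillis = tookTimeMillis;
    }

    long getTookTimeMillis() {
        return tookTimeMillis;
    }
}

// Hypothetical follower-side accumulator (an assumption, not the real ShardFollowNodeTask).
final class FetchStatsSketch {
    // Wall-clock time of each fetch as observed by the follower (includes network time).
    private final AtomicLong totalFetchTimeMillis = new AtomicLong();
    // Sum of the took times reported by the leader in each response.
    private final AtomicLong totalFetchLeaderTimeMillis = new AtomicLong();

    void onFetchCompleted(long fetchStartNanos, long fetchEndNanos, ChangesResponseSketch response) {
        totalFetchTimeMillis.addAndGet(TimeUnit.NANOSECONDS.toMillis(fetchEndNanos - fetchStartNanos));
        totalFetchLeaderTimeMillis.addAndGet(response.getTookTimeMillis());
    }

    long totalFetchTimeMillis() {
        return totalFetchTimeMillis.get();
    }

    long totalFetchLeaderTimeMillis() {
        return totalFetchLeaderTimeMillis.get();
    }
}
```

Under these assumptions, the difference between the two totals gives a rough estimate of time spent outside the leader, such as network transfer.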

martijnvg · 2018-10-18T16:07:37Z

Yes, this PR keeps the original total_fetch_time_millis and adds a took time field to the shard changes API response. I will rename total_fetch_took_time_millis to total_fetch_leader_time_millis.

jasontedor

I wrote more words than necessary to say we should start the clock sooner.

jasontedor · 2018-10-18T21:55:48Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java

        }

        @Override
        protected void asyncShardOperation(
                final Request request,
                final ShardId shardId,
                final ActionListener<Response> listener) throws IOException {
+            request.relativeStartNanos = System.nanoTime();


I think this is the wrong place to start the clock. The goal with this metric is to capture the non-network time on the leader, so we want to start the clock as soon as possible. Right now these requests are handled on the network thread before we go async, so there is not much in terms of queuing time. But if that ever changed and this request were handled on some other thread pool before going async, we would miss the queuing time in that thread pool, and that would be a bug. (We also miss internal Netty queuing, but I don't think there's anything obvious that we can do about that.) This is also something to think about in the context of the NIO work. I also don't like that we are mutating the request; it's going to make it harder to have immutable request objects.

I think we should start the clock the moment the request is de-serialized.
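
As an illustration of the suggestion above, here is a minimal sketch of a request that records its start time while it is being deserialized and stays immutable afterwards. Everything in it is an assumption made for illustration: `TimedRequestSketch`, its single `fromSeqNo` field, and the use of `java.io.DataInputStream` in place of the transport layer's stream type are not taken from the PR. The clock is injected as a `LongSupplier` so it can be controlled in tests; a real caller would presumably pass `System::nanoTime`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical request type; not the real ShardChangesAction.Request.
final class TimedRequestSketch {
    private final long fromSeqNo;
    // Final field: set once during deserialization, never mutated by the handler.
    private final long relativeStartNanos;

    // Deserializing constructor: the clock starts before any payload is read.
    TimedRequestSketch(DataInputStream in, LongSupplier nanoClock) throws IOException {
        this.relativeStartNanos = nanoClock.getAsLong();
        this.fromSeqNo = in.readLong();
    }

    long fromSeqNo() {
        return fromSeqNo;
    }

    // Elapsed time since deserialization, to be reported as the took time of the response.
    long tookMillis(LongSupplier nanoClock) {
        return TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - relativeStartNanos);
    }

    // Helper that produces the wire bytes this sketch expects.
    static byte[] encode(long fromSeqNo) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(fromSeqNo);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper that deserializes a request from wire bytes without checked exceptions.
    static TimedRequestSketch decode(byte[] wireBytes, LongSupplier nanoClock) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wireBytes))) {
            return new TimedRequestSketch(in, nanoClock);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the start time is captured in the constructor, any queuing that happens between deserialization and the async shard operation is included in the measured time, and no setter or public mutable field is needed on the request.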

martijnvg · 2018-10-22T08:19:31Z

@bleskes @jasontedor I've updated this PR.

jasontedor

LGTM.

Add total fetch time leader stat, that keeps track how much time was spent on fetches from the leader cluster perspective.

[CCR] Add total fetch time took stat, that keeps track how much time was spent on fetches from the leader cluster perspective.

df6f966

martijnvg added the >non-issue, WIP, and :Distributed Indexing/CCR labels Oct 18, 2018

jasontedor requested changes Oct 18, 2018

martijnvg added 3 commits October 22, 2018 09:20

Merge remote-tracking branch 'es/master' into ccr_total_fetch_took_stat

77caf20

start the clock the moment the request is de-serialized.

2d752cf

Renamed total_fetch_took_time_millis to `total_fetch_leader_time_millis`.

d0a6177

martijnvg removed the WIP label Oct 23, 2018

Merge remote-tracking branch 'es/master' into ccr_total_fetch_took_stat

d6c7532

jasontedor approved these changes Oct 23, 2018

martijnvg merged commit e6d87cc into elastic:master Oct 23, 2018

martijnvg added a commit that referenced this pull request Oct 23, 2018

[CCR] Add total fetch time leader stat (#34577)

ce80b92

Add total fetch time leader stat, that keeps track how much time was spent on fetches from the leader cluster perspective.

kcm pushed a commit that referenced this pull request Oct 30, 2018

[CCR] Add total fetch time leader stat (#34577)

3e1c17f

Add total fetch time leader stat, that keeps track how much time was spent on fetches from the leader cluster perspective.
