Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt docs about metrics of Streams according to KIP-444 #8171

Merged
merged 4 commits into from
Feb 27, 2020

Conversation

cadonna
Copy link
Contributor

@cadonna cadonna commented Feb 26, 2020

Adapts the docs about metrics of Streams according to https://cwiki.apache.org/confluence/display/KAFKA/KIP-444%253A+Augment+metrics+for+Kafka+Streams

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@cadonna
Copy link
Contributor Author

cadonna commented Feb 26, 2020

Call for review @guozhangwang

@bbejeck
Copy link
Member

bbejeck commented Feb 26, 2020

Ok to test

Copy link
Contributor

@guozhangwang guozhangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cadonna , made a pass.

docs/ops.html Outdated
@@ -85,16 +85,16 @@ <h4><a id="basic_ops_restarting" href="#basic_ops_restarting">Graceful shutdown<

<h4><a id="basic_ops_leader_balancing" href="#basic_ops_leader_balancing">Balancing leadership</a></h4>

Whenever a broker stops or crashes, leadership for that broker's partitions transfers to other replicas. When the broker is restarted it will only be a follower for all its partitions, meaning it will not be used for client reads and writes.
Whenever a broker stops or crashes leadership for that broker's partitions transfers to other replicas. This means that by default when the broker is restarted it will only be a follower for all its partitions, meaning it will not be used for client reads and writes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are those non-streams changes piggy-backed or it was due to unclean rebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad! Will remove.

<td>poll-rate</td>
<td>The average number of polls per second.</td>
<td>kafka.streams:type=stream-metrics,client-id=([-.\w]+)</td>
</tr>
<td>kafka.streams:type=stream-thread-metrics,thread-id=([-.\w]+)</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: polls -> consumer poll calls? Ditto below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

docs/ops.html Outdated
<tr>
<td>process-rate</td>
<td>The average number of processed records per second across all source processor nodes of this task.</td>
<td>kafka.streams:type=stream-thread-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),processor-node-id=all</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be stream-task-metrics right?

Also do you remember what's the rationale of adding the processor-node-id=all tag, could we just drop this from this task sensor -- I know it maybe too late to drop, but wondering why we needed this tag in the map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

I guess we have that tag because the metric is a roll-up metric which measures across all processor nodes. I took over the tag from the previous built-in metrics version.

I removed the tag from the docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually in the source code we do add the processor-node-id=all so we cannot just remove it from the docs -- what I was asking is, if we can remove this tag in the code, so that we can remove it in the docs -- anyways we need to make the doc consistent with code, which currently add this tag to the map, and I'm wondering if it is necessary or can be removed (in both code and then docs).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I do not see a reason we would need this tag. However, I have to try to remove it to be completely sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, could you try removing that in the source code and then see if that it still okay from unit tests?

<tr>
<td>process-rate</td>
<td>The average number of process operations per second. </td>
<td>kafka.streams:type=stream-processor-node-metrics,client-id=([-.\w]+),task-id=([-.\w]+),processor-node-id=([-.\w]+)</td>
<td>The average number of records processed by a source processor node per second.</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to add a note at the beginning of Processor Node Metrics that the following four metrics are only available in certain types of nodes, e.g. process-rate and process-total would only be available for source nodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

<td>restore-total</td>
<td>The total number of restore calls for this store.</td>
<td>kafka.streams:type=stream-[store-scope]-metrics,client-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)</td>
<td>suppression-buffer-size-avg</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, mention that the following metrics are only available for certain types of stores.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@abbccdda abbccdda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes overall LGTM

<tr>
<td>commit-latency-avg</td>
<td>The average commit time in ns for this task. </td>
<td>kafka.streams:type=stream-task-metrics,client-id=([-.\w]+),task-id=([-.\w]+)</td>
<td>The average execution time in ns, for committing.</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own education, why this one is set to ns instead of ms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been always that way. This was not part of KIP-444. I guess because since this metric is on task level, the values may be smaller than on thread level where the metrics measure ms.

<table class="data-table">
<tbody>
<tr>
<td>suppression-buffer-size-current</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did we document the deprecation of suppression related metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were not deprecated but moved to state store metrics.

@guozhangwang
Copy link
Contributor

Just a headsup #8175 is merged now.

@guozhangwang guozhangwang merged commit d2c4212 into apache:trunk Feb 27, 2020
guozhangwang pushed a commit that referenced this pull request Feb 27, 2020
@guozhangwang
Copy link
Contributor

LGTM! Merged to trunk and 2.5

@mumrah
Copy link
Member

mumrah commented Feb 28, 2020

Does this PR resolve https://issues.apache.org/jira/browse/KAFKA-9606 ?

@guozhangwang
Copy link
Contributor

Yes it does, I'm resolving that ticket now. Thanks for reminding!

@mjsax mjsax added the kip Requires or implements a KIP label Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kip Requires or implements a KIP streams
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants