rafthttp: probe connection for Raft message transport #10022

gyuho · 2018-08-17T23:46:29Z

In our production cluster, we found one TCP connection had >8-sec
latencies to a remote peer, but "etcd_network_peer_round_trip_time_seconds"
metrics shows <1-sec latency distribution, which means we
weren't sampling enough, or all the latency spikes happen
outside of snapshot pipeline connection. The later is most likely
the case, since the cluster had leader elections from missing
heartbeats.

This PR adds another probing routine to monitor the connection
for Raft message transports.

We need to track which connection had high latency spikes.

etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_RAFT_MESSAGE",To="729934363faa4a24",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_RAFT_MESSAGE",To="729934363faa4a24",le="0.0002"} 1
etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_SNAPSHOT",To="729934363faa4a24",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_SNAPSHOT",To="729934363faa4a24",le="0.0002"} 1

@jpbetz Would adding ConnectionType="ROUND_TRIPPER_SNAPSHOT" label break anything in your monitoring systems? I want to backport this as well. Currently, probing doesn't tell much, since it only tracks snapshot sender connection...

…econds" metric Currently, only v2 metrics ("stats.FollowerStats") tracks Raft message send latencies. Add Prometheus histogram to track Raft messages for writes, since heartbeats are probed (see etcd-io#10022) and snapshots are already being tracked via etcd-io#9997. ``` etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgProp",le="0.0001"} 1 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgProp",le="0.0002"} 1 etcd_network_raft_send_total_duration_seconds_bucket{To="729934363faa4a24",Type="MsgApp",le="0.0001"} 9 etcd_network_raft_send_total_duration_seconds_bucket{To="729934363faa4a24",Type="MsgApp",le="0.0002"} 9 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgAppResp",le="0.0001"} 8 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgAppResp",le="0.0002"} 8 ``` Signed-off-by: Gyuho Lee <[email protected]>

gyuho · 2018-08-18T00:43:32Z

Address #9438.

codecov-io · 2018-08-18T03:23:04Z

Codecov Report

Merging #10022 into master will decrease coverage by 0.03%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master   #10022      +/-   ##
==========================================
- Coverage   71.59%   71.56%   -0.04%     
==========================================
  Files         390      390              
  Lines       36258    36266       +8     
==========================================
- Hits        25960    25952       -8     
- Misses       8488     8505      +17     
+ Partials     1810     1809       -1

Impacted Files	Coverage Δ
etcdserver/api/rafthttp/metrics.go	`100% <ø> (ø)`	⬆️
etcdserver/api/rafthttp/transport.go	`83.87% <100%> (+0.45%)`	⬆️
etcdserver/api/rafthttp/probing_status.go	`57.44% <83.33%> (-0.34%)`	⬇️
clientv3/leasing/util.go	`91.66% <0%> (-6.67%)`	⬇️
pkg/netutil/netutil.go	`63.11% <0%> (-6.56%)`	⬇️
etcdctl/ctlv3/command/lease_command.go	`65.34% <0%> (-5.95%)`	⬇️
pkg/transport/listener.go	`58.67% <0%> (-4.09%)`	⬇️
etcdserver/v2_server.go	`80.76% <0%> (-3.85%)`	⬇️
proxy/grpcproxy/watch.go	`89.44% <0%> (-3.11%)`	⬇️
lease/leasehttp/http.go	`63.97% <0%> (-2.95%)`	⬇️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34fcaba...37cf84c. Read the comment docs.

…econds" metric Currently, only v2 metrics ("stats.FollowerStats") tracks Raft message send latencies. Add Prometheus histogram to track Raft messages for writes, since heartbeats are probed (see etcd-io#10022) and snapshots are already being tracked via etcd-io#9997. ``` etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgProp",le="0.0001"} 1 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgProp",le="0.0002"} 1 etcd_network_raft_send_total_duration_seconds_bucket{To="729934363faa4a24",Type="MsgApp",le="0.0001"} 9 etcd_network_raft_send_total_duration_seconds_bucket{To="729934363faa4a24",Type="MsgApp",le="0.0002"} 9 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgAppResp",le="0.0001"} 8 etcd_network_raft_send_total_duration_seconds_bucket{To="7339c4e5e833c029",Type="MsgAppResp",le="0.0002"} 8 ``` Signed-off-by: Gyuho Lee <[email protected]>

Preliminary work to add prober to "streamRt" Signed-off-by: Gyuho Lee <[email protected]>

Signed-off-by: Gyuho Lee <[email protected]>

In our production cluster, we found one TCP connection had >8-sec latencies to a remote peer, but "etcd_network_peer_round_trip_time_seconds" metrics shows <1-sec latency distribution, which means we weren't sampling enough, or all the latency spikes happen outside of snapshot pipeline connection. The later is most likely the case, since the cluster had leader elections from missing heartbeats. This PR adds another probing routine to monitor the connection for Raft message transports. Signed-off-by: Gyuho Lee <[email protected]>

We need to track which connection had high latency spikes. ``` etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_RAFT_MESSAGE",To="729934363faa4a24",le="0.0001"} 0 etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_RAFT_MESSAGE",To="729934363faa4a24",le="0.0002"} 1 etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_SNAPSHOT",To="729934363faa4a24",le="0.0001"} 0 etcd_network_peer_round_trip_time_seconds_bucket{ConnectionType="ROUND_TRIPPER_SNAPSHOT",To="729934363faa4a24",le="0.0002"} 1 ``` Signed-off-by: Gyuho Lee <[email protected]>

Fix ``` panic: runtime error: invalid memory address or nil pointer dereference [recovered] panic: runtime error: invalid memory address or nil pointer dereference ``` Signed-off-by: Gyuho Lee <[email protected]>

gyuho · 2018-08-29T22:51:22Z

@wenjiaswe @jpbetz I meant to ask about this metrics as well. Would adding ConnectionType break anything in GKE?

gyuho · 2018-08-29T23:02:50Z

AWS does not use this metrics, so this should be safe for us.

jpbetz · 2018-08-29T23:18:37Z

Let's hold on this until we can confirm how label additions handled. We're primarily concerned with how https://github.com/GoogleCloudPlatform/k8s-stackdriver/tree/master/prometheus-to-sd handles this case. @wenjiaswe is checking now.

gyuho · 2018-08-29T23:25:47Z

@jpbetz No rush. Thanks for checking!

wenjiaswe

Hello @gyuho , I checked in prometheus-to-sd translator, @jpbetz is correct, label addition does break metric push in GKE. Please refer to code review for details. Everything else LGTM.

wenjiaswe · 2018-09-05T18:09:49Z

etcdserver/api/rafthttp/metrics.go

@@ -143,7 +143,7 @@ var (
 		// highest bucket start of 0.0001 sec * 2^15 == 3.2768 sec
 		Buckets: prometheus.ExponentialBuckets(0.0001, 2, 16),
 	},
-		[]string{"To"},
+		[]string{"ConnectionType", "To"},


@gyuho unfortunately, label addition will break GKE. I checked https://github.com/GoogleCloudPlatform/k8s-stackdriver/blob/master/prometheus-to-sd, for metrics with prefix "container.googleapis.com" (e.g., etcd metrics), if definition (for example, label is one of the definition) of the metric was changed, then the metric is marked as broken and the metric is not going to be pushed: https://github.com/GoogleCloudPlatform/k8s-stackdriver/blob/a39287c62ca7c1ffaa3cac4887f6a92c1943abab/prometheus-to-sd/translator/metric_descriptor_cache.go#L72. prometheus-to-sd only UpdateMetricDescriptors if it contains "custom.googleapis.com" prefix:https://github.com/GoogleCloudPlatform/k8s-stackdriver/blob/1047589c380eed64c94484e7ab989db53d1b907b/prometheus-to-sd/main.go#L141.

gyuho · 2018-09-05T18:26:04Z

@wenjiaswe @jpbetz Thanks for checking!

Then, let's close this.

I will add a separate metrics.

jpbetz · 2018-09-05T19:51:36Z

@wenjiaswe Would you also file an issue somewhere to track this limitation. Many label additions are conceptually backward compatible since their introduction would not break existing readers that are unaware of the label, so improving our metrics processing pipeline tolerant of label additions seems desirable and useful.

gyuho requested review from jpbetz and xiang90 August 17, 2018 23:46

gyuho added WIP labels Aug 17, 2018

gyuho mentioned this pull request Aug 18, 2018

rafthttp: add Raft send latency metric for writes #10023

Closed

gyuho added 5 commits August 29, 2018 15:17

etcdserver/api/rafthttp: rename to "pipelineProber"

8446d14

Preliminary work to add prober to "streamRt" Signed-off-by: Gyuho Lee <[email protected]>

etcdserver/api/rafthttp: display roundtripper name in warnings

f0d7043

Signed-off-by: Gyuho Lee <[email protected]>

etcdserver/api/rafthttp: fix prober test failures

37cf84c

Fix ``` panic: runtime error: invalid memory address or nil pointer dereference [recovered] panic: runtime error: invalid memory address or nil pointer dereference ``` Signed-off-by: Gyuho Lee <[email protected]>

gyuho force-pushed the ggg branch from 441d7c0 to 37cf84c Compare August 29, 2018 22:17

gyuho removed the WIP label Aug 29, 2018

wenjiaswe requested changes Sep 5, 2018

View reviewed changes

gyuho closed this Sep 5, 2018

gyuho removed the Release-Note label Sep 5, 2018

gyuho mentioned this pull request Oct 7, 2018

rafthttp: probe all raft transports #10155

Merged

wenjiaswe mentioned this pull request Oct 9, 2018

Label addition to metrics with prefix "container.googleapis.com" should not break GCP/GKE? GoogleCloudPlatform/k8s-stackdriver#214

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rafthttp: probe connection for Raft message transport #10022

rafthttp: probe connection for Raft message transport #10022

gyuho commented Aug 17, 2018

gyuho commented Aug 18, 2018

codecov-io commented Aug 18, 2018 •

edited

Loading

gyuho commented Aug 29, 2018

gyuho commented Aug 29, 2018

jpbetz commented Aug 29, 2018

gyuho commented Aug 29, 2018

wenjiaswe left a comment

wenjiaswe Sep 5, 2018

gyuho commented Sep 5, 2018

jpbetz commented Sep 5, 2018

rafthttp: probe connection for Raft message transport #10022

rafthttp: probe connection for Raft message transport #10022

Conversation

gyuho commented Aug 17, 2018

gyuho commented Aug 18, 2018

codecov-io commented Aug 18, 2018 • edited Loading

Codecov Report

gyuho commented Aug 29, 2018

gyuho commented Aug 29, 2018

jpbetz commented Aug 29, 2018

gyuho commented Aug 29, 2018

wenjiaswe left a comment

Choose a reason for hiding this comment

wenjiaswe Sep 5, 2018

Choose a reason for hiding this comment

gyuho commented Sep 5, 2018

jpbetz commented Sep 5, 2018

codecov-io commented Aug 18, 2018 •

edited

Loading