Check outbound connectivity to other nodes #4

Merged: 11 commits, Sep 14, 2022
19 changes: 14 additions & 5 deletions README.md
@@ -1,22 +1,27 @@
# Quorum Node Metrics Exporter

A Docker image based on a [Python script](./source/main.py) to gather additional information about peers of a Quorum Node via the RPC endpoint and to provide metrics in Prometheus format.
A Docker image based on a [Python script](./source/main.py) to provide additional metrics of a Quorum node in Prometheus format.

- Information about peers
- Whether the Quorum node can establish a TCP connection to another peer (useful for detecting firewall misconfigurations).

## Howto

1. Build the Docker image, e.g. `docker build -t REGISTRY/REPO:TAG .`
2. Push to your registry - `docker push REGISTRY/REPO:TAG`
3. There is no Helm chart yet as of 2022-July-15!
3. There is no Helm chart yet!
4. Set the image `.spec.template.spec.containers[0].image` in file [deployment.yaml](./k8s/deployment.yaml).
5. Set `rpc_url` and `peers` in file [configmap.yaml](./k8s/configmap.yaml).
6. Deploy to Kubernetes:
5. Set `namespace`, `deployment`, `rpc_url` and `peers` in file [configmap.yaml](./k8s/configmap.yaml).
6. Set `metadata.namespace` in all Kubernetes YAML files. The exporter must be deployed into the same namespace in which Quorum is running!
7. Deploy to Kubernetes:

```bash
kubectl apply -n=my-custom-namespace -f k8s/configmap.yaml
kubectl apply -n=my-custom-namespace -f k8s/rbac.yaml
kubectl apply -n=my-custom-namespace -f k8s/deployment.yaml
```

7. In case you are using network policies, take a look at [netpol.yaml](./k8s/netpol.yaml) and modify the policies according to your needs.
8. In case you are using network policies, take a look at [netpol.yaml](./k8s/netpol.yaml) and modify the policies according to your needs.

## Grafana Dashboard

@@ -40,6 +45,10 @@ Metrics are provided for current connected peers and for well known peers define
- Description: Quorum peers head block by enode and protocol eth or istanbul
- Labels: instance, instance_name, enode, enode_short, name, protocol
- Values: The latest block of the connected peer
- `quorum_tcp_egress_connectivity`:
- Description: Quorum TCP egress connectivity to other nodes by enode.
- Labels: instance, instance_name, enode, enode_short, name
- Values: 0=no connectivity/an outbound connection cannot be established, 1=connection can be established
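
The 0/1 semantics of `quorum_tcp_egress_connectivity` can be illustrated with a plain TCP connect attempt. This is only a minimal sketch: the function name `tcp_egress_ok` and the timeout are assumptions, and the collector added in this PR (`KubeExecMetricsCollector`, whose source is not shown here) appears to run its check from inside the Quorum pod via `pods/exec` rather than from the exporter itself.

```python
import socket


def tcp_egress_ok(host: str, port: int, timeout: float = 3.0) -> int:
    """Return 1 if a TCP connection to host:port can be opened, else 0.

    Hypothetical helper for illustration; not the exporter's actual code.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return 0
```

Called with a peer's `enodeAddress` and `enodeAddressPort` (converted to `int`), the return value maps directly onto the metric value described above.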

### Metric Labels

32 changes: 24 additions & 8 deletions k8s/configmap.yaml
@@ -4,41 +4,57 @@ metadata:
labels:
app.kubernetes.io/name: quorum-node-metrics-exporter
name: quorum-node-metrics-exporter
namespace: epi-poc-quorum-metrics
namespace: epi-poc-quorum
data:
# The config.
# Required attributes: "namespace", "deployment", "rpc_url" and "peers"
# - "namespace" = The name of the k8s namespace where the Quorum deployment is located
# - "deployment" = The name of the Quorum k8s deployment
# - "rpc_url" = The full URL of the RPC endpoint of the quorum node, e.g. "http://quorum-node-0-rpc.quorum:8545"
# - "peers" = A list of all known peers via their "enode".
# "peers" contains an array of objects. Each object must have attributes "company-name" and "enode".
# "peers" contains an array of objects. Each object must have attributes "company-name", "enode", "enodeAddress" and "enodeAddressPort"
# Note: If there are no known peers, provide an empty array/list of peers.
config.json: |-
{
"namespace": "epi-poc-quorum",
"deployment": "quorum-node-0",
"rpc_url": "http://quorum-node-0-rpc.epi-poc-quorum:8545",
"peers": [
{
"company-name": "company_a",
"enode": "4312d5056db7edf8b6..."
"enode": "4312d5056db7edf8b6...",
"enodeAddress": "1.2.3.4",
"enodeAddressPort": "30303"
},
{
"company-name": "company_a",
"enode": "a36ceb6ccdf5ff8a7c..."
"enode": "a36ceb6ccdf5ff8a7c...",
"enodeAddress": "2.3.4.5",
"enodeAddressPort": "30303"
},
{
"company-name": "company_a",
"enode": "4801af270f75e9352b..."
"enode": "4801af270f75e9352b...",
"enodeAddress": "3.4.5.6",
"enodeAddressPort": "30303"
},
{
"company-name": "company_a",
"enode": "456a860cb1275dd23..."
"enode": "456a860cb1275dd23...",
"enodeAddress": "4.5.6.7",
"enodeAddressPort": "30303"
},
{
"company-name": "company_b",
"enode": "bc03e0353fe10d0261..."
"enode": "bc03e0353fe10d0261...",
"enodeAddress": "5.6.7.8",
"enodeAddressPort": "30303"
},
{
"company-name": "company_c",
"enode": "b06bca847a8c27e7d..."
"enode": "b06bca847a8c27e7d...",
"enodeAddress": "6.7.8.9",
"enodeAddressPort": "30303"
}
]
}
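
The comments in this ConfigMap describe which attributes `config.json` is expected to carry. A minimal sketch of a check for them follows, assuming all four top-level attributes are mandatory as the README step suggests. The function `parse_config` and the constant names are hypothetical, and the PR's own loader (`utils.config.load()`) is not shown here and may behave differently.

```python
import json

# Attribute names taken from the comments in configmap.yaml.
REQUIRED_TOP_LEVEL = ("namespace", "deployment", "rpc_url", "peers")
REQUIRED_PEER = ("company-name", "enode", "enodeAddress", "enodeAddressPort")


def parse_config(text: str) -> dict:
    """Parse config.json text and verify the documented attributes exist.

    Hypothetical helper for illustration; raises ValueError on problems.
    """
    config = json.loads(text)  # JSONDecodeError (a ValueError) on malformed JSON
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in config]
    if missing:
        raise ValueError(f"missing top-level attributes: {missing}")
    if not isinstance(config["peers"], list):
        raise ValueError('"peers" must be a list (use [] if there are no known peers)')
    for index, peer in enumerate(config["peers"]):
        missing = [key for key in REQUIRED_PEER if key not in peer]
        if missing:
            raise ValueError(f"peer #{index} is missing attributes: {missing}")
    return config
```

Failing early on a missing attribute gives a clearer error at startup than a `KeyError` raised later during metric collection.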
5 changes: 3 additions & 2 deletions k8s/deployment.yaml
@@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/name: quorum-node-metrics-exporter
name: quorum-node-metrics-exporter
namespace: epi-poc-quorum-metrics
namespace: epi-poc-quorum
spec:
replicas: 1
selector:
@@ -26,7 +26,8 @@ spec:
app.kubernetes.io/name: quorum-node-metrics-exporter
app.kubernetes.io/instance: quorum-node-metrics-exporter
spec:
automountServiceAccountToken: false
automountServiceAccountToken: true
serviceAccountName: quorum-node-metrics-exporter
containers:
- image: REGISTRY/REPO:TAG
imagePullPolicy: Always
39 changes: 33 additions & 6 deletions k8s/netpol.yaml
@@ -1,8 +1,8 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: ingress-from-prometheus
namespace: epi-poc-quorum-metrics
name: quorum-node-metrics-exporter-ingress-from-prometheus
namespace: epi-poc-quorum
spec:
ingress:
- from:
@@ -24,8 +24,8 @@ spec:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: egress-to-quorum
namespace: epi-poc-quorum-metrics
name: quorum-node-metrics-exporter-egress-to-quorum
namespace: epi-poc-quorum
spec:
egress:
- ports:
@@ -47,8 +47,8 @@ spec:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: egress-to-dns
namespace: epi-poc-quorum-metrics
name: quorum-node-metrics-exporter-egress-to-dns
namespace: epi-poc-quorum
spec:
egress:
- ports:
@@ -64,3 +64,30 @@ spec:
app.kubernetes.io/name: quorum-node-metrics-exporter
policyTypes:
- Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quorum-node-metrics-exporter-egress-to-kubeapi
  namespace: epi-poc-quorum
spec:
  egress:
    - ports:
        - port: 443
          protocol: TCP
      to:
        # The IP Address of the Kube API Service (see service kubernetes.default)
        - ipBlock:
            cidr: 172.20.0.1/32
        # Determine Kube API Endpoint via
        # kubectl get endpoints --namespace default kubernetes
        # Also see https://pauldally.medium.com/accessing-kubernetes-api-server-when-there-is-an-egress-networkpolicy-af4435e005f9
        - ipBlock:
            cidr: 10.0.17.52/32
        - ipBlock:
            cidr: 10.0.58.124/32
  podSelector:
    matchLabels:
      app.kubernetes.io/name: quorum-node-metrics-exporter
  policyTypes:
    - Egress
36 changes: 36 additions & 0 deletions k8s/rbac.yaml
@@ -0,0 +1,36 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: quorum-node-metrics-exporter
  namespace: epi-poc-quorum
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    # https://github.com/kubernetes-client/python/issues/690
    verbs: ["create", "watch", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: quorum-node-metrics-exporter
  namespace: epi-poc-quorum
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: quorum-node-metrics-exporter
subjects:
  - kind: ServiceAccount
    name: quorum-node-metrics-exporter
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: quorum-node-metrics-exporter
  namespace: epi-poc-quorum
---
55 changes: 37 additions & 18 deletions source/main.py
@@ -1,38 +1,57 @@
"""Main program
"""
import logging
import signal
import sys
import threading
from utils.ConfigLoader import ConfigLoader
from utils.CustomCollector import CustomCollector
from utils.MetricsProvider import MetricsProvider

from prometheus_client import start_http_server
from prometheus_client.core import REGISTRY

if __name__ == '__main__':
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)
import utils.config
from utils.kube_exec_metrics_collector import KubeExecMetricsCollector
from utils.rpc_metrics_collector import RpcMetricsCollector


def main() -> int:
"""Main

Returns:
int: Return code
"""
logging.basicConfig(
format='%(levelname)s: %(message)s', level=logging.INFO)

# Load Config
sleep_time = 10.0
config = ConfigLoader.load()
if not config:
sys.exit(1)
SLEEP_TIME = 10.0
config = utils.config.load()
if config is None:
return 1

# Init MetricsProviders and register CustomCollectors
rpc_metrics_collector = RpcMetricsCollector(config)
REGISTRY.register(rpc_metrics_collector)

# Init MetricsProvider and register CustomCollector
metrics_provider = MetricsProvider(config=config)
custom_collector = CustomCollector(metrics_provider=metrics_provider)
REGISTRY.register(custom_collector)
kube_exec_metrics_collector = KubeExecMetricsCollector(config)
REGISTRY.register(kube_exec_metrics_collector)

# Start up the server to expose the metrics.
start_http_server(8000)

# Graceful and fast shutdown
quit_event = threading.Event()
# https://stackoverflow.com/questions/862412/is-it-possible-to-have-multiple-statements-in-a-python-lambda-expression
signal.signal(signal.SIGTERM, lambda *_args: (logging.info("SIGTERM received") and False) or quit_event.set())
signal.signal(signal.SIGTERM,
lambda *_args: (logging.info("SIGTERM received") and False) or quit_event.set())
while not quit_event.is_set():
logging.info("Preparing metrics - rpc_url=%s", config.rpc_url)
metrics_provider.process()
logging.info("Done. Sleeping for %s seconds", sleep_time)
quit_event.wait(timeout=sleep_time)
logging.info("Preparing metrics")
rpc_metrics_collector.process()
kube_exec_metrics_collector.process()
logging.info("Done. Sleeping for %s seconds", SLEEP_TIME)
quit_event.wait(timeout=SLEEP_TIME)

logging.info("Leaving - quit_event.is_set()=%s", quit_event.is_set())
return 0

if __name__ == '__main__':
sys.exit(main())
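
The shutdown logic in `main()` relies on a `threading.Event` that the SIGTERM handler sets, with `Event.wait(timeout=...)` doubling as an interruptible sleep. A minimal sketch of the same pattern follows, using a named handler function instead of the lambda expression. The function `run_until_sigterm` and its `process` parameter are hypothetical stand-ins for the collectors' `process()` calls, not code from this PR.

```python
import logging
import signal
import threading


def run_until_sigterm(process, sleep_time: float = 10.0) -> int:
    """Call process() repeatedly until SIGTERM arrives, then return 0."""
    quit_event = threading.Event()

    def on_sigterm(_signum, _frame) -> None:
        # A named handler avoids chaining two statements inside a lambda.
        logging.info("SIGTERM received")
        quit_event.set()

    # Must be called from the main thread of the interpreter.
    signal.signal(signal.SIGTERM, on_sigterm)

    while not quit_event.is_set():
        process()
        # wait() returns as soon as the event is set, so shutdown is not
        # delayed by a full sleep interval.
        quit_event.wait(timeout=sleep_time)
    return 0
```

Compared with `time.sleep`, waiting on the event lets the pod terminate promptly within the Kubernetes grace period instead of finishing a full sleep cycle first.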
17 changes: 15 additions & 2 deletions source/requirements.txt
@@ -1,6 +1,19 @@
certifi==2022.6.15
charset-normalizer==2.1.0
charset-normalizer==2.1.1
idna==3.3
prometheus-client==0.14.1
requests==2.28.1
urllib3==1.26.10
urllib3==1.26.12
## The following requirements were added by pip freeze:
cachetools==5.2.0
google-auth==2.11.0
kubernetes==24.2.0
oauthlib==3.2.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.8.2
PyYAML==6.0
requests-oauthlib==1.3.1
rsa==4.9
six==1.16.0
websocket-client==1.4.1
51 changes: 0 additions & 51 deletions source/utils/Config.py

This file was deleted.
