Update telemetry adoption guide
yadneshk committed Mar 27, 2024
1 parent 949ddcf commit 7b4482f
Showing 17 changed files with 244 additions and 92 deletions.
76 changes: 42 additions & 34 deletions docs_user/modules/openstack-autoscaling_adoption.adoc
@@ -20,12 +20,12 @@ should be already adopted.

== Procedure - Autoscaling adoption

Patch OpenStackControlPlane to deploy autoscaling services:
=== Patch OpenStackControlPlane to deploy autoscaling services:

----
cat << EOF > aodh_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
autoscaling:
telemetry:
enabled: true
prometheus:
deployPrometheus: false
@@ -53,29 +53,45 @@ endif::[]
EOF
----


____
If you have previously backed up your OpenStack services configuration file from the old environment, you can use os-diff to compare and make sure the configuration is correct. For more information, see xref:pulling-the-openstack-configuration_{context}[Pulling the OpenStack configuration].
____
=== Install cluster-observability-operator

----
pushd os-diff
./os-diff cdiff --service aodh -c /tmp/collect_tripleo_configs/aodh/etc/aodh/aodh.conf -o aodh_patch.yaml
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-observability-operator
  namespace: openshift-operators
spec:
  channel: development
  installPlanApproval: Automatic
  name: cluster-observability-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
----

____
This will produce the difference between both ini configuration files.
____
=== Wait for the installation to succeed

Patch OpenStackControlPlane to deploy Aodh services:
----
oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators
----
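
`oc wait` can exit with an error right away when no ClusterServiceVersion matches the label selector yet, which may happen shortly after the Subscription is created. A minimal sketch of a retry wrapper follows. The `retry` helper name, the attempt count, and the delay are assumptions for illustration and are not part of the adoption guide.

```shell
# Hypothetical helper (not part of the guide): run a command up to
# ATTEMPTS times, sleeping DELAY seconds between failed tries.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage (assumes a logged-in oc session; the numbers are illustrative):
# retry 30 10 oc wait --for jsonpath="{.status.phase}"=Succeeded csv \
#   --namespace=openshift-operators \
#   -l operators.coreos.com/cluster-observability-operator.openshift-operators
```

The wrapper only repeats the command; the success condition itself is still the `oc wait` expression shown in the step above.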

=== Enable metrics storage backend
----
oc patch openstackcontrolplane openstack --type=merge --patch-file aodh_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    enabled: true
    template:
      metricStorage:
        enabled: true
'
----

== Post-checks

=== If autoscaling services are enabled inspect Aodh pods
=== Verify Aodh pods and service endpoints

----
AODH_POD=`oc get pods -l service=aodh | tail -n 1 | cut -f 1 -d' '`
@@ -86,29 +102,21 @@ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.conf

----
openstack endpoint list | grep aodh
| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | aodh | network | True | public | http://aodh-public-openstack.apps-crc.testing |
| b943243e596847a9a317c8ce1800fa98 | regionOne | aodh | network | True | internal | http://aodh-internal.openstack.svc:9696 |
| f97f2b8f7559476bb7a5eafe3d33cee7 | regionOne | aodh | network | True | admin | http://192.168.122.99:9696 |
| d05d120153cd4f9b8310ac396b572926 | regionOne | aodh | alarming | True | internal | http://aodh-internal.openstack.svc:8042 |
| d6daee0183494d7a9a5faee681c79046 | regionOne | aodh | alarming | True | public | http://aodh-public.openstack.svc:8042 |
----

=== Create sample resources

You can test whether you can create alarms.
=== Verify metric storage pods are deployed

----
openstack alarm create \
--name low_alarm \
--type gnocchi_resources_threshold \
--metric cpu \
--resource-id b7ac84e4-b5ca-4f9e-a15c-ece7aaf68987 \
--threshold 35000000000 \
--comparison-operator lt \
--aggregation-method rate:mean \
--granularity 300 \
--evaluation-periods 3 \
--alarm-action 'log:\\' \
--ok-action 'log:\\' \
--resource-type instance
oc get pods -l alertmanager=metric-storage
NAME READY STATUS RESTARTS AGE
alertmanager-metric-storage-0 2/2 Running 0 17h
alertmanager-metric-storage-1 2/2 Running 0 17h
oc get pods -l prometheus=metric-storage
NAME READY STATUS RESTARTS AGE
prometheus-metric-storage-0 3/3 Running 0 17h
----
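
The check on the metric storage pods can also be done mechanically instead of by eye. The sketch below assumes the default `oc get pods` column layout (NAME, READY, STATUS, ...) including the header row; the `all_pods_ready` function name is hypothetical.

```shell
# Hypothetical helper (not part of the guide): read `oc get pods` table
# output on stdin, header row included. Succeeds only if at least one pod
# row exists and every row is Running with all containers ready (n/n).
all_pods_ready() {
  awk 'NR > 1 {
         rows++
         split($2, ready, "/")          # READY column, e.g. "2/2"
         if ($3 != "Running" || ready[1] != ready[2]) bad++
       }
       END { exit !(rows > 0 && bad == 0) }'
}

# Example usage (assumes a logged-in oc session):
# oc get pods -l alertmanager=metric-storage | all_pods_ready && echo "alertmanager ready"
# oc get pods -l prometheus=metric-storage | all_pods_ready && echo "prometheus ready"
```

Because the first line is skipped as a header, the function should not be combined with `--no-headers`.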

//=== (TODO)
22 changes: 16 additions & 6 deletions docs_user/modules/openstack-backend_services_deployment.adoc
@@ -279,14 +279,24 @@ spec:
metallb.universe.tf/loadBalancerIPs: 172.17.0.86
spec:
type: LoadBalancer
ceilometer:
telemetry:
enabled: false
template: {}
template:
ceilometer:
enabled: false
template: {}
autoscaling:
enabled: false
template: {}
autoscaling:
enabled: false
template: {}
metricStorage:
enabled: false
template: {}
logging:
enabled: false
template: {}
EOF
----

1 change: 1 addition & 0 deletions docs_user/modules/openstack-edpm_adoption.adoc
@@ -277,6 +277,7 @@ spec:
- nova-compute-extraconfig
- ovn
- neutron-metadata
- telemetry
env:
- name: ANSIBLE_CALLBACKS_ENABLED
value: "profile_tasks"
2 changes: 2 additions & 0 deletions docs_user/modules/openstack-heat_adoption.adoc
@@ -78,6 +78,8 @@ spec:
authEncryptionKey: HeatAuthEncryptionKey
database: HeatDatabasePassword
service: HeatPassword
rabbitMqClusterName: rabbitmq
serviceUser: heat
'
----

13 changes: 12 additions & 1 deletion docs_user/modules/openstack-stop_openstack_services.adoc
@@ -98,7 +98,14 @@ environmental variables and function:
----
# Update the services list to be stopped
ServicesToStop=("tripleo_horizon.service"
ServicesToStop=("tripleo_aodh_api.service"
"tripleo_aodh_api_cron.service"
"tripleo_aodh_evaluator.service"
"tripleo_aodh_listener.service"
"tripleo_aodh_notifier.service"
"tripleo_ceilometer_agent_central.service"
"tripleo_ceilometer_agent_notification.service"
"tripleo_horizon.service"
"tripleo_keystone.service"
"tripleo_barbican_api.service"
"tripleo_barbican_worker.service"
@@ -108,7 +115,11 @@ ServicesToStop=("tripleo_horizon.service"
"tripleo_cinder_scheduler.service"
"tripleo_cinder_volume.service"
"tripleo_cinder_backup.service"
"tripleo_collectd.service"
"tripleo_glance_api.service"
"tripleo_gnocchi_api.service"
"tripleo_gnocchi_metricd.service"
"tripleo_gnocchi_statsd.service"
"tripleo_manila_api.service"
"tripleo_manila_api_cron.service"
"tripleo_manila_scheduler.service"
5 changes: 4 additions & 1 deletion docs_user/modules/openstack-stop_remaining_services.adoc
@@ -64,7 +64,10 @@ ComputeServicesToStop=(
"tripleo_nova_virtproxyd.service"
"tripleo_nova_virtqemud.service"
"tripleo_nova_virtsecretd.service"
"tripleo_nova_virtstoraged.service")
"tripleo_nova_virtstoraged.service"
"tripleo_ceilometer_agent_compute.service"
"tripleo_ceilometer_agent_ipmi.service"
"tripleo_collectd.service")
PacemakerResourcesToStop=(
"galera-bundle"
78 changes: 29 additions & 49 deletions docs_user/modules/openstack-telemetry_adoption.adoc
@@ -19,61 +19,44 @@ This guide also assumes that:

== Procedure - Telemetry adoption

Patch OpenStackControlPlane to deploy Ceilometer services:
Create OpenStackControlPlane to deploy Ceilometer services:

// TODO(jistr): There are still some quay.io images in the downstream build.

----
cat << EOF > ceilometer_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
ceilometer:
telemetry:
enabled: true
template:
ceilometer:
ifeval::["{build}" != "downstream"]
centralImage: quay.io/podified-antelope-centos9/openstack-ceilometer-central:current-podified
computeImage: quay.io/podified-antelope-centos9/openstack-ceilometer-compute:current-podified
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: quay.io/podified-antelope-centos9/openstack-ceilometer-ipmi:current-podified
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: quay.io/podified-antelope-centos9/openstack-ceilometer-notification:current-podified
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
centralImage: quay.io/podified-antelope-centos9/openstack-ceilometer-central:current-podified
computeImage: quay.io/podified-antelope-centos9/openstack-ceilometer-compute:current-podified
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: quay.io/podified-antelope-centos9/openstack-ceilometer-ipmi:current-podified
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: quay.io/podified-antelope-centos9/openstack-ceilometer-notification:current-podified
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
endif::[]
ifeval::["{build}" == "downstream"]
centralImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-central-rhel9:18.0
computeImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-compute-rhel9:18.0
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-ipmi-rhel9:18.0
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-notification-rhel9:18.0
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
centralImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-central-rhel9:18.0
computeImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-compute-rhel9:18.0
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-ipmi-rhel9:18.0
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-notification-rhel9:18.0
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
endif::[]
EOF
----

____
If you have previously backed up your OpenStack services configuration file from the old environment, you can use os-diff to compare and make sure the configuration is correct. For more information, see xref:pulling-the-openstack-configuration_{context}[Pulling the OpenStack configuration].
____

----
pushd os-diff
./os-diff cdiff --service ceilometer -c /tmp/collect_tripleo_configs/ceilometer/etc/ceilometer/ceilometer.conf -o ceilometer_patch.yaml
----

____
This will produce the difference between both ini configuration files.
____

Patch OpenStackControlPlane to deploy Ceilometer services:

----
oc patch openstackcontrolplane openstack --type=merge --patch-file ceilometer_patch.yaml
----

== Post-checks

@@ -84,19 +67,13 @@ CEILOMETETR_POD=`oc get pods -l service=ceilometer | tail -n 1 | cut -f 1 -d' '`
oc exec -t $CEILOMETETR_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
----

=== Inspect the resulting Ceilometer IPMI agent pod on Data Plane nodes

----
podman ps | grep ceilometer-ipmi
----

=== Inspecting enabled pollsters

----
oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml']}" | base64 -d
----

=== Enabling pollsters according to requirements
=== Create polling.yaml with required pollsters

----
cat << EOF > polling.yaml
@@ -108,8 +85,11 @@ sources:
- volume.size
- image.size
- cpu
- memory
- memory.usage
EOF
----

=== Update ceilometer configuration with new pollsters
----
oc patch secret ceilometer-config-data --patch="{\"data\": { \"polling.yaml\": \"$(base64 -w0 polling.yaml)\"}}"
----
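
Values under `data` in a Kubernetes Secret are base64-encoded, which is why the patch embeds the output of `base64 -w0` instead of the raw file. As a sketch, the payload can be built and inspected before it is sent to the cluster. The `build_polling_patch` function name is hypothetical, and `-w0` assumes GNU coreutils `base64`, as in the command above.

```shell
# Hypothetical helper (not part of the guide): print the JSON merge patch
# that stores the given file under the "polling.yaml" key of a Secret.
# $1: path to the polling.yaml file to embed.
build_polling_patch() {
  # -w0 disables line wrapping so the encoded value stays on one line.
  printf '{"data": {"polling.yaml": "%s"}}' "$(base64 -w0 "$1")"
}

# Example usage (assumes a logged-in oc session and a polling.yaml in the
# current directory):
# build_polling_patch polling.yaml        # inspect the payload first
# oc patch secret ceilometer-config-data --patch="$(build_polling_patch polling.yaml)"
```

Printing the payload first makes it easy to decode the value again with `base64 -d` and confirm it matches the intended pollster list.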
2 changes: 2 additions & 0 deletions tests/roles/autoscaling_adoption/meta/main.yaml
@@ -0,0 +1,2 @@
dependencies:
- role: common_defaults