Update telemetry adoption guide
yadneshk committed Feb 27, 2024
1 parent ea0558f commit 57c9890
Showing 18 changed files with 277 additions and 111 deletions.
105 changes: 63 additions & 42 deletions docs_user/modules/openstack-autoscaling_adoption.adoc
@@ -20,62 +20,71 @@ should be already adopted.

== Procedure - Autoscaling adoption

Patch OpenStackControlPlane to deploy autoscaling services:
=== Patch OpenStackControlPlane to deploy autoscaling services:

----
cat << EOF > aodh_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
autoscaling:
telemetry:
enabled: true
prometheus:
deployPrometheus: false
aodh:
customServiceConfig: |
[DEFAULT]
debug=true
secret: osp-secret
ifeval::["{build}" == "upstream"]
apiImage: "quay.io/podified-antelope-centos9/openstack-aodh-api:current-podified"
evaluatorImage: "quay.io/podified-antelope-centos9/openstack-aodh-evaluator:current-podified"
notifierImage: "quay.io/podified-antelope-centos9/openstack-aodh-notifier:current-podified"
listenerImage: "quay.io/podified-antelope-centos9/openstack-aodh-listener:current-podified"
endif::[]
ifeval::["{build}" == "downstream"]
apiImage: "registry.redhat.io/rhosp-dev-preview/openstack-aodh-api-rhel9:18.0"
evaluatorImage: "registry.redhat.io/rhosp-dev-preview/openstack-aodh-evaluator-rhel9:18.0"
notifierImage: "registry.redhat.io/rhosp-dev-preview/openstack-aodh-notifier-rhel9:18.0"
listenerImage: "registry.redhat.io/rhosp-dev-preview/openstack-aodh-listener-rhel9:18.0"
endif::[]
passwordSelectors:
databaseUser: aodh
databaseInstance: openstack
memcachedInstance: memcached
EOF
template:
autoscaling:
enabled: true
heatInstance: heat
aodh:
customServiceConfig: |
[DEFAULT]
debug=true
passwordSelector:
aodhService: AodhPassword
database: AodhDatabasePassword
service: CeilometerPassword
secret: osp-secret
databaseInstance: openstack
memcachedInstance: memcached
databaseUser: aodh
'
----


____
If you have previously backed up your OpenStack services configuration file from the old environment, you can use os-diff to compare and make sure the configuration is correct. For more information, see xref:pulling-the-openstack-configuration_{context}[Pulling the OpenStack configuration].
____
=== Install cluster-observability-operator

----
pushd os-diff
./os-diff cdiff --service aodh -c /tmp/collect_tripleo_configs/aodh/etc/aodh/aodh.conf -o aodh_patch.yaml
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-observability-operator
  namespace: openshift-operators
spec:
  channel: development
  installPlanApproval: Automatic
  name: cluster-observability-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
----

____
This will produce the difference between both ini configuration files.
____
=== Wait for the installation to succeed

Patch OpenStackControlPlane to deploy Aodh services:
----
oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators
----
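
The `oc wait` command gives up once its timeout expires, and it may fail immediately if the ClusterServiceVersion has not been created yet. One way to make the step more tolerant is to retry it. The following is a minimal sketch, assuming a POSIX shell; `retry` is a hypothetical helper and not part of `oc`.

```shell
# retry MAX DELAY COMMAND...
# Hypothetical helper: run COMMAND until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Returns 1 if every attempt fails.
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while [ "$retry_n" -le "$retry_max" ]; do
    if "$@"; then
      return 0
    fi
    retry_n=$((retry_n + 1))
    # Only sleep when another attempt will follow.
    if [ "$retry_n" -le "$retry_max" ]; then
      sleep "$retry_delay"
    fi
  done
  return 1
}

# Example (needs a cluster, so it is left commented out):
# retry 30 10 oc wait --for jsonpath="{.status.phase}"=Succeeded csv \
#   --namespace=openshift-operators \
#   -l operators.coreos.com/cluster-observability-operator.openshift-operators \
#   --timeout=60s
```

The attempt count and delay in the example are illustrative values, not recommendations from this guide.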

=== Enable metrics storage backend
----
oc patch openstackcontrolplane openstack --type=merge --patch-file aodh_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    enabled: true
    template:
      metricStorage:
        enabled: true
'
----

== Post-checks

=== If autoscaling services are enabled inspect Aodh pods
=== Verify Aodh pods and service endpoints

----
AODH_POD=`oc get pods -l service=aodh | tail -n 1 | cut -f 1 -d' '`
@@ -86,9 +95,8 @@ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.conf

----
openstack endpoint list | grep aodh
| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | aodh | network | True | public | http://aodh-public-openstack.apps-crc.testing |
| b943243e596847a9a317c8ce1800fa98 | regionOne | aodh | network | True | internal | http://aodh-internal.openstack.svc:9696 |
| f97f2b8f7559476bb7a5eafe3d33cee7 | regionOne | aodh | network | True | admin | http://192.168.122.99:9696 |
| d05d120153cd4f9b8310ac396b572926 | regionOne | aodh | alarming | True | internal | http://aodh-internal.openstack.svc:8042 |
| d6daee0183494d7a9a5faee681c79046 | regionOne | aodh | alarming | True | public | http://aodh-public.openstack.svc:8042 |
----
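
The endpoint check can also be scripted rather than read by eye. The following is a minimal sketch, assuming the table layout shown in the sample output (ID, region, service name, service type, enabled, interface, URL); `has_endpoint` is a hypothetical helper name.

```shell
# has_endpoint SERVICE INTERFACE
# Hypothetical helper: reads an `openstack endpoint list` table on stdin and
# succeeds if a row exists for SERVICE with the given INTERFACE.
# Assumes the column order shown in the sample output above.
has_endpoint() {
  awk -F'|' -v svc="$1" -v iface="$2" '
    {
      name = $4   # service name column
      intf = $7   # interface column
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", intf)
      if (name == svc && intf == iface) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}

# Example (needs a configured OpenStack client):
# openstack endpoint list | has_endpoint aodh internal && echo "internal endpoint present"
```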

=== Create sample resources
@@ -111,6 +119,19 @@ openstack alarm create \
--resource-type instance
----

=== Verify metric storage pods are deployed

----
oc get pods -l alertmanager=metric-storage
NAME READY STATUS RESTARTS AGE
alertmanager-metric-storage-0 2/2 Running 0 17h
alertmanager-metric-storage-1 2/2 Running 0 17h
oc get pods -l prometheus=metric-storage
NAME READY STATUS RESTARTS AGE
prometheus-metric-storage-0 3/3 Running 0 17h
----
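
To turn this check into a pass/fail step, the `oc get pods` output can be parsed. The following is a minimal sketch, assuming the default column layout shown above (NAME, READY, STATUS, ...); `all_pods_ready` is a hypothetical helper name.

```shell
# all_pods_ready
# Hypothetical helper: reads `oc get pods` output on stdin and succeeds only
# if at least one pod is listed and every pod is Running with all of its
# containers ready (READY column of the form n/n).
all_pods_ready() {
  awk '
    NR == 1 && $1 == "NAME" { next }   # skip the header row
    {
      rows++
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") bad++
    }
    END { exit((rows > 0 && bad == 0) ? 0 : 1) }
  '
}

# Example (needs a cluster):
# oc get pods -l prometheus=metric-storage | all_pods_ready && echo "metric storage is up"
```

`oc wait --for=condition=Ready pod -l prometheus=metric-storage` is an alternative that avoids parsing text output.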

//=== (TODO)

//* Include adopted autoscaling heat templates
22 changes: 16 additions & 6 deletions docs_user/modules/openstack-backend_services_deployment.adoc
@@ -262,14 +262,24 @@ spec:
metallb.universe.tf/loadBalancerIPs: 172.17.0.86
spec:
type: LoadBalancer
ceilometer:
telemetry:
enabled: false
template: {}
template:
ceilometer:
enabled: false
template: {}
autoscaling:
enabled: false
template: {}
autoscaling:
enabled: false
template: {}
metricStorage:
enabled: false
template: {}
logging:
enabled: false
template: {}
EOF
----

1 change: 1 addition & 0 deletions docs_user/modules/openstack-edpm_adoption.adoc
@@ -262,6 +262,7 @@ spec:
- nova-compute-extraconfig
- ovn
- neutron-metadata
- telemetry
env:
- name: ANSIBLE_CALLBACKS_ENABLED
value: "profile_tasks"
2 changes: 2 additions & 0 deletions docs_user/modules/openstack-heat_adoption.adoc
@@ -77,6 +77,8 @@ spec:
authEncryptionKey: HeatAuthEncryptionKey
database: HeatDatabasePassword
service: HeatPassword
rabbitMqClusterName: rabbitmq
serviceUser: heat
'
----

2 changes: 1 addition & 1 deletion docs_user/modules/openstack-keystone_adoption.adoc
@@ -79,7 +79,7 @@ control plane (everything except Keystone service and endpoints):
----
openstack endpoint list | grep keystone | awk '/admin/{ print $2; }' | xargs ${BASH_ALIASES[openstack]} endpoint delete || true
for service in aodh cinderv3 glance manila manilav2 neutron nova placement swift; do
for service in aodh cinderv3 glance gnocchi manila manilav2 neutron nova placement swift; do
openstack service list | awk "/ $service /{ print \$2; }" | xargs ${BASH_ALIASES[openstack]} service delete || true
done
----
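
Because the loop deletes services, it can be useful to preview which IDs it would select first. The following is a minimal dry-run sketch, assuming the default `openstack service list` table layout (ID, Name, Type); `service_ids` is a hypothetical helper. Unlike the pattern match in the loop, it compares the Name column exactly.

```shell
# service_ids NAME
# Hypothetical helper: reads an `openstack service list` table on stdin and
# prints the ID of every row whose Name column equals NAME.
# Assumes the default column order: ID, Name, Type.
service_ids() {
  awk -F'|' -v svc="$1" '
    {
      name = $3
      gsub(/^[ \t]+|[ \t]+$/, "", name)
    }
    name == svc {
      id = $2
      gsub(/[ \t]/, "", id)
      print id
    }
  '
}

# Example dry run (needs a configured OpenStack client):
# for service in aodh cinderv3 glance gnocchi; do
#   openstack service list | service_ids "$service"
# done
```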
22 changes: 12 additions & 10 deletions docs_user/modules/openstack-stop_openstack_services.adoc
@@ -67,13 +67,24 @@ These steps can be automated with a simple script that relies on the previously
----
# Update the services list to be stopped
ServicesToStop=("tripleo_horizon.service"
ServicesToStop=("tripleo_aodh_api.service"
"tripleo_aodh_api_cron.service"
"tripleo_aodh_evaluator.service"
"tripleo_aodh_listener.service"
"tripleo_aodh_notifier.service"
"tripleo_ceilometer_agent_central.service"
"tripleo_ceilometer_agent_notification.service"
"tripleo_horizon.service"
"tripleo_keystone.service"
"tripleo_cinder_api.service"
"tripleo_cinder_api_cron.service"
"tripleo_cinder_scheduler.service"
"tripleo_cinder_backup.service"
"tripleo_collectd.service"
"tripleo_glance_api.service"
"tripleo_gnocchi_api.service"
"tripleo_gnocchi_metricd.service"
"tripleo_gnocchi_statsd.service"
"tripleo_manila_api.service"
"tripleo_manila_api_cron.service"
"tripleo_manila_scheduler.service"
@@ -86,15 +97,6 @@ ServicesToStop=("tripleo_horizon.service"
"tripleo_nova_metadata.service"
"tripleo_nova_scheduler.service"
"tripleo_nova_vnc_proxy.service"
"tripleo_aodh_api.service"
"tripleo_aodh_api_cron.service"
"tripleo_aodh_evaluator.service"
"tripleo_aodh_listener.service"
"tripleo_aodh_notifier.service"
"tripleo_ceilometer_agent_central.service"
"tripleo_ceilometer_agent_compute.service"
"tripleo_ceilometer_agent_ipmi.service"
"tripleo_ceilometer_agent_notification.service"
"tripleo_ovn_cluster_northd.service")
PacemakerResourcesToStop=("openstack-cinder-volume"
5 changes: 4 additions & 1 deletion docs_user/modules/openstack-stop_remaining_services.adoc
@@ -58,7 +58,10 @@ ComputeServicesToStop=(
"tripleo_nova_virtproxyd.service"
"tripleo_nova_virtqemud.service"
"tripleo_nova_virtsecretd.service"
"tripleo_nova_virtstoraged.service")
"tripleo_nova_virtstoraged.service"
"tripleo_ceilometer_agent_compute.service"
"tripleo_ceilometer_agent_ipmi.service"
"tripleo_collectd.service")
PacemakerResourcesToStop=(
"galera-bundle"
72 changes: 23 additions & 49 deletions docs_user/modules/openstack-telemetry_adoption.adoc
@@ -19,60 +19,37 @@ This guide also assumes that:

== Procedure - Telemetry adoption

Patch OpenStackControlPlane to deploy Ceilometer services:
Create OpenStackControlPlane to deploy Ceilometer services:

// TODO(jistr): There are still some quay.io images in the downstream build.

----
cat << EOF > ceilometer_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
ceilometer:
telemetry:
enabled: true
template:
ifeval::["{build}" == "upstream"]
centralImage: quay.io/podified-antelope-centos9/openstack-ceilometer-central:current-podified
computeImage: quay.io/podified-antelope-centos9/openstack-ceilometer-compute:current-podified
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: quay.io/podified-antelope-centos9/openstack-ceilometer-ipmi:current-podified
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: quay.io/podified-antelope-centos9/openstack-ceilometer-notification:current-podified
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
endif::[]
ifeval::["{build}" == "downstream"]
centralImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-central-rhel9:18.0
computeImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-compute-rhel9:18.0
customServiceConfig: |
[DEFAULT]
debug=true
ipmiImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-ipmi-rhel9:18.0
nodeExporterImage: quay.io/prometheus/node-exporter:v1.5.0
notificationImage: registry.redhat.io/rhosp-dev-preview/openstack-ceilometer-notification-rhel9:18.0
secret: osp-secret
sgCoreImage: quay.io/infrawatch/sg-core:v5.1.1
endif::[]
EOF
ceilometer:
enabled: true
customServiceConfig: |
[DEFAULT]
debug=true
secret: osp-secret
'
----

____
If you have previously backed up your OpenStack services configuration file from the old environment, you can use os-diff to compare and make sure the configuration is correct. For more information, see xref:pulling-the-openstack-configuration_{context}[Pulling the OpenStack configuration].
____

----
pushd os-diff
./os-diff cdiff --service ceilometer -c /tmp/collect_tripleo_configs/ceilometer/etc/ceilometer/ceilometer.conf -o ceilometer_patch.yaml
----

____
This will produce the difference between both ini configuration files.
____

Patch OpenStackControlPlane to deploy Ceilometer services:
Enable Logging

----
oc patch openstackcontrolplane openstack --type=merge --patch-file ceilometer_patch.yaml
oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    enabled: true
    template:
      logging:
        enabled: true
'
----

== Post-checks
@@ -84,19 +61,13 @@ CEILOMETETR_POD=`oc get pods -l service=ceilometer | tail -n 1 | cut -f 1 -d' '`
oc exec -t $CEILOMETETR_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
----
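
Selecting the pod with `tail` and `cut` depends on the human-readable table format. A minimal sketch of an alternative that asks the API for the name directly through `-o jsonpath`; `first_pod` is a hypothetical helper name, and the command fails if no pod matches the label.

```shell
# first_pod LABEL_SELECTOR
# Hypothetical helper: prints the name of the first pod matching the label
# selector, using jsonpath output instead of parsing the table.
first_pod() {
  oc get pods -l "$1" -o jsonpath='{.items[0].metadata.name}'
}

# Example (needs a cluster):
# pod=$(first_pod service=ceilometer)
# oc exec -t "$pod" -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
```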

=== Inspect the resulting Ceilometer IPMI agent pod on Data Plane nodes

----
podman ps | grep ceilometer-ipmi
----

=== Inspecting enabled pollsters

----
oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml']}" | base64 -d
----

=== Enabling pollsters according to requirements
=== Create polling.yaml with required pollsters

----
cat << EOF > polling.yaml
@@ -110,6 +81,9 @@ sources:
- cpu
- memory
EOF
----

=== Update ceilometer configuration with new pollsters
----
oc patch secret ceilometer-config-data --patch="{\"data\": { \"polling.yaml\": \"$(base64 -w0 polling.yaml)\"}}"
----
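
The patch payload is a JSON document whose `polling.yaml` value is the base64-encoded file. The following is a minimal sketch that builds the payload separately so it can be inspected before it is applied; `polling_patch` is a hypothetical helper, and it strips newlines with `tr` in place of `base64 -w0`, which is specific to GNU coreutils.

```shell
# polling_patch FILE
# Hypothetical helper: prints a JSON merge patch that sets the "polling.yaml"
# key of a Secret to the base64-encoded contents of FILE.
polling_patch() {
  printf '{"data": {"polling.yaml": "%s"}}' "$(base64 < "$1" | tr -d '\n')"
}

# Example (needs a cluster):
# polling_patch polling.yaml            # inspect the payload first
# oc patch secret ceilometer-config-data --patch="$(polling_patch polling.yaml)"
```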
2 changes: 2 additions & 0 deletions tests/roles/autoscaling_adoption/meta/main.yaml
@@ -0,0 +1,2 @@
dependencies:
- role: common_defaults