diff --git a/app/alarm/Readme.md b/app/alarm/Readme.md index 012bb8c54b..4a686885c5 100644 --- a/app/alarm/Readme.md +++ b/app/alarm/Readme.md @@ -3,10 +3,10 @@ Alarm System Update of the alarm system that originally used RDB for configuration, JMS for updates, RDB for persistence of most recent state. - + This development uses Kafka to handle both, using "Compacted Topics". For an "Accelerator" configuration, a topic of that name holds the configuration and state changes. -When clients subscribe, they receive the most recent configuration and state, and from then on updates. +When clients subscribe, they receive the most recent configuration and state, and from then on updates. Kafka Installation @@ -23,12 +23,12 @@ kafka in `/opt/kafka`. # The 'examples' folder of this project contains some example scripts # that can be used with a kafka server in the same directory cd examples - + # Use wget, 'curl -O', or web browser wget http://ftp.wayne.edu/apache/kafka/2.3.0/kafka_2.12-2.3.0.tgz tar -vzxf kafka_2.12-2.3.0.tgz ln -s kafka_2.12-2.3.0 kafka - + Check `config/zookeeper.properties` and `config/server.properties`. By default these contain settings for keeping data in `/tmp/`, which works for initial tests, but risks that Linux will delete the data. @@ -37,9 +37,9 @@ For a production setup, change `zookeeper.properties`: # Suggest to change this to a location outside of /tmp, # for example /var/zookeeper-logs or /home/controls/zookeeper-logs dataDir=/tmp/zookeeper - + Similarly, change the directory setting in `server.properties` - + # Suggest to change this to a location outside of /tmp, # for example /var/kafka-logs or /home/controls/kafka-logs log.dirs=/tmp/kafka-logs @@ -85,7 +85,7 @@ for initial tests: sh start_kafka.sh # If kafka is started first, with the default zookeeper.connection.timeout of only 6 seconds, - # it will fail to start and close with a null pointer exception. + # it will fail to start and close with a null pointer exception. 
# Simply start kafka after zookeeper is running to recover. @@ -104,7 +104,7 @@ for running Zookeeper, Kafka and the alarm server as Linux services: sudo systemctl enable kafka.service sudo systemctl enable alarm_server.service - + Kafka Demo ---------- @@ -141,10 +141,10 @@ but simply meant to learn about Kafka or to test connectivity. Stop local instance: # Either in the kafka terminal, then in the zookeeper terminal - + # Or: sh stop_all.sh - + For more, see https://kafka.apache.org/documentation.html @@ -160,7 +160,7 @@ It will create these topics: * "Accelerator": Alarm configuration and state (compacted) * "AcceleratorCommand": Commands like "acknowledge" from UI to the alarm server (deleted) * "AcceleratorTalk": Annunciations (deleted) - + The command messages are unidirectional from the alarm UI to the alarm server. The talk messages are unidirectional from the alarm server to the alarm annunciator. Both command and talk topics are configured to delete older messages, because only new messages are relevant. @@ -183,8 +183,8 @@ More on this in http://www.shayne.me/blog/2015/2015-06-25-everything-about-kafka You can track the log cleaner runs via tail -f logs/log-cleaner.log - - + + Start Alarm Server ------------------ @@ -226,8 +226,8 @@ The messages in the config topic consist of a path to the alarm tree item that i Example key: config:/Accelerator/Vacuum/SomePV - -The message always contains the user name and host name of who is changing the configuration. + +The message always contains the user name and host name of who is changing the configuration. The full config topic JSON format for an alarm tree leaf: @@ -270,7 +270,7 @@ Deleting an item consists of marking a path with a value of null. This "tombston For example: config:/path/to/pv : null - + This process variable is now marked as deleted. However, there is an issue. We do not know why or by whom it was deleted.
To address this, a message including the missing relevant information is sent before the tombstone is set. This message consists of a user name, host name, and a delete message. The delete message may offer details on why the item was deleted. @@ -282,12 +282,12 @@ The config delete message JSON format: "host": String, "delete": String } - + The above example of deleting a PV would then look like this: config:/path/to/pv : {"user":"user name", "host":"host name", "delete": "Deleting"} config:/path/to/pv : null - + The message about who deleted the PV would obviously be compacted and deleted itself, but it would be aggregated into the long-term topic beforehand, thus preserving a record of the deletion. ______________ - Type `state:`, State Topic: @@ -317,7 +317,7 @@ The state topic JSON format for an alarm tree node: "mode": String, } -At minimum, state updates this always contain a "severity". +At minimum, state updates always contain a "severity". The "latch" entry will only be present when an alarm that is configured to latch is actually latching, i.e. entering an alarm severity @@ -334,7 +334,7 @@ Example messages that could appear in a state topic: In this example, the first message is issued when the alarm latches to the MAJOR severity. The following update indicates that the PV's current severity dropped to MINOR, while the alarm severity, message, time and value continue to reflect the latched state. - + ________________ - Type `command:`, Command Topic: @@ -347,7 +347,7 @@ The command topic JSON format: "host": String, "command": String } - + An example message that could appear in a command topic: command:/path/to/pv : {"user":"user name", "host":"host name", "command":"acknowledge"} @@ -406,6 +406,150 @@ it can lock the UI while the internal TreeView code gets to traverse all 'siblin This has been observed if there are 10000 or more siblings, i.e. direct child nodes to one node of the alarm tree. It can be avoided by, for example, adding sub-nodes.
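The tombstone handling described in the config topic section follows directly from compacted-topic semantics: compaction retains only the most recent message per key, and a key whose most recent value is null eventually disappears entirely. A minimal sketch of that behavior in plain Python (`compact` is a hypothetical helper for illustration, not part of any Kafka API; the first config payload is made up):

```python
def compact(messages):
    """Simulate Kafka log compaction: keep only the most recent value
    per key, then drop keys whose latest value is None (a tombstone)."""
    latest = {}
    for key, value in messages:  # later messages overwrite earlier ones
        latest[key] = value
    return {k: v for k, v in latest.items() if v is not None}

# The delete sequence from the config topic example above:
log = [
    ("config:/path/to/pv", '{"description": "some PV"}'),  # hypothetical earlier config
    ("config:/path/to/pv", '{"user":"user name", "host":"host name", "delete": "Deleting"}'),
    ("config:/path/to/pv", None),  # tombstone
]
print(compact(log))  # {} - after compaction the PV is gone entirely
```

This is also why the "who deleted it" message must be aggregated into the long-term topic before compaction runs: afterwards, neither it nor the tombstoned key survives in the compacted topic.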
+Encryption, Authentication and Authorization
+--------------------------------------------
+
+The default setup as described so far connects to Kafka without encryption or authentication.
+While this may be acceptable for a closed control system network, you can enable encryption,
+authentication and authorization for extended security.
+Kafka allows many authentication schemes. The following outlines the setup for SSL encryption with
+either two-way TLS authentication or user/password (a.k.a. SASL PLAIN).
+
+### Prerequisites
+
+To enable SSL encryption, at least the Kafka server requires an SSL certificate.
+You can create your own self-signed root CA to sign these certificates.
+Then add this root CA to a truststore, create a certificate for the server, sign it,
+and add it to a keystore.
+Confluent provides good [step-by-step documentation](https://docs.confluent.io/platform/current/security/security_tutorial.html#creating-ssl-keys-and-certificates).
+Here is a short version.
+
+Create the root CA:
+```
+openssl req -new -x509 -keyout rootCAKey.pem -out rootCACert.pem -days 365
+```
+
+Add it to a truststore:
+```
+keytool -keystore kafka.truststore.jks -alias CARoot -importcert -file rootCACert.pem
+```
+
+Create a certificate for the server (when asked for your first and last name, enter the FQDN) and export the certificate signing request:
+```
+keytool -keystore kafka.server.keystore.jks -alias localhost -keyalg RSA -genkey
+keytool -keystore kafka.server.keystore.jks -alias localhost -certreq -file server.csr
+```
+Sign the CSR:
+```
+openssl x509 -req -CA rootCACert.pem -CAkey rootCAKey.pem -in server.csr -out serverCert.pem -days 365 -CAcreateserial
+```
+
+Import the signed certificate and the root CA into the keystore:
+```
+keytool -keystore kafka.server.keystore.jks -alias localhost -importcert -file serverCert.pem
+keytool -keystore kafka.server.keystore.jks -alias CARoot -importcert -file rootCACert.pem
+```
+
+If you want two-way TLS authentication, repeat the certificate creation for
the clients
+so that you also have a `kafka.client.keystore.jks` file.
+
+
+### Configure Kafka
+
+In `/opt/kafka/config/server.properties` add an SSL and/or SASL_SSL listener like:
+```
+listeners=PLAINTEXT://:9092,SSL://:9093,SASL_SSL://:9094
+```
+SSL will use SSL encryption and possibly two-way authentication (clients having their own certificates).
+SASL_SSL will use SSL encryption and SASL authentication, which we will configure below for username/password.
+You may also remove the PLAINTEXT listener if you want to disallow unencrypted communication.
+
+In `/opt/kafka/config/server.properties` add the SSL configuration:
+```
+# If you want the brokers to authenticate to each other with SASL, use SASL_SSL here
+security.inter.broker.protocol=SSL
+
+ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
+ssl.truststore.password=
+ssl.keystore.location=/opt/kafka/config/kafka.server.keystore.jks
+ssl.keystore.password=
+ssl.key.password=
+
+# Uncomment if clients must provide certificates (two-way TLS)
+#ssl.client.auth=required
+
+# Below configures SASL authentication, remove if not needed
+sasl.enabled.mechanisms=PLAIN
+sasl.mechanism.inter.broker.protocol=PLAIN
+
+listener.name.sasl_ssl.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
+    username="admin" \
+    password="admin-secret" \
+    user_admin="admin-secret" \
+    user_kafkaclient1="kafkaclient1-secret";
+
+```
+
+Restart Kafka for these settings to take effect.
+
+### Configure CS-Studio UI, Alarm Server, Alarm Logger
+
+Create a `kafka.properties` file with the following content.
+For SSL:
+```
+security.protocol=SSL
+ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
+ssl.truststore.password=
+# Uncomment these for SSL authentication (two-way TLS)
+#ssl.keystore.location=/opt/kafka/config/kafka.client.keystore.jks
+#ssl.keystore.password=
+#ssl.key.password=
+```
+
+For SSL with SASL:
+```
+sasl.mechanism=PLAIN
+security.protocol=SASL_SSL
+
+ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
+ssl.truststore.password=client
+
+sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
+    username="kafkaclient1" \
+    password="kafkaclient1-secret";
+```
+
+Adjust the port of the Kafka server in your Phoebus settings and preferably
+use the FQDN instead of `localhost` for SSL connections. Otherwise, certificate
+validation might fail.
+Edit the preferences to add:
+```
+org.phoebus.applications.alarm/kafka_properties=kafka.properties
+```
+or pass it with `-kafka_properties kafka.properties` to the service.
+
+### Authorization
+
+With authenticated clients, you can then enable authorization for fine-grained control.
+In your Kafka server, add to `/opt/kafka/config/server.properties`:
+
+```
+# Enable the authorizer
+authorizer.class.name=kafka.security.authorizer.AclAuthorizer
+# Default to no restrictions
+allow.everyone.if.no.acl.found=true
+# Set brokers as superusers
+super.users=User:broker.your-accelerator.org,User:admin
+```
+
+Then run, for example:
+```
+./kafka-acls.sh --bootstrap-server broker.your-accelerator.org:9093 --command-config ../config/client.properties --add --allow-principal User:* --operation read --topic Accelerator --topic AcceleratorCommand --topic AcceleratorTalk
+./kafka-acls.sh --bootstrap-server broker.your-accelerator.org:9093 --command-config ../config/client.properties --add --allow-principal User:special-client.your-accelerator.org --operation read --operation write --topic Accelerator --topic AcceleratorCommand --topic AcceleratorTalk
+```
+to allow anybody to see the active alarms, but only the special-client to acknowledge them and to change the configuration.
+The `../config/client.properties` must have credentials to authenticate the client as a super user,
+i.e. admin or broker.your-accelerator.org in this case.
Issues @@ -415,11 +559,11 @@ In earlier versions of Kafka, the log cleaner sometimes failed to compact the lo The file `kafka/logs/log-cleaner.log` would not show any log-cleaner action. The workaround was to stop the alarm server, alarm clients, and Kafka, then restart them. When functional, the file `kafka/logs/log-cleaner.log` shows periodic compaction like this: - + [2018-06-01 15:01:01,652] INFO Starting the log cleaner (kafka.log.LogCleaner) [2018-06-01 15:01:16,697] INFO Cleaner 0: Beginning cleaning of log Accelerator-0. (kafka.log.LogCleaner) ... Start size: 0.1 MB (414 messages) End size: 0.1 MB (380 messages) 8.9% size reduction (8.2% fewer messages) - +
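If you want to monitor compaction from a script rather than by eye, the summary line can be parsed for the size-reduction figure. A small sketch, assuming the log line format shown above (the regular expression is our own, not something Kafka provides):

```python
import re

# Summary line as printed by the Kafka log cleaner (example from above)
line = ("Start size: 0.1 MB (414 messages) End size: 0.1 MB (380 messages) "
        "8.9% size reduction (8.2% fewer messages)")

# Extract the size-reduction percentage; no match would suggest
# the cleaner has not run or the log format differs
match = re.search(r"([\d.]+)% size reduction", line)
if match:
    print(match.group(1))  # 8.9
```

A value that stays at 0% (or no summary lines at all) over a long period would be a hint of the stalled-log-cleaner issue described above.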