Production Kafka Deployment Using Ansible
General Notes
`LOG_LEVEL` values can be found at https://docs.python.org/3/library/logging.html#logging-levels.
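As an illustration, here is a minimal sketch of how a Python tool might honour `LOG_LEVEL`; reading it from the environment and the log format shown are assumptions for this sketch, not this project's actual code:

```python
import logging
import os

# logging.basicConfig accepts the standard level names (DEBUG, INFO,
# WARNING, ERROR, CRITICAL) as strings; reading LOG_LEVEL from the
# environment is an assumption made for this sketch.
logging.basicConfig(
    level=os.environ.get("LOG_LEVEL", "INFO"),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logging.getLogger(__name__).info("logger configured")
```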
Required Python libraries (a sketch of the storage clients follows this list):
- confluent_kafka
- boto3
- google-cloud-storage
- pendulum
- azure-storage-blob
- minio
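To show what each storage library is for, here is a hedged sketch of the clients they provide; the backend keys, endpoints, and credentials below are all illustrative placeholders, not values used by this project (`confluent_kafka` is the Kafka client itself and `pendulum` handles timestamps, so they are omitted here):

```python
import boto3                                      # AWS S3
from azure.storage.blob import BlobServiceClient  # Azure Blob Storage
from google.cloud import storage                  # Google Cloud Storage
from minio import Minio                           # MinIO (S3-compatible)

# Lazily-constructed clients keyed by backend name; this mapping and
# every connection detail below are illustrative placeholders.
CLOUD_CLIENTS = {
    "s3": lambda: boto3.client("s3"),
    "azure": lambda: BlobServiceClient.from_connection_string(
        "<AZURE_STORAGE_CONNECTION_STRING>"),
    "gcs": lambda: storage.Client(),
    "minio": lambda: Minio("minio.example.com:9000",
                           access_key="<ACCESS_KEY>",
                           secret_key="<SECRET_KEY>"),
}
```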
Backup
- It takes a backup of the given topic and stores it in the local filesystem, S3, or Azure.
- It will automatically resume from the point where it stopped, provided the same consumer group name is used before and after the crash.
- It will upload the `current.bin` file to S3; this file contains messages up to `NUMBER_OF_MESSAGE_PER_BACKUP_FILE`, but it is only uploaded along with the other backup files.
- `RETRY_UPLOAD_SECONDS` controls how often uploads to cloud storage are attempted.
- `NUMBER_OF_KAFKA_THREADS` is used to parallelise reading from the Kafka topic. It should not be more than the number of partitions.
- `NUMBER_OF_MESSAGE_PER_BACKUP_FILE` sets how many messages go into each backup file; the tool tries to keep this consistent, but if the application is restarted the first backup file may vary. See the sketch after this list.
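Putting these settings together, here is a minimal, illustrative backup loop built on `confluent_kafka`. The broker address, topic, group name, backup directory, and per-line JSON record format are assumptions for this sketch, and the `NUMBER_OF_KAFKA_THREADS` parallelism is omitted for brevity:

```python
import json
import os

from confluent_kafka import Consumer

TOPIC = "example-topic"                   # hypothetical topic name
GROUP_ID = "backup-group"                 # same name => resume after a crash
NUMBER_OF_MESSAGE_PER_BACKUP_FILE = 1000
BACKUP_DIR = "/var/backups/kafka"         # hypothetical backup directory

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": GROUP_ID,                 # committed offsets give auto-resume
    "auto.offset.reset": "earliest",
})
consumer.subscribe([TOPIC])

count = 0
current = open(os.path.join(BACKUP_DIR, "current.bin"), "wb")
try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error() or msg.value() is None:
            continue
        # One JSON record per line; the partition is kept so a restore
        # with the "same" strategy can replay into the original partition.
        record = {"partition": msg.partition(), "value": msg.value().decode()}
        current.write((json.dumps(record) + "\n").encode())
        count += 1
        if count >= NUMBER_OF_MESSAGE_PER_BACKUP_FILE:
            # Roll current.bin over into a finished backup file.
            current.close()
            os.rename(os.path.join(BACKUP_DIR, "current.bin"),
                      os.path.join(BACKUP_DIR, f"backup-{msg.offset()}.bin"))
            current = open(os.path.join(BACKUP_DIR, "current.bin"), "wb")
            count = 0
finally:
    current.close()
    consumer.close()
```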
Restore
- It will restore from the backup directory into the given topic.
- `RETRY_SECONDS` controls when to re-read `FILESYSTEM_BACKUP_DIR` for new files.
- `RESTORE_PARTITION_STRATEGY` controls which partition messages are restored to: with `same`, each message is restored to the same partition of the topic it came from; with `random`, messages are restored to all partitions randomly. See the sketch after this list.
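The two `RESTORE_PARTITION_STRATEGY` modes can be illustrated with a short `confluent_kafka` producer sketch; the broker, topic, backup file path, and per-line JSON format are assumptions carried over from the backup sketch above:

```python
import json
import random

from confluent_kafka import Producer

TOPIC = "example-topic"                  # hypothetical topic name
RESTORE_PARTITION_STRATEGY = "same"      # or "random"

producer = Producer({"bootstrap.servers": "localhost:9092"})
# Discover the topic's partition ids so "random" can pick among them.
partitions = list(
    producer.list_topics(TOPIC, timeout=10).topics[TOPIC].partitions)

with open("/var/backups/kafka/backup-1000.bin") as backup:  # hypothetical file
    for line in backup:
        record = json.loads(line)
        if RESTORE_PARTITION_STRATEGY == "same":
            # Replay into the partition the message was originally read from.
            producer.produce(TOPIC, record["value"].encode(),
                             partition=record["partition"])
        else:
            # "random": spread messages across all partitions.
            producer.produce(TOPIC, record["value"].encode(),
                             partition=random.choice(partitions))
producer.flush()
```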
Known Issues
- NA