
v0.2.0 Sentinel Config ReadOnly #35

Closed
Paic opened this issue Mar 26, 2018 · 7 comments · Fixed by #36

Comments

@Paic
Contributor

Paic commented Mar 26, 2018

I deployed redis-operator v0.2.0 on a brand new GKE cluster, added RBAC permissions (also, is there a list of those permissions?), and the Sentinels seem to crash.

Redis Sentinel logs:

# Sentinel config file /redis/sentinel.conf is not writable: Read-only file system. Exiting...

Am I missing something?

@Paic Paic changed the title Sentinel Config ReadOnly v0.2.0 Sentinel Config ReadOnly Mar 26, 2018
@jchanam
Collaborator

jchanam commented Mar 26, 2018

Hi!

To enable RBAC on the operator, I recommend using the Helm chart. The list of permissions needed is here.
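
For reference, a minimal sketch of what such a ClusterRole could look like. The resource list below is an assumption based on what the operator manages in this thread (pods, services, configmaps, statefulsets, deployments, pod disruption budgets); the authoritative list lives in the Helm chart:

```yaml
# Hypothetical ClusterRole sketch; not the chart's actual manifest.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: redis-operator
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
```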

Apart from that, I'm not able to reproduce your issue. Here is what I've done:

  1. Launch a fresh minikube with RBAC active:
    sudo minikube start --vm-driver=none --extra-config=apiserver.Authorization.Mode=RBAC
  2. Create a clusterRoleBinding to the default service-account on kube-system namespace:
    kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
  3. Install tiller on my cluster:
    helm init
  4. Deploy the redis-operator with the given chart:
    helm install charts/redisoperator/ --name redis-operator --set "rbac.install=true"
  5. Create the example redis-failover:
    kubectl create -f example/redisfailover.yaml
  6. After a couple of minutes, all the pods are ready:
rfr-redisfailover-0                             2/2       Running   0          4m
rfr-redisfailover-1                             2/2       Running   0          4m
rfr-redisfailover-2                             2/2       Running   0          3m
rfs-redisfailover-7d9f479b65-bw27m              1/1       Running   0          4m
rfs-redisfailover-7d9f479b65-ks9kz              1/1       Running   0          4m
rfs-redisfailover-7d9f479b65-pvn6m              1/1       Running   0          4m

If I check the status of the Redis nodes, I get this:

  • Node 0:
# Replication
role:master
connected_slaves:2
slave0:ip=172.17.0.11,port=6379,state=online,offset=12663,lag=0
slave1:ip=172.17.0.10,port=6379,state=online,offset=12663,lag=0
master_repl_offset:12663
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:12662
  • Node 1:
# Replication
role:slave
master_host:172.17.0.6
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:12663
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
  • Node 2:
# Replication
role:slave
master_host:172.17.0.6
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:12798
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

The status of the sentinels is as follows:

  • Node A:
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.17.0.6:6379,slaves=2,sentinels=3
  • Node B:
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.17.0.6:6379,slaves=2,sentinels=3
  • Node C:
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.17.0.6:6379,slaves=2,sentinels=3

Here is an example of a sentinel's log output:

                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.2.11 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

1:X 26 Mar 15:32:11.089 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 26 Mar 15:32:11.095 # Sentinel ID is 5d56f5cc89225e5431a4cac20514e2ce04f4aa0b
1:X 26 Mar 15:32:11.095 # +monitor master mymaster 127.0.0.1 6379 quorum 2
1:X 26 Mar 15:32:12.106 # +sdown master mymaster 127.0.0.1 6379
1:X 26 Mar 15:35:06.719 # -monitor master mymaster 127.0.0.1 6379
1:X 26 Mar 15:35:06.729 # +monitor master mymaster 172.17.0.6 6379 quorum 2
1:X 26 Mar 15:35:06.758 # +reset-master master mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:06.778 * +slave slave 172.17.0.11:6379 172.17.0.11 6379 @ mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:07.344 # +reset-master master mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:08.742 * +sentinel sentinel af9b6b10ed64532d73903c1aed80772b3090d459 172.17.0.8 26379 @ mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:08.763 * +sentinel sentinel 7a0ff75331388455c357baeedc915cc730676a8b 172.17.0.7 26379 @ mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:16.844 * +slave slave 172.17.0.11:6379 172.17.0.11 6379 @ mymaster 172.17.0.6 6379
1:X 26 Mar 15:35:16.849 * +slave slave 172.17.0.10:6379 172.17.0.10 6379 @ mymaster 172.17.0.6 6379

Could you give me more information about how to reproduce the issue you're facing?

Thank you for opening this issue!

@Paic
Contributor Author

Paic commented Mar 27, 2018

Thanks for the quick reply!
I did not even notice a chart was available. Awesome!

After recreating a cluster (GKE - OS: cos - Kubernetes version: 1.9.4-gke.1) and following your steps (installing Helm, deploying the operator with Helm, and using the example failover), I still have the same error on the sentinels:

                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.2.11 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

1:X 27 Mar 07:36:55.265 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 27 Mar 07:36:55.265 # Sentinel config file /redis/sentinel.conf is not writable: Read-only file system. Exiting...
NAME                                            READY     STATUS             RESTARTS   AGE
redis-operator-redisoperator-57d74cd97c-2wvff   1/1       Running            0          5m
rfr-redisfailover-0                             1/1       Running            0          3m
rfr-redisfailover-1                             1/1       Running            0          3m
rfr-redisfailover-2                             1/1       Running            0          3m
rfs-redisfailover-5645bc4c57-hjgst              0/1       CrashLoopBackOff   5          3m
rfs-redisfailover-5645bc4c57-j6ld6              0/1       CrashLoopBackOff   5          3m
rfs-redisfailover-5645bc4c57-sxbsr              0/1       CrashLoopBackOff   5          3m

The redis-operator keeps outputting these:

time="2018-03-27T07:56:47Z" level=info msg="configMap updated" configMap=rfs-redisfailover namespace=default service=k8s.configMap src="configmap.go:58"
time="2018-03-27T07:56:48Z" level=info msg="configMap updated" configMap=rfr-redisfailover namespace=default service=k8s.configMap src="configmap.go:58"
time="2018-03-27T07:56:48Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfr-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:58"
time="2018-03-27T07:56:48Z" level=info msg="statefulSet updated" namespace=default service=k8s.statefulSet src="statefulset.go:77" statefulSet=rfr-redisfailover
time="2018-03-27T07:56:48Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfs-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:58"
time="2018-03-27T07:56:48Z" level=info msg="deployment updated" deployment=rfs-redisfailover namespace=default service=k8s.deployment src="deployment.go:77"
time="2018-03-27T07:56:49Z" level=error msg="Error processing default/redisfailover: dial tcp 10.60.2.13:26379: getsockopt: connection refused" controller=redisfailover operator=redisfailover src="generic.go:158"

However, the Redis nodes seem to be configured properly:

  • Redis 0:
# Replication
role:master
connected_slaves:2
slave0:ip=10.60.1.11,port=6379,state=online,offset=617,lag=0
slave1:ip=10.60.0.14,port=6379,state=online,offset=617,lag=0
master_repl_offset:617
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:616
  • Redis 1:
# Replication
role:slave
master_host:10.60.2.12
master_port:6379
master_link_status:up
master_last_io_seconds_ago:3
master_sync_in_progress:0
slave_repl_offset:813
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
  • Redis 2:
# Replication
role:slave
master_host:10.60.2.12
master_port:6379
master_link_status:up
master_last_io_seconds_ago:5
master_sync_in_progress:0
slave_repl_offset:841
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

@jchanam
Collaborator

jchanam commented Mar 28, 2018

Hi @Paic,

That file comes from a volume mounted from a ConfigMap. I don't know if it works differently on the images that Google provides, which is why I cannot reproduce it.

The mode can be set with this: https://kubernetes.io/docs/concepts/storage/volumes/#example-pod-with-multiple-secrets-with-a-non-default-permission-mode-set

Could you edit that deployment and add mode: 666 to the volume called sentinel-config?

About the error seen in the logs: it's because the operator is trying to connect to the sentinels to check their status and fix it if needed.

@jchanam
Copy link
Collaborator

jchanam commented Mar 28, 2018

Hi again,

As this is a ConfigMap, it seems, based on the K8s API Reference, that the field is defaultMode.

It is weird because, as stated in the API Reference, the default is 644, and root should be able to write to it.

Please, keep us updated with this :)
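
For illustration, a sketch of where defaultMode would go in the Deployment's pod spec. The volume name (sentinel-config) and ConfigMap name (rfs-redisfailover) are taken from this thread; the rest of the spec is omitted. Note that 0666 is an octal literal; some clients expect the decimal equivalent (438):

```yaml
# Sketch only: setting defaultMode on the ConfigMap-backed volume.
volumes:
  - name: sentinel-config
    configMap:
      name: rfs-redisfailover
      defaultMode: 0666
```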

@adamresson

This likely has to do with the security issue and fix in 1.9.4 that requires ConfigMaps to be mounted read-only. See the overarching issue here: kubernetes/kubernetes#61563
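
Since 1.9.4 mounts ConfigMaps read-only regardless of mode, and Sentinel rewrites its own config file at runtime, a common workaround (the general shape of a fix, not necessarily what #36 does) is to copy the config into a writable emptyDir with an init container and point Sentinel at the copy. All names below are illustrative:

```yaml
# Illustrative workaround: the read-only ConfigMap mount only seeds
# the initial config; Sentinel works on a writable copy in an emptyDir.
initContainers:
  - name: sentinel-config-copy
    image: redis:3.2.11-alpine
    command: ["sh", "-c", "cp /redis-readonly/sentinel.conf /redis/sentinel.conf"]
    volumeMounts:
      - name: sentinel-config
        mountPath: /redis-readonly
      - name: sentinel-config-writable
        mountPath: /redis
containers:
  - name: sentinel
    image: redis:3.2.11-alpine
    command: ["redis-server", "/redis/sentinel.conf", "--sentinel"]
    volumeMounts:
      - name: sentinel-config-writable
        mountPath: /redis
volumes:
  - name: sentinel-config
    configMap:
      name: rfs-redisfailover
  - name: sentinel-config-writable
    emptyDir: {}
```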

@jchanam
Collaborator

jchanam commented Mar 28, 2018

@Paic please use version 0.2.1 and confirm that this problem is solved.

Thanks for helping us improve this!

@Paic
Contributor Author

Paic commented Mar 29, 2018

Can confirm it's working on a fresh GKE 1.9.4 cluster.

Thanks guys.
