Since v0.8.0
Solr Clouds are complex distributed systems, and thus any operations that deal with data availability should be handled with care.
Since cluster operations deal with Solr's index data (either its availability, or moving it), it's safest to allow only one operation to take place at a time. That is why these operations must first obtain a lock on the SolrCloud before execution can start. The following cluster operations require a lock:
- Managed Rolling Updates
- Scaling Down with Replica Migrations
- Scaling Up with Replica Migrations
- Balancing Replicas Across Pods
  - This is started after a Rolling Update with Ephemeral Data or after a ScaleUp operation.
The lock is implemented as an annotation on the SolrCloud's StatefulSet. The cluster operation retry queue is also implemented as an annotation. These locks can be viewed at the following annotation keys:

- `solr.apache.org/clusterOpsLock` - The cluster operation that currently holds a lock on the SolrCloud and is executing.
- `solr.apache.org/clusterOpsRetryQueue` - The queue of cluster operations that timed out and will be retried in order after the `clusterOpsLock` is given up.
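For example, the current lock and retry queue can be inspected with `kubectl` (using the same `${statefulSetName}` placeholder as the commands later on this page); empty output means the annotation is not set:

```bash
# Show the cluster operation that currently holds the lock, if any.
$ kubectl get statefulset ${statefulSetName} \
    -o jsonpath='{.metadata.annotations.solr\.apache\.org/clusterOpsLock}'

# Show the queue of cluster operations waiting to be retried, if any.
$ kubectl get statefulset ${statefulSetName} \
    -o jsonpath='{.metadata.annotations.solr\.apache\.org/clusterOpsRetryQueue}'
```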
If all cluster operations executed without any issues, there would be no need to worry about deadlocks. Cluster operations give up the lock when the operation is complete, and then other operations that have been waiting can proceed. Unfortunately, these cluster operations can and will fail for a number of reasons:
- Replicas have no other pod on which they can be placed when moving off of a node (due to the Replica Placement Plugin used).
- There are insufficient resources to create new Solr Pods.
- The Solr Pod Template has an error and new Solr Pods cannot be started successfully.
If this is the case, then we need to be able to stop the locked cluster operation if it hasn't succeeded within a certain time period. The cluster operation can only be stopped if there is no background task (async request) being executed in the Solr Cluster. Once the cluster operation reaches a point at which it can stop, and the locking timeout has been exceeded or an error was found, the cluster operation is paused and added to a queue to be retried later. The timeout is different per operation:
- Scaling (Up or Down): 1 minute
- Rolling restarts: 10 minutes
Immediately afterwards, the Solr Operator checks whether there are any other operations that need to take place before the queued cluster operation is restarted. This allows users to make changes to fix the reason why the cluster operation was failing. Examples:
- If there are insufficient resources to create new Solr Pods:
  The user can decrease the resource requirements in the Pod Template.
  This will create a `Rolling Update` cluster operation that will run once the `Scale Up` is paused.
  The `Scale Up` will be dequeued when the `Rolling Update` is complete, and can now complete because there are more available resources in the Kubernetes Cluster.
- Scale Down is failing because a replica from the scaled-down pod has nowhere to be moved to:
  The user can see this error in the logs, and know that the scale down won't work for their use case.
  Instead they will have to scale the SolrCloud to the number of pods that the `StatefulSet` is currently running (see the sketch after this list).
  Once the `Scale Down` is paused, it will be replaced by a `Scale Up` operation to the current number of running pods.
  This doesn't actually increase the number of pods, but it will issue a command to Solr to balance replicas across all pods, to make sure the cluster is well-balanced after the failed `Scale Down`.
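As a rough sketch of that second example (assuming a SolrCloud named `${solrCloudName}`; the exact resource names will differ per cluster), resetting `spec.replicas` to the StatefulSet's current size might look like:

```bash
# Read the number of pods the StatefulSet is currently running.
$ current=$(kubectl get statefulset ${statefulSetName} -o jsonpath='{.spec.replicas}')

# Set the SolrCloud back to that size, so the failing Scale Down is abandoned
# and replaced by the balance-only "Scale Up" described above.
$ kubectl patch solrcloud ${solrCloudName} --type merge \
    -p "{\"spec\":{\"replicas\":${current}}}"
```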
If a queued operation is going to be retried, the Solr Operator first makes sure that its values are still valid. For the `Scale Down` example above, when the Solr Operator tries to restart the queued `Scale Down` operation, it sees that `SolrCloud.Spec.Replicas` is no longer lower than the current number of Solr Pods. Therefore, the `Scale Down` does not need to be retried, and a "fake" `Scale Up` needs to take place instead.
When all else fails, and you need to stop a cluster operation, you can remove the lock annotation from the `StatefulSet` manually. Edit the StatefulSet (e.g. `kubectl edit statefulset <name>`) and remove the cluster operation lock annotation: `solr.apache.org/clusterOpsLock`.

This can be done via the following command:

```bash
$ kubectl annotate statefulset ${statefulSetName} solr.apache.org/clusterOpsLock-
```
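If queued operations should also be discarded rather than retried, the same annotation-removal pattern should work for the retry queue as well (this is an assumption based on the annotation keys described above, not a documented command):

```bash
# Remove the retry queue so timed-out operations are not retried.
$ kubectl annotate statefulset ${statefulSetName} solr.apache.org/clusterOpsRetryQueue-
```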
This will only remove the currently running cluster operation; if other cluster operations have been queued, they will be retried once the lock annotation is removed.
Also, if the operation is still needed to put the SolrCloud into its expected state, it will be retried once a lock can be acquired.
The only way to prevent the cluster operation from running again is to put the SolrCloud back to its previous state (for scaling, set `SolrCloud.Spec.replicas` to the value found in `StatefulSet.Spec.replicas`, as in the sketch above).
If the SolrCloud requires a rolling restart, it cannot be "put back to its previous state". The only way to move forward is to either delete the `StatefulSet` (a very dangerous operation) or find a way to allow the `RollingUpdate` operation to succeed.
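For instance, if the rolling update is blocked because pods cannot start with a bad image tag, one possible fix (assuming that is the root cause; `solrImage.tag` is a field in the SolrCloud spec, and the tag below is only an example) would be:

```bash
# Point the SolrCloud at a working Solr image so the RollingUpdate can finish.
$ kubectl patch solrcloud ${solrCloudName} --type merge \
    -p '{"spec":{"solrImage":{"tag":"8.11.2"}}}'
```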