Shards in a down state after an HPA scale up / scale down event. #682
Labels: autoscaling, bug, cloud, networking
I installed the Solr Operator 0.8.0 with the Solr 9.4.1 image on AKS.
As a guideline I used this video: Rethinking Autoscaling for Apache Solr using Kubernetes - Berlin Buzzwords 2023.
The setup uses persistent disks.
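For reference, the SolrCloud resource is roughly equivalent to the sketch below; the name, storage size and scaling options shown are illustrative, not the exact manifest I used.

```yaml
# Rough sketch only; names, sizes and the scaling options are illustrative.
apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: example
spec:
  replicas: 5
  solrImage:
    repository: solr
    tag: "9.4.1"
  dataStorage:
    persistent:
      reclaimPolicy: Delete
      pvcTemplate:
        spec:
          resources:
            requests:
              storage: 20Gi
  scaling:
    # Assumption: the 0.8.0 default, so replicas are moved off a pod before it is removed.
    vacatePodsOnScaleDown: true
```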
I created 2 indexes and put some data in them:
index test: 3 shards and 2 replicas
index test2: 6 shards and 2 replicas
I configured an HPA (sketched below) and stressed the cluster a bit to make sure it would scale up from 5 to 11 nodes.
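The HPA looks roughly like this sketch; the metric and target value are illustrative, and it assumes the SolrCloud scale subresource introduced in solr-operator 0.8.0.

```yaml
# Rough sketch only; the metric and target values are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-solr-hpa
spec:
  scaleTargetRef:
    # Assumes the SolrCloud scale subresource added in solr-operator 0.8.0.
    apiVersion: solr.apache.org/v1beta1
    kind: SolrCloud
    name: example
  minReplicas: 5
  maxReplicas: 11
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```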
Scaling up went fine. Shards for the 2 indexes got moved to the new nodes.
During scaling down, however, some shards end up with a lot of "down" replicas.
The HPA reported it would scale down to 5 pods, but 6 kept running.
The logs, of course, reveal the problem: in the Overseer there are still items in the work queue.
On disk I can see the folders for the given shards, but they all seemed empty.
So I suspect something is going wrong with the scale-down/scale-up/migration of the shards.
Every pod gets restarted during the scale-down.
What could cause the number of down replicas to be so large?
PS: I did the same test on a kind cluster, with the same results.