Waiting for kafka? #64

Closed
bclouser opened this issue Apr 23, 2018 · 15 comments

Comments

bclouser commented Apr 23, 2018

On master, after installing the pre-reqs on Ubuntu 16.04, running 'make start' gives me:

192.168.99.100
Using cluster from kubectl context: minikube

serviceaccount "weave-net" unchanged
clusterrole.rbac.authorization.k8s.io "weave-net" configured
clusterrolebinding.rbac.authorization.k8s.io "weave-net" configured
role.rbac.authorization.k8s.io "weave-net" unchanged
rolebinding.rbac.authorization.k8s.io "weave-net" unchanged
daemonset.extensions "weave-net" configured
Using cluster from kubectl context: minikube

daemonset.apps "pull-images" unchanged
pull-images-82rsw
Using cluster from kubectl context: minikube

namespace "ingress-nginx" configured
deployment.extensions "default-http-backend" unchanged
service "default-http-backend" unchanged
configmap "nginx-configuration" unchanged
configmap "tcp-services" unchanged
configmap "udp-services" unchanged
serviceaccount "nginx-ingress-serviceaccount" unchanged
clusterrole.rbac.authorization.k8s.io "nginx-ingress-clusterrole" configured
role.rbac.authorization.k8s.io "nginx-ingress-role" unchanged
rolebinding.rbac.authorization.k8s.io "nginx-ingress-role-nisa-binding" unchanged
clusterrolebinding.rbac.authorization.k8s.io "nginx-ingress-clusterrole-nisa-binding" configured
deployment.extensions "nginx-ingress-controller" unchanged
service "ingress-nginx" unchanged
Using cluster from kubectl context: minikube

configmap "kafka-config" unchanged
configmap "kafka-shared" unchanged
statefulset.apps "kafka" unchanged
persistentvolumeclaim "kafka-pvc" unchanged
persistentvolume "kafka-pv" configured
service "kafka" unchanged
configmap "mysql-config" configured
statefulset.apps "mysql" unchanged
persistentvolumeclaim "mysql-pvc" unchanged
persistentvolume "mysql-pv" configured
secret "mysql-secret" unchanged
service "mysql" unchanged
configmap "zookeeper-config" unchanged
statefulset.apps "zookeeper" unchanged
persistentvolumeclaim "zookeeper-pvc" unchanged
persistentvolume "zookeeper-pv" configured
service "zookeeper" unchanged
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
Waiting for kafka
...

So I guess the Kafka docker image never starts? Is this common? Has anyone seen this before?

Thanks,
Ben


taheris commented Apr 23, 2018

Hi Ben. The Kafka template now waits for Zookeeper before starting to avoid restarts. Can you check your zookeeper_host in the infra.yaml config? Thanks


bclouser commented Apr 23, 2018

Hi Taheris,

Output of cat config/infra.yaml

---
env_prefix: ce
ingress_acl_whitelist: 0.0.0.0/0
ingress_dns_name: ota.local
ingress_proxy_size: 300m
kafka_host: kafka
kafka_topic_suffix: ce
mysql_host: mysql
persistent_volumes: true
storage_class_name: ""
zookeeper_host: zookeeper

Should 'zookeeper_host' be set to something other than zookeeper?

Also, I haven't modified anything; I'm just attempting to run this as a "black box" example.

Thanks,
Ben


taheris commented Apr 24, 2018

The default DNS name should be fine, but for some reason the kafka init container can't resolve that zookeeper hostname. Can you try removing the initContainers section entirely from the kafka.tmpl.yaml file and re-running make start?

marceleng commented:

Hi all. I have the same issue with the unmodified latest tree. Removing the initContainers section from kafka.tmpl.yaml did not solve the problem.


taheris commented Jun 5, 2018

Hi @marceleng. Can you post the output from kubectl logs kafka-0 and kubectl describe pod kafka-0? Thanks.

marceleng commented:

@taheris kubectl logs kafka-0 finishes without any output. For the other one:

# kubectl describe pod kafka-0
Name:           kafka-0
Namespace:      default
Node:           <none>
Labels:         app=kafka
                controller-revision-hash=kafka-7d777cbd56
                statefulset.kubernetes.io/pod-name=kafka-0
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  StatefulSet/kafka
Init Containers:
  kafka-init:
    Image:      busybox:1.28.0
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      until nc -z -w2 zookeeper 2181; do sleep 2s; done
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-7jbp9 (ro)
Containers:
  kafka:
    Image:      confluentinc/cp-kafka:4.0.0
    Port:       9092/TCP
    Host Port:  0/TCP
    Command:
      sh
      -c
      unset KAFKA_PORT && \
export KAFKA_ADVERTISED_HOST_NAME=${POD_IP} && \
export KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://${POD_IP}:9092 && \
export KAFKA_BROKER_ID=${HOSTNAME##*-} && \
export KAFKA_LOG_DIRS=/opt/kafka/data && \
/etc/confluent/docker/run

    Requests:
      cpu:      80m
      memory:   400Mi
    Liveness:   tcp-socket :9092 delay=30s timeout=5s period=10s #success=1 #failure=3
    Readiness:  exec [kafka-topics --zookeeper zookeeper:2181 --list] delay=30s timeout=5s period=10s #success=1 #failure=3
    Environment Variables from:
      kafka-config  ConfigMap  Optional: false
    Environment:
      POD_IP:   (v1:status.podIP)
    Mounts:
      /opt/kafka/data from kafka-claim (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-7jbp9 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kafka-claim:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kafka-claim-kafka-0
    ReadOnly:   false
  default-token-7jbp9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-7jbp9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  59s (x8 over 2m)  default-scheduler  pod has unbound PersistentVolumeClaims


taheris commented Jun 5, 2018

Kafka isn't starting because the Persistent Volume Claim is still unbound. Can you post the output of kubectl get pvc and kubectl get sc as well? Thanks.

marceleng commented:

I'm not very familiar with Kubernetes. I assume this means that some storage space was not allocated for the services?

Here are the outputs you asked for. My setup is running Ubuntu 16.04 with minikube v0.27.0.

# kubectl get pvc
NAME                          STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
kafka-claim-kafka-0           Pending                                       standard       12m
mysql-claim-mysql-0           Pending                                       standard       11m
zookeeper-claim-zookeeper-0   Pending                                       standard       11m
# kubectl get sc
NAME                 PROVISIONER                AGE
standard (default)   k8s.io/minikube-hostpath   10m


taheris commented Jun 5, 2018

Thanks. The claims are created but stuck in pending. What does the output of kubectl describe pvc kafka-claim-kafka-0 look like?

marceleng commented:

# kubectl describe pvc kafka-claim-kafka-0
Name:          kafka-claim-kafka-0
Namespace:     default
StorageClass:  standard
Status:        Pending
Volume:        
Labels:        app=kafka
Annotations:   volume.beta.kubernetes.io/storage-class=standard
               volume.beta.kubernetes.io/storage-provisioner=k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
Events:
  Type     Reason                Age                From                         Message
  ----     ------                ----               ----                         -------
  Warning  ProvisioningFailed    20m (x7 over 21m)  persistentvolume-controller  storageclass.storage.k8s.io "standard" not found
  Normal   ExternalProvisioning  14m (x8 over 16m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator
  Normal   ExternalProvisioning  3m (x41 over 13m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator

Thanks a lot for the help btw @taheris!


taheris commented Jun 5, 2018

No problem. It looks like minikube is stuck waiting for the provisioner. Looking at their GitHub issues, it seems a few other people are having this issue as well: kubernetes/minikube#1783

Can you take a look at some of the suggested solutions in that thread and see if that helps?


marceleng commented Jun 6, 2018

I've looked a bit and I cannot reproduce the issue you referenced on my machine; it seems to have been fixed in recent minikube releases. However, the log from kubectl describe pvc kafka-claim-kafka-0 suggested that something was missing from the infrastructure definition: storageclass.storage.k8s.io "standard" not found

I created such a class and loaded it into minikube manually, which solved the waiting-for-Kafka problem:

# cat storage.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-standard
# minikube start --vm-driver kvm2 --cpus 4 --memory 16000
[Output redacted]
# kubectl apply -f storage.yml
storageclass.storage.k8s.io "standard" created
# make start

That definition should probably be integrated into the bootstrap process. However, the process still hangs and then crashes on Waiting for root.json; I'm looking into it to see whether I can pinpoint the problem.


taheris commented Jun 6, 2018

@marceleng: it seems there was a bug with older versions of httpie when redirecting to /dev/null while outputting to a file, which PR #75 should fix. Can you try again?

marceleng commented:

This fixed it for me, thanks :). Also, regarding the previous problem (hanging on Waiting for kafka), I think it is mostly related to the time the cluster takes to create the persistent volumes and bind them to the right pods. It might be due to my setup; iotop shows that disk writes might be the bottleneck. I'll try running everything on an SSD.


taheris commented Jun 7, 2018

Good to hear. And yes, running all these services on one box can be disk-IO heavy. I'll close this issue for now, but you can open another if you run into further problems.

taheris closed this as completed Jun 7, 2018