piraeus-operator v1.10.6 piraeus-cs-controller crashLoop error #513

Open
Icedroid opened this issue Aug 2, 2023 · 5 comments
Labels
v1 This affects only Operator v1

Comments

@Icedroid

Icedroid commented Aug 2, 2023

I installed piraeus-operator v1.10.6 with Helm, and the piraeus-cs-controller pod crash-loops with the following error:

Operating system:   Linux, Version 4.19.90-24.4.v2101.ky10.x86_64
Environment:        amd64, 125 processors, 30688 MiB memory reserved for allocations

System components initialization in progress

Loading configuration file "/etc/linstor/linstor.toml"2023-08-02T14:13:02.081967097+08:00 
06:13:02.869 [main] INFO  LINSTOR/Controller - SYSTEM - ErrorReporter DB first time init.
06:13:02.872 [main] INFO  LINSTOR/Controller - SYSTEM - Log directory set to: '/var/log/linstor-controller'
06:13:02.919 [main] INFO  LINSTOR/Controller - SYSTEM - Database type is Kubernetes-CRD
06:13:02.919 [Main] INFO  LINSTOR/Controller - SYSTEM - Loading API classes started.
06:13:03.446 [Main] INFO  LINSTOR/Controller - SYSTEM - API classes loading finished: 526ms
06:13:03.446 [Main] INFO  LINSTOR/Controller - SYSTEM - Dependency injection started.
06:13:03.464 [Main] INFO  LINSTOR/Controller - SYSTEM - Attempting dynamic load of extension module "com.linbit.linstor.modularcrypto.FipsCryptoModule"
06:13:03.464 [Main] INFO  LINSTOR/Controller - SYSTEM - Extension module "com.linbit.linstor.modularcrypto.FipsCryptoModule" is not installed
06:13:03.465 [Main] INFO  LINSTOR/Controller - SYSTEM - Attempting dynamic load of extension module "com.linbit.linstor.modularcrypto.JclCryptoModule"
06:13:03.476 [Main] INFO  LINSTOR/Controller - SYSTEM - Dynamic load of extension module "com.linbit.linstor.modularcrypto.JclCryptoModule" was successful
06:13:03.476 [Main] INFO  LINSTOR/Controller - SYSTEM - Attempting dynamic load of extension module "com.linbit.linstor.spacetracking.ControllerSpaceTrackingModule"
06:13:03.477 [Main] INFO  LINSTOR/Controller - SYSTEM - Dynamic load of extension module "com.linbit.linstor.spacetracking.ControllerSpaceTrackingModule" was successful
06:13:04.769 [Main] INFO  LINSTOR/Controller - SYSTEM - Dependency injection finished: 1323ms
06:13:04.770 [Main] INFO  LINSTOR/Controller - SYSTEM - Cryptography provider: Using default cryptography module
06:13:05.101 [Main] INFO  LINSTOR/Controller - SYSTEM - Initializing authentication subsystem
06:13:05.589 [Main] INFO  LINSTOR/Controller - SYSTEM - SpaceTracking using K8sCrd driver
06:13:05.593 [Main] INFO  LINSTOR/Controller - SYSTEM - SpaceTrackingService: Instance added as a system service
06:13:05.594 [Main] INFO  LINSTOR/Controller - SYSTEM - Starting service instance 'TimerEventService' of type TimerEventService
06:13:05.595 [Main] INFO  LINSTOR/Controller - SYSTEM - Initializing the k8s crd database connector
06:13:05.596 [Main] INFO  LINSTOR/Controller - SYSTEM - Kubernetes-CRD connection URL is "k8s"
06:13:07.283 [Main] INFO  LINSTOR/Controller - SYSTEM - Starting service instance 'K8sCrdDatabaseService' of type K8sCrdDatabaseService
06:13:07.293 [Main] INFO  LINSTOR/Controller - SYSTEM - Loading security objects
06:13:07.490 [Main] INFO  LINSTOR/Controller - SYSTEM - Current security level is NO_SECURITY
06:13:07.621 [Main] INFO  LINSTOR/Controller - SYSTEM - Core objects load from database is in progress
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.sun.xml.bind.v2.runtime.reflect.opt.Injector (file:/usr/share/linstor-server/lib/jaxb-impl-2.2.11.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int)
WARNING: Please consider reporting this to the maintainers of com.sun.xml.bind.v2.runtime.reflect.opt.Injector
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Aug 02, 2023 6:13:09 AM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [[::]:3370]
Aug 02, 2023 6:13:09 AM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer] Started.
[8.513s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[8.514s][warning][os,thread] Failed to start the native thread for java.lang.Thread "grizzly-http-server-239"
06:13:10.138 [TaskScheduleService] INFO  LINSTOR/Controller - SYSTEM - LogArchive: Running log archive on directory: /var/log/linstor-controller
06:13:10.156 [TaskScheduleService] INFO  LINSTOR/Controller - SYSTEM - LogArchive: No logs to archive.
06:13:10.191 [Main] ERROR LINSTOR/Controller - SYSTEM - unable to create native thread: possibly out of memory or process/resource limits reached [Report number 64C9F3EE-00000-000000]

[8.570s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[8.570s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Logging-Cleaner"
time="2023-08-02T06:13:10Z" level=fatal msg="failed to run" err="exit status 199"
@WanzenBug
Member

Your system seems to be overloaded, or you have set a restrictive resource limit:

[8.513s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.

@WanzenBug added the v1 (This affects only Operator v1) label Aug 2, 2023
@Icedroid
Author

Icedroid commented Aug 2, 2023

If the system is overloaded, how much memory does the controller need?
Operating system: Linux, Version 4.19.90-24.4.v2101.ky10.x86_64
Environment: amd64, 125 processors, 30688 MiB memory reserved for allocations

How can I tell whether a restrictive resource limit is set on the k8s node?

@WanzenBug
Member

The machine/hardware itself seems fine. Did you set any

resources:
  limits:
    ...

on the Pod, or were they set automatically? Could you check the Pod YAML?

What seems more likely: /proc/sys/kernel/threads-max or /proc/sys/kernel/pid_max is set too low on your system.
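
The two kernel values named above can be read in one pass. The following is only a sketch, not something taken from the thread: besides `threads-max` and `pid_max`, it also looks at the cgroup `pids.max` file, which is an assumption on my part (a per-container process/thread cap is not mentioned here, but the log does show a thread named `grizzly-http-server-239`, so a low cap could plausibly be reached). The function names `read_limit` and `collect_limits` are hypothetical, and the script assumes it runs inside the affected container (or on the node for the two `/proc/sys` values) with a Python interpreter available.

```python
from pathlib import Path

# Files to inspect. The first two come from the thread; the cgroup paths
# are an assumption (cgroup v2 and cgroup v1 layouts as seen from inside
# a container) and may not exist on every system.
CANDIDATES = [
    "/proc/sys/kernel/threads-max",
    "/proc/sys/kernel/pid_max",
    "/sys/fs/cgroup/pids.max",
    "/sys/fs/cgroup/pids/pids.max",
]


def read_limit(path):
    """Return the numeric limit stored in `path`.

    Returns None when the file is missing, unreadable, or not a number,
    and float("inf") when the file contains the literal "max", which is
    how cgroups spell "no limit".
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    if text == "max":
        return float("inf")
    try:
        return int(text)
    except ValueError:
        return None


def collect_limits(paths=CANDIDATES):
    """Map every candidate path to its limit (or None if not readable)."""
    return {path: read_limit(path) for path in paths}


if __name__ == "__main__":
    for path, value in collect_limits().items():
        shown = "not readable" if value is None else value
        print(f"{path}: {shown}")
```

A small `pids.max` value next to large `threads-max` and `pid_max` values would point at a container-level cap rather than a node-wide one, but that is a hypothesis to verify, not a diagnosis.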

@Icedroid
Author

Icedroid commented Aug 2, 2023

I am sure the piraeus-cs-controller pod has no resources.limits configured.
On the k8s node running the pod:
$ cat /proc/sys/kernel/threads-max
4113123
$ cat /proc/sys/kernel/pid_max
4194304

@WanzenBug
Member

You can check /proc/stat to get the number of processes running, but I guess this would not be the issue here. Perhaps a ulimit is set too low. Can you run ulimit -S -a and ulimit -H -a when the container starts up?
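
If a shell is awkward to get inside the container, roughly the same information can be gathered with Python's standard `resource` module. This is a minimal sketch under the assumption that a Python interpreter is available in the same container as the controller (the thread does not say so); `parse_proc_stat`, `show`, and `process_limits` are hypothetical helper names. It reports soft and hard limits the way `ulimit -S` and `ulimit -H` do, but only for three limits rather than everything `ulimit -a` lists.

```python
import resource


def parse_proc_stat(text):
    """Extract the process counters from the contents of /proc/stat.

    "processes" is the number of forks since boot; "procs_running" and
    "procs_blocked" describe the current state. None of them is a total
    thread count, so treat them as a rough indicator only.
    """
    wanted = ("processes", "procs_running", "procs_blocked")
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in wanted:
            counters[fields[0]] = int(fields[1])
    return counters


def show(value):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def process_limits():
    """Return (soft, hard) pairs for limits relevant to thread creation."""
    names = {
        "max user processes": resource.RLIMIT_NPROC,
        "open files": resource.RLIMIT_NOFILE,
        "stack size (bytes)": resource.RLIMIT_STACK,
    }
    return {label: resource.getrlimit(res) for label, res in names.items()}


if __name__ == "__main__":
    try:
        with open("/proc/stat") as handle:
            print(parse_proc_stat(handle.read()))
    except OSError as err:
        print(f"/proc/stat not readable: {err}")
    for label, (soft, hard) in process_limits().items():
        print(f"{label}: soft={show(soft)} hard={show(hard)}")
```

The limits printed are those of the Python process itself, so the script has to be started the same way and as the same user as the controller for the numbers to be comparable.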
