
Worker hung or process was killed (Allowed memory size exhausted) #2087

Closed
tiltroom opened this issue Apr 5, 2024 · 3 comments · Fixed by #2092


tiltroom commented Apr 5, 2024

Shlink version

shlinkio/shlink:4.0-roadrunner

PHP version

shlinkio/shlink:4.0-roadrunner

How do you serve Shlink

Docker image

Database engine

MariaDB

Database version

mariadb:10.8.3-jammy

Current behavior

When my Shlink instance is under load, it will often crash and requests will result in "Error 500, internal server error".
Looking at the logs, I see something along these lines:

2024-04-05T09:10:49+0000 WARN server RoadRunner can't communicate with the worker
{"reason": "worker hung or process was killed",
"pid": 112,
"internal_event_name": "EventWorkerError",
"error": "sync_worker_receive_frame: Network:\n\tgoridge_frame_receive: validation failed on the message sent to STDOUT,
see: https://roadrunner.dev/docs/known-issues-stdout-crc/current/en,
invalid message: \nFatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 20480 bytes) in /etc/shlink/vendor/symfony/cache/Marshaller/DefaultMarshaller.php on line 74\n"}

I get multiple of these with different PIDs, so I assume the webserver is routing requests to dead processes.
This is running in Docker on a VM with 16 cores and 32 GB of RAM. At any given point there are 20+ GB of free RAM and plenty of CPU.

Expected behavior

I think this is not supposed to happen.

Minimum steps to reproduce

This is my Docker Compose config:

shlink:
  image: shlinkio/shlink:4.0-roadrunner
  restart: always
  environment:
    - DEFAULT_DOMAIN=****
    - IS_HTTPS_ENABLED='true'
    - DB_DRIVER=maria
    - DB_USER=shlink
    - DB_PASSWORD=*****
    - DB_HOST=docker_mariadb_1
    - TIMEZONE=Europe/Rome
    - ENABLE_PERIODIC_VISIT_LOCATE='true'
    - REDIS_SERVERS=tcp://docker_redis_1:6379


acelaya commented Apr 6, 2024

Looks like the 512 MB of memory Shlink reserves is not enough.

I'm almost sure this is memory reserved per worker, not shared between workers, but I need to verify this.

I should probably look for ways to make this value configurable, as it is currently hardcoded. I remember increasing it the last time an error like this was reported, some years ago.

If anything, you could try defining a smaller number of workers via the WEB_WORKER_NUM env var. By default, Shlink creates one per available core.
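For illustration, a minimal sketch of how that suggestion could be applied to the compose file above (the value 4 is an arbitrary example, not a recommendation):

shlink:
  image: shlinkio/shlink:4.0-roadrunner
  environment:
    # Illustrative value: cap RoadRunner at 4 web workers instead of one per core
    - WEB_WORKER_NUM=4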

acelaya added this to the 4.1.0 milestone Apr 6, 2024
acelaya moved this to Todo in Shlink Apr 6, 2024
acelaya moved this from Todo to In Progress in Shlink Apr 7, 2024

acelaya commented Apr 8, 2024

I'm almost sure this is memory reserved per worker, not shared between workers, but I need to verify this.

I can confirm this is correct. In fact, it would be better to set a higher number of workers, not a lower one: requests will then be spread more evenly, and each worker will be able to consume up to 512 MB of RAM.

The next Shlink release will allow this value to be customized via env vars.
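As a rough worked example of that per-worker budget: on the reporter's 16-core VM the default is 16 workers, for a worst case of 16 × 512 MB = 8 GB, and raising the count to 32 would allow up to 32 × 512 MB = 16 GB, still well within the 32 GB available. A sketch with an illustrative (not recommended) worker count:

shlink:
  image: shlinkio/shlink:4.0-roadrunner
  environment:
    # Illustrative: more workers spread requests more evenly;
    # worst-case memory is WEB_WORKER_NUM × 512 MB (32 × 512 MB = 16 GB here)
    - WEB_WORKER_NUM=32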

acelaya changed the title from "worker hung or process was killed" to "Worker hung or process was killed (Allowed memory size exhausted)" Apr 9, 2024
github-project-automation bot moved this from In Progress to Done in Shlink Apr 9, 2024

acelaya commented Apr 14, 2024

Shlink 4.1.0 has just been released, which allows the memory limit to be customized via the MEMORY_LIMIT env var.

More info in the docs.
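A minimal sketch of the new option, assuming MEMORY_LIMIT accepts PHP ini-style shorthand values (the 4.1 image tag and the 1G value are illustrative, not confirmed by this thread):

shlink:
  image: shlinkio/shlink:4.1-roadrunner
  environment:
    # Assumed PHP ini-style syntax: raise the per-worker memory limit to 1 GB
    - MEMORY_LIMIT=1G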
