[Bug]: mo killed by sys oom when run stability test for about 1 hour #7432

Closed
1 task done
aressu1985 opened this issue Jan 4, 2023 · 7 comments

Comments

@aressu1985
Contributor

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93): 8ec2c5cabfc687cb8481a435934b000118fdfb6f
- Hardware parameters:
- OS type: CentOS Linux release 8.3.2011
- Others: CPU 16 cores, MEM 61G

Actual Behavior

After the stability test had been running for about 1 hour, mo was killed by the system OOM killer:

dmesg error log:
[Wed Jan 4 00:30:23 2023] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-3906.scope,task=mo-service,pid=1723184,uid=1000
[Wed Jan 4 00:30:23 2023] Out of memory: Killed process 1723184 (mo-service) total-vm:67329988kB, anon-rss:59584400kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:123120kB oom_score_adj:500
[Wed Jan 4 00:30:26 2023] systemd[1]: Started /run/user/0 mount wrapper.
[Wed Jan 4 00:30:26 2023] systemd[1]: Created slice User Slice of UID 0.
[Wed Jan 4 00:30:26 2023] systemd[1]: Starting User Manager for UID 0...
[Wed Jan 4 00:30:26 2023] systemd[1]: Started Session 4248 of user root.
[Wed Jan 4 00:30:26 2023] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
[Wed Jan 4 00:30:26 2023] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
[Wed Jan 4 00:30:26 2023] systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart.
[Wed Jan 4 00:30:26 2023] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 3.
[Wed Jan 4 00:30:26 2023] systemd[1]: systemd-journal-flush.service: Succeeded.
[Wed Jan 4 00:30:26 2023] systemd[1]: Stopped Flush Journal to Persistent Storage.
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: Process 744338 (systemd-journal) of user 0 dumped core.
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.12246ce9da934b58ac8c06ca1c0e1625.744338.1672763425000000.lz4
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: Stack trace of thread 744338:
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #0 0x00007faf4053b276 journal_file_move_to_object (libsystemd-shared-239.so)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #1 0x00007faf4053d230 link_entry_into_array (libsystemd-shared-239.so)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #2 0x00007faf4053d8bd journal_file_append_entry_internal (libsystemd-shared-239.so)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #3 0x00007faf4053e99f journal_file_append_entry (libsystemd-shared-239.so)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #4 0x000056512f7aecfc dispatch_message_real (systemd-journald)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #5 0x000056512f7b745f server_process_syslog_message (systemd-journald)
[Wed Jan 4 00:30:26 2023] systemd-coredump[1780507]: #6 0x000056512f7ae2ad server_process_datagram (systemd-journald)
[Wed Jan 4 00:30:26 2023] printk: systemd-coredum: 6 output lines suppressed due to ratelimiting
[Wed Jan 4 00:30:26 2023] oom_reaper: reaped process 1723184 (mo-service), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
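
For scale, the `anon-rss:59584400kB` reported in the kill line works out to roughly 56.8 GiB, which is most of the 61G listed under Environment. A minimal Go sketch that pulls the process name, pid and anonymous RSS out of such a line is shown below. The type `oomKill`, the helpers `parseOOMKill` and `kbToGiB`, and the regular expression are illustrative assumptions, not part of MatrixOne or of any kernel tooling.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// oomKill holds the fields extracted from a kernel
// "Out of memory: Killed process ..." line. Hypothetical helper type.
type oomKill struct {
	PID       int
	Name      string
	AnonRSSKB int64
}

// oomRe matches the pid, the process name and the anon-rss value (in kB).
var oomRe = regexp.MustCompile(`Killed process (\d+) \(([^)]+)\).*anon-rss:(\d+)kB`)

// parseOOMKill returns the parsed fields and true when the line is an
// OOM kill record, or a zero value and false otherwise.
func parseOOMKill(line string) (oomKill, bool) {
	m := oomRe.FindStringSubmatch(line)
	if m == nil {
		return oomKill{}, false
	}
	pid, err := strconv.Atoi(m[1])
	if err != nil {
		return oomKill{}, false
	}
	rss, err := strconv.ParseInt(m[3], 10, 64)
	if err != nil {
		return oomKill{}, false
	}
	return oomKill{PID: pid, Name: m[2], AnonRSSKB: rss}, true
}

// kbToGiB converts a kB figure (1024-byte units) to GiB.
func kbToGiB(kb int64) float64 {
	return float64(kb) / (1024 * 1024)
}

func main() {
	line := "[Wed Jan 4 00:30:23 2023] Out of memory: Killed process 1723184 (mo-service) " +
		"total-vm:67329988kB, anon-rss:59584400kB, file-rss:0kB, shmem-rss:0kB, " +
		"UID:1000 pgtables:123120kB oom_score_adj:500"
	if k, ok := parseOOMKill(line); ok {
		fmt.Printf("%s (pid %d) anon-rss: %.1f GiB\n", k.Name, k.PID, kbToGiB(k.AnonRSSKB))
	}
}
```

The `oom_reaper` line that follows the kill record is deliberately not matched, since it reports the memory left after reaping rather than the usage at kill time.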

Expected Behavior

No response

Steps to Reproduce

Run the stability test:
1. bvt loop run:
  nohup ./run.sh -n -g -p  $GITHUB_WORKSPACE/head/test/distributed/cases -e ddl -t 60 > bvt.log &

2. TPCH 1G loop run:
  nohup ./run.sh -q all -s 1 -t 100 > tpch.log &

3. TPCC 10w-50terminals 10-hour run:
  nohup ./runBenchmark.sh props.mo > tpcc.log &

4. sysbench mixed cases 10-hour run:
  nohup ./start.sh -c cases/sysbench/mixed_10_100000/ > sysbench.log &

Additional information

No response

@LeftHandCold
Contributor

After bvt, tpch and sysbench have been running for a long time, goetty buf Alloc takes up a lot of memory.

@LeftHandCold
Contributor

WX20230105-095942@2x

@LeftHandCold
Contributor

image

@LeftHandCold
Contributor

A large number of clients have accumulated in the RoutineManager and have not been released
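
The thread does not show the RoutineManager code or what #7480 changed, so the following is only a generic Go sketch of this kind of leak: a registry that adds a per-connection entry on connect and grows without bound if the close path never deletes it. All names here (`routineManager`, `session`, `Created`, `Closed`, `Len`) are hypothetical and are not MatrixOne's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// session stands in for per-connection state such as I/O buffers.
// Hypothetical type for illustration only.
type session struct {
	buf []byte
}

// routineManager is a hypothetical registry keyed by connection ID.
// It is NOT MatrixOne's RoutineManager.
type routineManager struct {
	mu      sync.Mutex
	clients map[uint64]*session
}

func newRoutineManager() *routineManager {
	return &routineManager{clients: make(map[uint64]*session)}
}

// Created registers a session when a connection is accepted.
func (rm *routineManager) Created(id uint64) {
	rm.mu.Lock()
	defer rm.mu.Unlock()
	rm.clients[id] = &session{buf: make([]byte, 4096)}
}

// Closed removes the session. If this is skipped on some disconnect
// path, the map keeps the session (and its buffer) reachable, so the
// garbage collector can never free it.
func (rm *routineManager) Closed(id uint64) {
	rm.mu.Lock()
	defer rm.mu.Unlock()
	delete(rm.clients, id)
}

// Len reports how many sessions are still registered.
func (rm *routineManager) Len() int {
	rm.mu.Lock()
	defer rm.mu.Unlock()
	return len(rm.clients)
}

func main() {
	rm := newRoutineManager()
	for id := uint64(0); id < 1000; id++ {
		rm.Created(id)
		if id%2 == 0 {
			rm.Closed(id) // odd IDs simulate a disconnect path that forgets to clean up
		}
	}
	fmt.Println("sessions still registered:", rm.Len())
}
```

Under a workload that opens and closes many short-lived connections, as the loop tests in this issue do, exposing a count like `Len()` as a metric makes this kind of accumulation visible long before the OOM killer steps in.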

@w-zr
Contributor

w-zr commented Jan 6, 2023

Should be fixed in #7480.

@aressu1985
Copy link
Contributor Author

fixed
